3U DDS (URUKUL) discussion

gkasprow commented 7 years ago

I did an early estimate of the available board area. It seems that we can easily fit 4 channels on a 100x160 Eurocard. There are several questions which need to be answered before we continue with the schematics:

I assume we don't need the clock shaping circuit in the AD9912 and we will route the DAC output (pins 50, 51) directly to the circuit copied from Allaki.
Do we need to measure the output level as is done in the Allaki design? Should I remove the AD8363 or add some I2C/SPI ADC?
We need to control several IO lines per channel. Which ones can be common and which must be individual? I assume that DDS Reset, S4...S1, IO_update and CLKMODESEL can be common while PWRDOWN should be individual. For the output channel switch, we would need individual RFSW control. The attenuators can have common LE, CLK and RSTn.
How do we distribute SYSCLK? I assume we have an SMA input on the front panel and some other connector (e.g. MMCX) on the other side of the PCB to route via jumpers to the Kasli clock distribution. I assume we will use a 1:4 clock splitter and the same input circuit as on the Sayma RTM.
I will copy the loop filter from the devkit.
What about SPI control? I can connect 4x HMC542 to have 32-bit data and one CS. I can also connect them as 2x 16-bit, but an additional CS will be needed.
I can do the same with the SPI registers (SN74LV595) to have 32-bit data.
I will use an 8-output address decoder ('138) to connect to all CSn lines. We can route an additional CSn or treat state 000 as no SPI slave selected. SPI address decoder map:
000 - no slave selected
001 - DDS1
010 - DDS2
011 - DDS3
100 - DDS4
101 - attenuators CS
110 - serial register CS
111 - reserved
LVDS assignment:
LVDS1: SCKI
LVDS2: SDI
LVDS3: SDO
LVDS4: SEL CSN3
LVDS5: SEL CSN2
LVDS6: SEL CSN1
LVDS7: IO_UPDATE ??
LVDS8: RESET ??
For the supply I want to use similar dual-output converter modules as on the Sayma RTM, plus LDOs regulating down to +5V and 3.3/1.8V.
Do we plan to compensate for clock delays? We can use one of the clock outputs and route it back to the Kasli phase detector.
Below is the initial component placement on the PCB. Each channel will have an individual shield.

[image: initial component placement on the PCB]
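
The decoder map proposed above can be written down as a small lookup, which makes it easy to check that every chip select gets a unique code. This is only a sketch: the names `SpiTarget`, `select_lines` and `decode_138` are invented here, mapping the most significant bit to SEL CSN3 is an assumption, and the decoder model reflects only the generic active-low behaviour of a '138.

```python
from enum import IntEnum


class SpiTarget(IntEnum):
    """3-bit select codes from the proposed SPI address decoder map."""
    NONE = 0b000         # no slave selected
    DDS1 = 0b001
    DDS2 = 0b010
    DDS3 = 0b011
    DDS4 = 0b100
    ATTENUATORS = 0b101  # attenuators CS
    SERIAL_REG = 0b110   # serial register CS
    RESERVED = 0b111


def select_lines(target):
    """Return (CSN3, CSN2, CSN1) levels for a target.

    Assumption: CSN3 carries the most significant bit of the code.
    """
    code = int(target)
    return ((code >> 2) & 1, (code >> 1) & 1, code & 1)


def decode_138(code):
    """Model a 3-to-8 decoder with active-low outputs Y0..Y7.

    Exactly one output is low; output 0 would be left unconnected
    so that code 000 selects nothing.
    """
    if not 0 <= code <= 7:
        raise ValueError("code must be a 3-bit value")
    return [0 if i == code else 1 for i in range(8)]


if __name__ == "__main__":
    for target in SpiTarget:
        print(target.name, select_lines(target), decode_138(target))
```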

hartytp commented 7 years ago

@gkasprow Looks nice!

@jordens How do you want to manage the review process for Urukul+ Kasli? Do you want feedback from us at this stage? I can guess the answers to most of these questions, but I won't add my tuppence worth unless it's wanted!

hartytp commented 7 years ago

Can you remind me what the plan for clock distribution is? Is the idea that we'll distribute a ~100MHz clock to Kasli + Urukul (via SMAs/backplane/CDR from DRTIO) and then use local PLLs to generate the DAC clock? Will you use the PLL inside the DAC, or add an external IC (HMC830 or similar)?

gkasprow commented 7 years ago

At the moment the clock can be taken either from the front-panel SMA or from the Kasli clock distribution. I assume we use the PLL in the DDS chip.

dhslichter commented 7 years ago

While we are in the early stages: we might consider using AD9914 instead of AD9912, for two reasons:

profile mode
potential for higher clock frequency (thus higher output frequencies).

Particularly for DDS chips which are programmed over serial, profile mode is very handy because it can allow fast frequency/phase/amplitude hopping with deterministic timing at faster timescales than one can reprogram everything over SPI. We use this methodology for our microwave generation in the magtrap. The user does not need to use profile mode if they don't need/want it, but it can be handy e.g. for hopping around the hyperfine manifold.

Also, a 1 GHz clock on the DDS limits us to ~400 MHz (assuming we put reconstruction filters etc on the Urukul board). If you go with the AD9914, you can run at e.g. 2 GHz clock frequency, which lets you go up to ~800 MHz output frequencies, which is handy for some groups that drive 600 MHz AOMs, for example. We run ours at 2.4 GHz.

I would use an external PLL for the reference clock generation (e.g. HMC830, something already in Sayma so the gateware will be written) because I think they tend to be better/more flexible than the internal DDS PLLs.
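
The clock and output frequency figures quoted above (about 400 MHz from a 1 GHz clock, about 800 MHz from 2 GHz) correspond to using roughly 40% of the sample clock, a little below the Nyquist limit of half the clock, which leaves room for the reconstruction filter. A rough sketch of that arithmetic follows; the 0.4 factor is inferred from those two figures rather than taken from a datasheet, and the function names are made up for illustration.

```python
def nyquist_mhz(sysclk_mhz):
    """Nyquist limit: half the DDS sample clock."""
    return sysclk_mhz / 2.0


def usable_output_mhz(sysclk_mhz, usable_fraction=0.4):
    """Approximate highest first-zone output frequency.

    usable_fraction = 0.4 is an assumption inferred from the thread
    (1 GHz -> ~400 MHz, 2 GHz -> ~800 MHz); the real limit depends
    on the reconstruction filter that is fitted.
    """
    return sysclk_mhz * usable_fraction


if __name__ == "__main__":
    for clk in (1000.0, 2000.0, 2400.0):
        print(f"{clk:.0f} MHz clock: Nyquist {nyquist_mhz(clk):.0f} MHz, "
              f"usable ~{usable_output_mhz(clk):.0f} MHz")
```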

sbourdeauducq commented 7 years ago

Third reason: we already know the main silicon bugs of the 9914, but not those of the 9912.

sbourdeauducq commented 7 years ago

That being said, if someone from ADI can confirm that the 9912 is bug-free, I would prefer it.

gkasprow commented 7 years ago

Take into account that the AD9914 costs $188 while the AD9912 costs $61. This is the most expensive part on the board. The AD9914 also consumes 3W of power while the AD9912 consumes about 1.3W. 12W (4 channels) of power dissipation just for DDS chips is quite a lot. I have a devkit with the 9912, so I can test some functionality.

dhslichter commented 7 years ago

This cost difference is not zero, but given the additional functionality I'd argue it's definitely worth it. $120 per channel is a very small cost in the types of setups these boards would be used in. The power dissipation is definitely an issue to be considered; however, with forced air this can certainly be handled (and this is in fact what we do with our hardware).

The decision that needs to be made here is basically about how capable we want/need Urukul to be. Is the target market just to do simple CW signal generation for AOMs? Or are people going to use these to generate signals for pulsed setups (with some modulation or switch afterward)? What kind of flexibility would be desired?

From my standpoint, the AD9914 gives the option for substantially broader use cases, at a pretty low cost ($500/board extra, assuming a 4-channel board), and with no additional complexity on the gateware/programming side.

One of the main issues I see is that people may want a solution which can do pulsed operations, or fast frequency/phase switching, but may not be ready to commit to the full Sayma system in the crate (which is very large and expensive compared with the Kasli/EEM ecosystem). By changing the Urukul to the AD9914, one would be able to bridge this functionality gap more readily, and provide a more "mid-grade" performance solution.

gkasprow commented 7 years ago

For me the only difference at the moment is that I have already purchased a devkit with the 9912. To do quick tests with the 9914 I would need another devkit for $700. They have different interfaces, so I cannot connect my board with the 9914 to the 9912 devkit.

hartytp commented 7 years ago

@dhslichter Particularly for DDS chips which are programmed over serial, profile mode is very handy because it can allow fast frequency/phase/amplitude hopping with deterministic timing at faster timescales than one can reprogram everything over SPI. We use this methodology for our microwave generation in the magtrap. The user does not need to use profile mode if they don't need/want it, but it can be handy e.g. for hopping around the hyperfine manifold.

How do you want to connect the profile pins to Kasli? Bearing in mind that the IDC only has 8 LVDS pairs to control all 4 DDS channels, we'd need to use multiplexed IO. Depending on how we do that, it may not end up being that much faster than reprogramming the DDS over SPI.

12W (4 channels) of power dissipation just for DDS chips is quite a lot.

Currently, we're budgeting for 6W (0.5A of 12V) per EEM. If we go for the AD9914, we should probably increase that to more like 1.5A per EEM. That would mean increasing the current rating of the power connectors on Kasli/VHDCI carrier (currently 5A IIRC).
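
The supply numbers in this exchange can be checked with a few lines of arithmetic. The sketch below takes the 12 V rail and the 6 W per-EEM budget from the comment above, and reads the per-chip figures quoted earlier in the thread as about 3 W for the AD9914 and about 1.3 W for the AD9912. It ignores converter efficiency and every other load on the board, so it is a lower bound, not a design figure; the function names are hypothetical.

```python
SUPPLY_V = 12.0  # EEM supply rail, from the thread


def current_a(power_w, supply_v=SUPPLY_V):
    """Current drawn from the rail for a given power (ideal conversion)."""
    return power_w / supply_v


def dds_power_w(per_chip_w, channels=4):
    """Total DDS dissipation for a board with the given channel count."""
    return per_chip_w * channels


if __name__ == "__main__":
    budget_w = 6.0
    print(f"budget: {budget_w} W -> {current_a(budget_w):.2f} A")
    for name, chip_w in (("AD9914", 3.0), ("AD9912", 1.3)):
        total = dds_power_w(chip_w)
        print(f"{name}: {total:.1f} W for 4 chips -> "
              f"{current_a(total):.2f} A before any other load")
```

With these inputs the four AD9914s alone need about 1 A, which is consistent with the suggestion to budget around 1.5 A per EEM once the rest of the board is included.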

dtcallcock commented 7 years ago

How about giving users the option of using two IDCs? In that case the timing-critical profile selects and RF switches would have dedicated lines. It would also mean that you only have to draw 0.75A per IDC.

Even in the AD9912 case, it's not clear how the RF switches are controlled.

jordens commented 7 years ago

@gkasprow Yes. Output circuitry is allaki directly. No output power sensing. No ad8363 etc. AA filtering (and Nyquist zone selection) to be done with the allaki filters. DDS reset, io update are common, s1-4 and clkmodesel are common and can be resistor jumpers. Pwrdown can be individual resistor jumpers. RF switches should be an individual eem line (maybe an addressable latch with the DDS cs?). Attenuators can be daisy chained with a single cs. More later...
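
Daisy-chaining the four attenuators behind one CS means a single 32-bit transfer carries all four settings, as in the earlier "4x HMC542 to have 32bit data" proposal. A minimal sketch of the packing, assuming 8 bits per attenuator and MSB-first shifting; `pack_attenuators` is a hypothetical helper, and which byte ends up in which device depends on how the chain is wired.

```python
def pack_attenuators(codes):
    """Pack four 8-bit attenuator words into one 32-bit word.

    codes[0] ends up in the most significant byte, so with MSB-first
    shifting it is sent first and lands in the device furthest along
    the chain. The meaning of each 8-bit code is not modelled here.
    """
    if len(codes) != 4:
        raise ValueError("expected exactly four attenuator codes")
    word = 0
    for code in codes:
        if not 0 <= code <= 0xFF:
            raise ValueError("each code must fit in 8 bits")
        word = (word << 8) | code
    return word


if __name__ == "__main__":
    print(hex(pack_attenuators([0x01, 0x02, 0x03, 0x04])))
```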

hartytp commented 7 years ago

@jordens +1 to all of that.

Also FWIW, we'd much prefer the AD9912 to the AD9914. We would also really like this design to run from a single IDC.

hartytp commented 7 years ago

@gkasprow When you produce the prototypes for this board please can we buy 5? We'd like these asap, and don't mind using prototype hardware...

gkasprow commented 7 years ago

@hartytp So we will make at least 6 DDS prototypes? One stays at WUT, the others go abroad. Does anybody else want first prototypes?

hartytp commented 7 years ago

@gkasprow Thanks! That sounds good. Just send me the quote when you're ready.

dhslichter commented 7 years ago

I agree with @dtcallcock; my vision was that you have two IDCs to cover the 4 channels (allowing for profile select, plus fast rf switch open/close).

dhslichter commented 7 years ago

I doubt NIST will be buying any of these boards if the AD9912 is used. That is not intended as an attempt to influence the design, but rather just informational -- it would just be a major step below what we already have in terms of capability. I also think that one could end up making two board designs: an Urukul "basic DDS" board and another board with higher performance (fast switching, higher clock frequency) featuring AD9914s and two IDCs. This of course doubles the design and prototyping work and requires additional funding from someone else. I do think that the market for fast switching and higher output frequencies is substantial, and that it should be an important consideration for developing an ARTIQ hardware ecosystem in general. I recognize that Oxford has their own set of priorities here, and that they are focused on that.

gkasprow commented 7 years ago

@dhslichter We can always make two versions of the DDS board. One simplified, with minimal cost; the second high performance, used with two IDCs.

dhslichter commented 7 years ago

@hartytp just curious, why the "much prefer" for AD9912?

dhslichter commented 7 years ago

@gkasprow you read my mind :)

gkasprow commented 7 years ago

@hartytp @dhslichter @jordens Guys, we can make a variant design. We have some spare area and can place footprints for both DDS chips, one on each side of the board. When you order the board you simply specify the variant. So we have the same PCB and 2 versions of the pick&place files, which result in different board assemblies. Those who want the 9912 will get it mounted with 1 IDC; others who want the 9914 will get the same board but with the other chip and two IDCs. This adds some work, so additional funding would be very welcome. I'd like to buy a 9914 devkit to fully characterise the board.

sbourdeauducq commented 7 years ago

This sounds quite complicated. Can we please keep the Kasli ecosystem simple?

gkasprow commented 7 years ago

@sbourdeauducq This is a low-cost alternative to having two board designs. We can even call them differently :)

hartytp commented 7 years ago

@dhslichter Maybe "much prefer" is too strong, but prefer nonetheless. To explain our priorities:

We are setting up an experiment which will have two ion traps, each with Ca + Sr. We will use Sayma for qubit manipulations, which require pulse shaping etc. For everything else (cooling AOMs etc) we hope to use Urukul, with Novogorny for noise eating (initially in software, but eventually in gateware).

Given that we need 60+ channels of RF for this, both cost and power consumption are important to us (our labs' AC is only 10kW). As the AD9912 can do everything we would want, and is cheaper and less power hungry, we prefer it.

Beyond that, I think there's a lot of advantage to keeping the Kasli ecosystem as simple and low cost as possible and avoid adding features that aren't actually needed.

IMO the profiles on the AD9914 aren't that useful since one can reprogram the DDS over SPI in ~1us, which is fast enough for most cases. Using up two IDCs for this design is a pain: more ribbon cable wiring (or backplane sockets taken up); and reduces the number of these one can run from a Kasli. This is particularly true for our initial experiments, where we want to run as many of these as possible from a single KC705.

Edit: in a similar vein, I'd be pro using the DDS's internal PLLs rather than an external synth IC unless it's really necessary...

sbourdeauducq commented 7 years ago

If we end up with both 9912 and 9914, I would prefer two board designs, and keep each one as simple as possible.

gkasprow commented 7 years ago

@sbourdeauducq The two designs differ by just one chip, and having variants cuts development costs in half. So we don't lose anything. If someone wants to fund a second design and prototype round, I'm open :)

dtcallcock commented 7 years ago

If there were to be two boards it would also allow better optimization for Tom's use case. For example, doubled AD9911 could be used at 1/2 cost and power consumption.

The 4ch AD9959 could also be used to go to 8ch per card for the same cost/power. This would increase development costs though as the two boards would be quite different.

hartytp commented 7 years ago

@dtcallcock If there were to be two boards it would also allow better optimization for Tom's use case. For example, doubled AD9911 could be used at 1/2 cost and power consumption.

I'd prefer to avoid frequency doubling if possible, since it's nice to be able to linearly vary the output amplitude by programming the DDS amplitude.

hartytp commented 7 years ago

Anyway, I don't want to take this conversation too far since we didn't fund this project and it's not our decision. I'm just stating a preference, but in the end we'd buy the boards either way.

jordens commented 7 years ago

@gkasprow The RF switches need to occupy eem lines directly. Plus scki, SDI, sdo, selX, io-update. The rest can go on the shift register. With two DDS per board we could do that with just one eem. Four would need two eem and would mean more SPI bus contention. I am still undecided whether two or four DDS is the best approach here. Loop filter from devkit is good. My guess is that the 10x setting should be default. AFAICT we can have both pll (with e.g. 100 MHz to urukul) and direct drive clock (e.g. 1 GHz) with just the clockmodesel. Simple dumb fanout to the DDS is ok. With inputs from the mmcx, backplane or the front panel. If necessary with capacitor jumpers. I don't think we need a hmc830 here. Clock delays don't matter. The supply sounds good.

@hartytp the current modus operandi is good. All comments and review are highly appreciated.

@dhslichter ad9912 covers our use cases just fine. There are Nyquist images for higher frequencies. Profile mode is not something we need right now. And we can't afford the eem pins. For more features we have sayma. This is very much the 'simple aom' driver with RF switch and attenuators you mentioned.

@gkasprow I am ok with allowing the ad9914 as a bom alternative. But there should not be noticeable compromises. Not in the number of board layers, not in the number of eem connectors required. Not in the power supply budget or board price or RF shielding quality... We'll have one prototype for QUARTIQ then two in the second batch.

@dtcallcock the ad9911 and ad9959 would be great if they did 1gsps, 48 bit ftw.

hartytp commented 7 years ago

The RF switches need to occupy eem lines directly. I am still undecided whether two or four DDS is the best approach here.

In our case, where a large number of channels are needed, 4 DDS per IDC (and ideally, 4 DDS per card) would be much nicer than 2. For example, this allows one to run a fairly complex experiment from the FMCs on a KC705 without having to buy an additional bunch of Kaslis -- this would be particularly helpful until the bugs are ironed out of DRTIO + Kasli support.

Why do the RF switches need to be connected directly to the IDC? What maximum pulse rate do you want to support? I was hoping we could get away with an SPI -> 8 TTL type IC to control them (same bus as the DDS, using CS lines).

gkasprow commented 7 years ago

We can use DDR SCLK and two available LVDS lines to control 4 switches – rising edge controls one switch, falling controls another.

One quad flip-flop would solve the issue.

dhslichter commented 7 years ago

Agreed that a shift register with latching output would be a suitable method of doing the rf switches -- you get the clean timing control with a single latch line. But it does restrict the pulse durations to ~1 us or more. Now, if the goal is a 'simple AOM driver', that is probably fine.
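
The "shift register with latching output" idea can be illustrated with a small behavioural model. This is a sketch of a generic 595-style part, not of any specific chip's timing; the class and function names are invented for illustration. The point it shows is that the outputs only change when the latch is pulsed, so all switches change together however long the serial transfer took.

```python
class LatchingShiftRegister:
    """Behavioural model of a serial-in shift register with output latch."""

    def __init__(self, width=8):
        self.width = width
        self.shift = 0    # internal shift stage, not visible on the pins
        self.outputs = 0  # latched outputs driving the RF switches

    def clock_in(self, bit):
        """Shift one bit in on a clock edge; outputs are unaffected."""
        mask = (1 << self.width) - 1
        self.shift = ((self.shift << 1) | (bit & 1)) & mask

    def latch(self):
        """Copy the shift stage to the outputs in one step."""
        self.outputs = self.shift


def write_switches(reg, state):
    """Send a new switch state MSB first, then latch it."""
    for i in reversed(range(reg.width)):
        reg.clock_in((state >> i) & 1)
    reg.latch()


if __name__ == "__main__":
    reg = LatchingShiftRegister()
    write_switches(reg, 0b00001010)
    print(f"{reg.outputs:08b}")
```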

jordens commented 7 years ago

From our view this is less about pulse rate and more about being able to simultaneously switch and with sufficient time resolution and unencumbered by bus contention. Otherwise we could just zero the ftw and would not really need switches at all.

jordens commented 7 years ago

@hartytp if you want to hook this up to a custom adapter board for kc705 you can bus-share spi and shift-register the switches there. The standardized eem IDC pinout allows it. Would having 2 IDC for 4 dds channels really be too much for you?

gkasprow commented 7 years ago

@jordens @hartytp The LVDS goes up to 150MHz easily, so with an 8-bit shift register one can get an IO toggle period of 100ns or less. Would that be too long a period for switches and profile changes?

hartytp commented 7 years ago

@gkasprow The LVDS goes up to 150MHz easily, so with an 8-bit shift register one can get an IO toggle period of 100ns or less. Would that be too long a period for switches and profile changes?

Agreed. Given that most AOM rise times are at least a few hundred ns, I don't see this as being an issue.

@jordens From our view this is less about pulse rate and more about being able to simultaneously switch and with sufficient time resolution and unencumbered by bus contention

With latching shift registers one can still switch channels simultaneously. This also doesn't limit the timing resolution, just the minimum gap between state changes.

Would having 2 IDC for 4 dds channels really be too much for you?

I suspect it would be quite inconvenient, but not a killer.

if you want to hook this up to a custom adapter board for kc705 you can bus-share spi and shift-register the switches there.

This is an option, but it'd be nice to get away with only the existing passive adapter boards.

Otherwise we could just zero the ftw and would not really need switches at all.

So long as there aren't any noise/cross-talk/clock leakage/etc issues, this might be okay for us.

My preference would still be to use the shift register but, if you think that won't work for you, then how about we connect the switches to the second IDC, and set the pull ups/downs to leave them NO. That way basic functionality (DDS SPI, attenuator etc) can be accessed from a single IDC. Users who want the switches can optionally connect a second IDC.

gkasprow commented 7 years ago

@jordens @hartytp What is the typical control sequence for the DDS chip? Do you need to talk over SPI and update the switches simultaneously, or can you set up the DDS registers once and toggle switches/profiles after that? In the first case we need dedicated data and latch lines, while in the second case we don't need any additional wires since there is one free CS in the address decoder.

dhslichter commented 7 years ago

Typical control sequence is for SPI programming to the DDS, followed by an IO_UPDATE pulse after potentially some delay. For example, if you want to change three DDS chips simultaneously, you program each in turn and then give them all an IO_UPDATE simultaneously. The rf switches might or might not be operated at the same time as an IO_UPDATE. If you are changing frequencies often, you might want to be programming some DDS chips for the next frequency update at the same time that you'd like to change the state of the rf switches, thus the bus contention that @jordens alludes to.
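
The sequence described above (program each DDS in turn, then one shared IO_UPDATE) can be sketched with a mock bus. Nothing here is ARTIQ or AD9912 API: `MockDdsBus`, `spi_write_ftw`, `io_update` and `update_simultaneously` are hypothetical stand-ins that only record staged and active values, to show that the new settings take effect together on the IO_UPDATE pulse.

```python
class MockDdsBus:
    """Hypothetical stand-in for a shared SPI bus plus a common IO_UPDATE."""

    def __init__(self, n_dds=4):
        self.staged = [None] * n_dds  # written over SPI, not yet active
        self.active = [None] * n_dds  # what each DDS is actually outputting
        self.log = []                 # order of bus operations

    def spi_write_ftw(self, channel, ftw):
        """Write a frequency tuning word into one DDS's buffer registers."""
        self.staged[channel] = ftw
        self.log.append(("spi", channel, ftw))

    def io_update(self):
        """Pulse the common IO_UPDATE: all staged values become active."""
        for ch, ftw in enumerate(self.staged):
            if ftw is not None:
                self.active[ch] = ftw
        self.log.append(("io_update",))


def update_simultaneously(bus, ftws):
    """Program several channels one after another, then update them together."""
    for channel, ftw in ftws.items():
        bus.spi_write_ftw(channel, ftw)
    bus.io_update()


if __name__ == "__main__":
    bus = MockDdsBus()
    update_simultaneously(bus, {0: 0x1000, 1: 0x2000, 2: 0x3000})
    print(bus.active)
    print(bus.log)
```

In this picture, an RF switch change that has to happen while the SPI writes are still in progress is exactly the bus contention case mentioned in the thread.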

hartytp commented 7 years ago

The rf switches might or might not be operated at the same time as an IO_UPDATE.

I assumed that you'd generally always do the IO_UPDATE before (after) opening (closing) the switch to avoid glitches.

If you are changing frequencies often, you might want to be programming some DDS chips for the next frequency update at the same time that you'd like to change the state of the rf switches, thus the bus contention that @jordens alludes to.

Assuming that this is just a basic source for cooling AOMs etc, the lengths of the pulses will generally be long compared with the ~1us it takes to reprogram the DDSs via SPI. Thus there should be plenty of time to reprogram the DDSs after the switches have changed states.

But, I do take the point that dealing with shared busses in a real time system is always more complicated for the user.

gkasprow commented 7 years ago

@dhslichter Let's assume that the switches are controlled by a shift register clocked by SCKI (which is active all the time) and that the data are transferred over a dedicated SWITCHES_DAT line. Then the SPI core sets SEL_CSN3..0 = 111 and latches the configuration. In this way we can run both transfers in parallel without conflicts with SPI. The only difference is that SPI returns to SEL_CSN3..0 = 111 instead of 000 after the transfer. The signal assignment would look like this:

LVDS1: SCKI
LVDS2: SDI
LVDS3: SDO
LVDS4: SEL CSN3
LVDS5: SEL CSN2
LVDS6: SEL CSN1
LVDS7: IO_UPDATE
LVDS8: SWITCHES_DAT

hartytp commented 7 years ago

@gkasprow NB the AD9912 max SPI clock is 50MHz, which will limit the speed we can run the shift register at if they're on the same bus.

gkasprow commented 7 years ago

We can transfer the register data on two edges of the SPI clock:)

dhslichter commented 7 years ago

OK, so IO_UPDATE and SWITCHES_DAT are global updates that latch the most recent serial data into the DDS output registers and the rf switch configuration, respectively. I think this works fine; as stated before, the bus arbitration could get complicated, but the problem is basically being pushed onto the end user.

@hartytp @gkasprow the ARTIQ SPI gateware doesn't allow for DDR right now, so that would have to be redone, and I am not sure how rapidly one can change over between SDR and DDR to do e.g. DDS programming, switch changes, back to DDS programming...there is nontrivial time overhead in the current SPI implementation on the FPGA for these things.

hartytp commented 7 years ago

@dhslichter the ARTIQ SPI gateware doesn't allow for DDR right now

Agreed, I expected that we'd just run the bus at single data rate at the maximum clock rate supported by the DDS. 8bits/50MHz=160ns, which is still plenty fast enough for all practical applications I can think of for this kind of hardware (i.e. it's fast compared with the rise time of most AOMs).

the ARTIQ SPI gateware doesn't allow for DDR right now, so that would have to be redone, and I am not sure how rapidly one can change over between SDR and DDR to do e.g. DDS programming, switch changes, back to DDS programming...there is nontrivial time overhead in the current SPI implementation on the FPGA for these things.

I'm not sure the factor of two is worth the complexity of supporting DDR, so this is a bit of a moot point, but... I'd guess that if you did go down that route, you'd just run the core in DDR mode for the DDS as well, and copy each bit twice. That way the DDSs don't notice the difference...
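
The update periods being compared here follow directly from bits per transfer divided by bit rate. A small sketch of that arithmetic, using only the figures from the thread (8-bit register, 50 MHz AD9912 SPI limit, 150 MHz LVDS); the function name is hypothetical, and chip-select and latch overhead are ignored.

```python
def update_period_ns(n_bits, clock_mhz, ddr=False):
    """Time to shift n_bits at the given clock, in nanoseconds.

    With ddr=True two bits are transferred per clock cycle.
    """
    bits_per_cycle = 2 if ddr else 1
    return n_bits / (clock_mhz * bits_per_cycle) * 1e3


if __name__ == "__main__":
    print(f"SDR @ 50 MHz : {update_period_ns(8, 50):.0f} ns")
    print(f"DDR @ 50 MHz : {update_period_ns(8, 50, ddr=True):.0f} ns")
    print(f"SDR @ 150 MHz: {update_period_ns(8, 150):.1f} ns")
```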

hartytp commented 7 years ago

Putting our priorities in more context:

I'd like to be able to set up a Eurorack with

1 x Kasli
2 x Novogorny
4 x Urukul
2 x other EEM, such as BNC or Zotino

Combining this with 2 x RF PA gives a very cost/space/power efficient setup for driving 16 AOMs (inc intensity servo) with enough performance/functionality/flexibility for the majority of our use cases (cooling/optical pumping/etc). Note that, given that we will have up to 60 AOMs in a lab (FWIW, these are also rack-mounted on optical bread boards with fibre inputs/outputs), space efficiency is a real priority for us, which is why we're keen to stick with 4 DDS per IDC if possible!

Initially, we'd do noise eating in software using the gateware/software that's already been funded. In the longer term, we'd want a Kasli servo gateware that used Novogorny in continuous sample mode (125kSPS/channel).

We would use Sayma for qubit-type stuff only. In most cases, only 1-2 Sayma will be required for an experiment. In this case, we'd hope to do away with the cost/complexity of uTCA chassis + Metlino by using Kasli as the ARTIQ master and mounting the Sayma in Greg's 1 unit chassis.

We'd only fallback to uTCA chassis/Metlino for really complex experiments that required many SAWG channels or (to be funded/designed) a lot of fast DAC channels for shuttling.

hartytp commented 7 years ago

Thinking about this a bit more, I feel that putting the switches on the same SPI bus as the DDSs is a bad idea: in the long run, we'd like to use these boards in a servo loop (feeding back on an input to Novogorny). For this, the servo gateware will update the DDSs continuously at max rate. Doing that while sharing the SPI bus with the switches would be really nasty.

In a similar vein, to make this practical, we need to keep the time between DDS updates down to a few us. Assuming we don't want to run the DDS SPI bus as DDR (needs extra FFs), that will probably mean limiting us to 2 DDS max per SPI bus (note to self: check the timings for this).
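
As a first pass at the timing check mentioned above, the sketch below estimates how long one round of frequency updates takes on a shared single-data-rate SPI bus. The 64 bits per write (a 48-bit tuning word plus a 16-bit instruction) is an assumption about the AD9912 transfer format, the 50 MHz clock is the limit quoted earlier in the thread, and chip-select gaps and the IO_UPDATE pulse are ignored, so real numbers will be somewhat larger. The function name is hypothetical.

```python
def dds_round_time_us(n_dds, bits_per_write=64, spi_clock_mhz=50.0):
    """Time to write one tuning word to each of n_dds chips in turn.

    bits_per_write=64 assumes 48 data bits plus 16 instruction bits;
    bits divided by MHz gives microseconds.
    """
    return n_dds * bits_per_write / spi_clock_mhz


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(f"{n} DDS per bus: {dds_round_time_us(n):.2f} us per round")
```

Under these assumptions two DDS per bus gives roughly 2.6 us per round and four gives roughly 5.1 us, which is in line with the suggestion that two per bus may be the practical limit for a servo.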

gkasprow commented 7 years ago

@hartytp Can we control the DDS IO_UPDATE together with the switches from the same register? Two lines should be enough to control such a register with an independent data rate.

gkasprow commented 7 years ago

The solution is fairly simple: you transmit DDR data to the register, running e.g. with a 150MHz clock. On the rising edge you transfer the data bit; on the falling edge you transfer the latch state. So one shift register (e.g. AHC595) + 1 flip-flop would do the job. On the FPGA side it would be a trivial state machine with IODDR.
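
One way to read this scheme is that a single data line carries two things per clock cycle: the bit sampled on the rising edge goes into the shift register, and the bit sampled on the falling edge is the latch enable. The simulation below is a sketch of that interpretation only; the function name and the exact latch timing are assumptions, and it does not model a real AHC595.

```python
def run_ddr_stream(samples, width=8):
    """Simulate a shift register fed by a DDR-multiplexed data line.

    samples: list of (rising_bit, falling_bit), one pair per clock cycle.
      rising_bit  is shifted into the register on the rising edge,
      falling_bit is the latch state captured on the falling edge;
                  1 copies the shift stage to the outputs.
    Returns the output value after each clock cycle.
    """
    mask = (1 << width) - 1
    shift = 0
    outputs = 0
    history = []
    for data_bit, latch_bit in samples:
        shift = ((shift << 1) | (data_bit & 1)) & mask  # rising edge
        if latch_bit:                                   # falling edge
            outputs = shift
        history.append(outputs)
    return history


if __name__ == "__main__":
    bits = [0, 0, 0, 0, 1, 1, 1, 1]   # new state, MSB first
    latch = [0] * 7 + [1]             # latch only after the last bit
    print(run_ddr_stream(list(zip(bits, latch))))
```

In this model the outputs hold their old value for the first seven cycles and change in one step on the eighth, so the data rate on the line and the moment of the update are controlled independently.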

sinara-hw / sinara

3U DDS (URUKUL) discussion #191