Clock recovery in Sinara/ARTIQ

hartytp commented 5 years ago

Moving from https://github.com/sinara-hw/sinara/issues/515#issuecomment-416786782

Putting out our thought process so far in detail for anyone who's interested:

Current situation in ARTIQ:

we use the Si5324 for clock recovery
That originally seemed like an excellent choice (and, in many ways, it was an excellent choice) as it is extremely flexible and easy to use and has impressively low noise for an IC of its type
However, the Si5324 does not have deterministic input-output latency, and we've observed quite bad (around 100ps pk-pk) phase fluctuations in the output. This basically rules it out for any use-case that requires time distribution to much better than 1ns, such as generating data converter clocks

So, we are considering implementing something closer to the CDR element of White Rabbit, using the DDMTD technique. Comments about this:

because of the relatively high noise floor of the FPGA CDR circuits, the BW for the PLL needs to be quite low (<100Hz). This makes a traditional analog PLL basically impossible because component values are not practical.
One approach would be to use an analog PFD + an ADC to digitize the signal, and some digitally-controlled crystal oscillator (DCXO) with a digital loop filter on an FPGA. Calculations suggest that would work fine, but it's a bit complex/expensive. Instead, we plan to follow WR and use the FPGA as the phase detector using the DDMTD approach developed by CERN. Measurements made by CERN and repeated by us suggest that the performance of the two techniques will be similar (although, we only really have data on noise for this, not on longer-term stability -- that's still to do)
Then the question is "what kind of DCXO to use"?
Traditional WR uses a VCXO + DAC. We didn't like that for two reasons:
- firstly, VCXOs are generally much (20dB or more) worse than comparably-priced XOs. And, that's taking the data sheet values, which are measured using a battery to provide the control voltage and in a controlled environment. In practice, some degradation of this noise floor is inevitable once the VCXO is included in a closed-loop system
- secondly, we want to use the CDR PLL in a "hostile EMI environment" such as Kasli, where avoiding spurs/broad-band noise due to pickup on the tuning input is quite challenging unless one adds screening etc
From these considerations, the optimal approach is probably to use a high-quality XO at something like 1GHz to clock a DDS, then tune the DDS output frequency to lock the RF phase. However, this ends up being quite bulky and expensive, so we decided against it
A close second is to use a 2-loop PLL DCXO architecture like the one in the Si5324. Here, a microwave VCO, typically running at 5GHz or so, is phase locked to an XO with a wide bandwidth. The fractional part of that phase lock is tuned to lock the output RF to the reference.
Chips like the Si549 and LMK61e07 are great for this, as they offer the XO, VCO and PLL in a single screened package, including power filtering + LDOs. Moreover, (in principle at least) the IC designers have taken care to minimize digital cross-talk, making life easier for the circuit designer.
The LMK61e07 looked like a better choice in terms of noise/availability, but we ended up ruling it out because it seems to have bad digital-RF cross-talk. The Si549 does not
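
For readers unfamiliar with DDMTD, the basic arithmetic behind the phase detector can be sketched as below. This is only an illustration, not the ARTIQ or White Rabbit implementation: it assumes the helper clock is offset from the main clock by a ratio N/(N+1), and both the function name `ddmtd_parameters` and the value N = 2**14 are assumptions chosen for the example.

```python
def ddmtd_parameters(f_clk_hz, n):
    """Return (helper frequency, beat frequency, time step) for a DDMTD.

    Assumes the helper clock runs at f_clk * n / (n + 1), so sampling the
    main clock with it produces a beat note at f_clk - f_helper.
    """
    f_helper = f_clk_hz * n / (n + 1)
    f_beat = f_clk_hz - f_helper  # equals f_clk / (n + 1)
    # Each helper period slips by T_helper - T_clk relative to the main clock;
    # this slip is the smallest phase step the detector can resolve.
    step_s = 1.0 / f_helper - 1.0 / f_clk_hz
    return f_helper, f_beat, step_s


# Illustrative values only: 125 MHz clock, N = 2**14 (both assumptions).
f_helper, f_beat, step = ddmtd_parameters(125e6, 2**14)
print(f"helper = {f_helper:.1f} Hz, beat = {f_beat:.1f} Hz, step = {step * 1e12:.3f} ps")
```

The low beat frequency is what lets a slow digital loop (well under 100Hz bandwidth) act on a phase measurement with sub-ps granularity.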

So, our plan is:

Try using DDMTD + Si549 for our clock recovery
Measure phase stability and see if it's really good enough to distribute data converter clocks
if the phase stability is good then integrate this into Kasli. NB probably worth keeping the Si5324 as well since it's easier to do things like change RTIO frequency using that chip (maybe population options to switch between the two, depending on cost)

hartytp commented 5 years ago

NB to do: dig out the WR closed-loop phase noise plots and compare to Weida's data. IIRC, WR is quite a bit worse (partially because all the WR VCXOs I've seen used are a lot noisier than the Si549 for the reasons discussed above).

hartytp commented 5 years ago

NB also:

Mouser has some options for the Si549 in stock that would work fine AFAICT: https://www.mouser.co.uk/Silicon-Laboratories/Passive-Components/Frequency-Control-Timing-Devices/Oscillators/Programmable-Oscillators/Si549-Series/_/N-7jdx1?P=1y98pm2Z1z0zp1d
cost in quantity is only £10-£20 depending on the version, so not that expensive compared with other options
availability seems at least as good as Si571 etc, but it's a better solution AFAICT

@gkasprow

But adding little board with vcxo + DAC to Kasli may be simpler approach.

See my post above for the reasons I favor the DCXO to a VCXO + DAC.

dtcallcock commented 5 years ago

If the Si549 scheme continues to work out well, is the plan still to implement it in Kasli 2.0?

hartytp commented 5 years ago

@dtcallcock good question. I was putting off thinking about that until we knew how well it worked. But, I think we have enough data now to show that the performance of a WR PLL would be good enough to be useful for many applications.

At this point, I think we can say that there will be a version of Kasli with the WR PLL in. Whether that's (a) the "standard" version of kasli (b) a population variant (c) a fork of Kasli depends on interest from users.

We also have to decide what we want to do about the Si5324. My preference would be to keep it in Kasli as well but have some means of switching between them. The rationale for this is that -- even if we do a good job of integrating WR with ARTIQ/misoc and producing design tools (e.g. a python loop filter design tool) -- the Si5324 is always going to be easier to get running, so it's good to have it there as a fallback option.

We need a fanout after the Si549 anyway, so there isn't much cost making it a (cheapish LVDS) mux and connecting the Si5324 to the second input (it can always be DNPd in any case).

hartytp commented 5 years ago

NB Weida has made some progress getting the TI chip working (https://e2e.ti.com/support/clock_and_timing/f/48/t/723417#). The noise for that chip should be the same as the Si549, but the documentation/design tools aren't as good so it's harder to find the optimal configuration.

So, it's a bit of a trade-off with the Si549 being less readily available, but much nicer to use. Thankfully, this isn't such a big deal since the two chips are close enough to being footprint + pin compatible that we should be able to use the same layout for both with some resistor jumpers on low-speed signals.

hartytp commented 5 years ago

Does anything say "preliminary data" more than a photograph of a plot on a monitor?

[Image: photograph of a plot on a monitor]

Anyway, here is a measurement of the stability of WR using the Si549. As above, we're using the recovered clocks from two independent KC705s. Now we're beating them together on a mixer as a phase comparator.

normalization needs double checking
we see < 1ps pk-pk over a few hours. Rms is <300fs (!) So, this is definitely good enough for quite a few applications
the data was taken in "office" conditions with no active humidity or temperature control. However, the two KC705s were within a metre of each other so most environmental fluctuations will be common mode.
to do: unplug the fan on one KC705 to measure the temp-co of the DDMTD
this setup is a hack of multiple eval boards, baluns, clockers, flexible (not phase stable) coax, etc. Based on previous experience, I would expect to see something of the same order of magnitude even if the DDMTD phase detector were perfect. Once we have WR integrated with Kasli we'll do a more careful measurement and also measure the interferometer stability.
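
The conversion from mixer output to the pk-pk and rms figures quoted above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis script we used: it assumes the mixer is operated near quadrature so the output voltage is proportional to the phase difference, and the names `mixer_voltage_to_time`, `rms_and_pk_pk`, the sensitivity `k_phi_v_per_rad` and the sample voltages are all hypothetical. The sensitivity is exactly the "normalization" that needs double checking.

```python
import math


def mixer_voltage_to_time(voltages, k_phi_v_per_rad, f_carrier_hz):
    """Convert mixer output voltages (near quadrature) to time offsets in seconds."""
    mean_v = sum(voltages) / len(voltages)
    # Small-angle approximation: V = k_phi * delta_phi around the quadrature point.
    scale = 1.0 / (k_phi_v_per_rad * 2 * math.pi * f_carrier_hz)
    return [(v - mean_v) * scale for v in voltages]


def rms_and_pk_pk(samples):
    """Return (rms about the mean, peak-to-peak) of a list of samples."""
    mean = sum(samples) / len(samples)
    rms = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
    return rms, max(samples) - min(samples)


# Made-up voltages for illustration only; k_phi = 0.5 V/rad is an assumption.
dt = mixer_voltage_to_time([0.0, 1e-4, -1e-4, 0.0], 0.5, 125e6)
rms, pk_pk = rms_and_pk_pk(dt)
print(f"rms = {rms * 1e15:.0f} fs, pk-pk = {pk_pk * 1e15:.0f} fs")
```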

hartytp commented 5 years ago

@jordens what are your thoughts about Kasli integration? Any objections to (a) making this a population variant on the mainline Kasli branch (b) making it a "default" population option?

hartytp commented 5 years ago

NB this is on the basis that we will commit to

integrating the WR PLL with misoc/artiq and supporting it
providing optimized configuration settings (LF config etc) for a handful of potentially relevant frequencies
providing some basic python tools to help users pick settings for other frequencies

jordens commented 5 years ago

Is risk the only downside of making it the default? Then let's do some more careful analysis, reviews and tests and go for it. And the loss of convenient loop filter configuration...

hartytp commented 5 years ago

Is risk the only downside of making it the default?

By "default" I mean, populating it by default.

For the time being, my proposal would be to keep both the Si5324 and the WR-PLL. My thinking is that we will need a fanout after the DCXO, so there isn't really any cost in making that fanout a mux. At that point the selection between the two clocking options can be done in software.

So, the only downside is the extra cost associated with the PLL.

Eventually, one might want to DNP the Si5324 to save money, but I'd argue that we should keep it around until we have a decent amount of experience with the WR PLL (no matter how careful one is on the test bench, there is always room for unexpected things to come up in the field).

hartytp commented 5 years ago

And the loss of convenient loop filter configuration...

So long as we do a decent job of the design tool, that shouldn't be an issue. If one isn't trying to scrape the last few dB of noise out, and only wants a simple type-II loop filter then the design tool can be pretty simple and intuitive.
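
To give an idea of how simple such a tool could be for the type-II case, here is a minimal sketch. It is not the planned ARTIQ design tool: it assumes a proportional-integral loop filter F(s) = kp + ki/s, a linear phase detector with gain `k_pd` and an oscillator that integrates its control word into phase with gain `k_osc`. The function name and all parameters are hypothetical, and the natural frequency is used as a rough proxy for the loop bandwidth.

```python
import math


def design_type2_pi(loop_bw_hz, damping, k_pd, k_osc):
    """Return (kp, ki) for a loop filter F(s) = kp + ki / s.

    With open-loop gain K * F(s) / s, where K = k_pd * k_osc, the closed-loop
    denominator is s**2 + K*kp*s + K*ki. Matching it to the standard
    second-order form s**2 + 2*zeta*wn*s + wn**2 gives the expressions below.
    """
    k = k_pd * k_osc
    wn = 2 * math.pi * loop_bw_hz  # natural frequency, used as a bandwidth proxy
    kp = 2 * damping * wn / k
    ki = wn ** 2 / k
    return kp, ki


# Illustrative numbers only: 100 Hz loop, damping 0.707, unity gains.
kp, ki = design_type2_pi(100.0, 0.707, 1.0, 1.0)
print(f"kp = {kp:.1f}, ki = {ki:.1f}")
```

A real tool would also have to account for the update rate and quantization of the digital loop, which this sketch ignores.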

jordens commented 5 years ago

What about the recovered clock output connection from the FPGA GTs? Is that another fanout? And the external reference input? Another one? And we need to make sure that we can turn off both oscillators individually. Otherwise the mux isolation will be problematic.

hartytp commented 5 years ago

What about the recovered clock output connection from the FPGA GTs? Is that another fanout?

You mean CLK_REC on Kasli? I'd have to double check, but I don't think we alter that connection at all since it's not needed for the WR implementation (the connection to the DDMTD input is internal to the FPGA).

And the external reference input?

The easiest thing would be to route the external reference directly to a differential input on the FPGA. Then route an FPGA output to the Si5324 input. That way the FPGA can do the muxing as required. You'll pick up a bit of high-frequency jitter, but nothing the Si5324 can't clean up (i.e. a lot less than you get from the recovered clock). In any case, I don't see that being a problem since the Si5324 isn't really suitable for generating noise/drift-critical clocks anyway due to the known issues with it.

Having said that, one might want a clock buffer between the SMA and the FPGA to provide some protection against abuse from users. If we go down that path, it could be a 2-output LVDS fanout. But, that's optional.

And we need to make sure that we can turn off both oscillators individually.

IIRC (but it's something I'd need to double check before finalizing the designs) the DCXO and the Si5324 can both be shut down in software. Otherwise, we can add a FET on their power lines.

So, essentially yes, the points you raise are all good and there are definitely some implementation details that need to be thought through before we commit to anything.

hartytp commented 5 years ago

Well, probably the best thing is not to worry too much about "default" population options. So long as no one objects to having the pads for the WR circuitry and maybe having to add an extra mux, the best solution is probably to agree to add it to the design and make a decision once we've got ARTIQ up and running with it.

hartytp commented 5 years ago

Trying to wrap up measurements on our test setup this week. Here is some more data:

more careful measurement of phase stability using a mixer and two independent KC705s + Si549s
yellow and red curves are measurements of our WR PLL setup. The differences are small variations in how we do the clocking, see block diagrams below. We made this measurement because we needed to take some baluns back for another measurement, so swapped them for power splitters. If the stability were limited by the DDMTD then the yellow and red curves should be identical
blue curve is a null measurement checking the stability of the mixer. note that this setup does not include the mess of cabling, baluns etc in the DDMTD measurement, so it's not directly comparable
RMS phase stability for the two DDMTD setups is 700fs and 400fs, reference measurement is 70fs
the fact that one of the DDMTD measurements has significantly worse stability suggests that a significant proportion of the instability we measure is due to our measurement setup. Not surprising since we have several meters of standard (not phase-stable/t&m grade) coax and a bunch of baluns + splitters etc. Anyway, it hints that the DDMTD stability is likely good enough that it is non-trivial to make a lab setup that is limited by it! It also suggests that we can expect to measure better stability once we integrate this into Kasli/Sayma and do away with the rat's nest
also shown is the modified Allan deviation

[Image: jitter measured between the two DDMTD outputs on the two setups at 125MHz]

[Attachment: Block.pdf]

[Image: modified Allan deviation between the two DDMTD setups and on the reference channel]
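
For anyone wanting to reproduce the modified Allan deviation from their own phase data, the standard estimator can be sketched as below. This is a plain-Python illustration, not the code used to make the plot above; the function name and argument names are hypothetical. It takes time-error samples x (in seconds) on a uniform grid with spacing tau0 and evaluates the deviation at tau = m * tau0.

```python
import math


def modified_allan_deviation(phase_s, tau0_s, m):
    """Modified Allan deviation at tau = m * tau0 from time-error samples (seconds)."""
    n = len(phase_s)
    n_terms = n - 3 * m + 1
    if m < 1 or n_terms < 1:
        raise ValueError("need m >= 1 and at least 3*m phase samples")
    tau = m * tau0_s
    total = 0.0
    for j in range(n_terms):
        # Sum of m consecutive second differences, each taken with stride m.
        inner = sum(
            phase_s[i + 2 * m] - 2 * phase_s[i + m] + phase_s[i]
            for i in range(j, j + m)
        )
        total += inner ** 2
    return math.sqrt(total / (2 * m ** 2 * tau ** 2 * n_terms))


# Sanity check with synthetic data: a pure linear frequency drift D gives
# a deviation of D * tau / sqrt(2). The drift value is made up.
drift = 1e-9
x = [0.5 * drift * i ** 2 for i in range(100)]
print(modified_allan_deviation(x, 1.0, 4), drift * 4 / math.sqrt(2))
```

The loop is O(N·m), which is fine for a sanity check but slow for long records; a dedicated library would be the better choice for real analysis.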

hartytp commented 5 years ago

[Image: 20181002_163525840_ios]

Integrated jitter measurement. 1.5ps between 1Hz and 100MHz. NB this is a measurement of absolute jitter and so includes a significant contribution from the wenzel oscillator we use as a reference
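
For reference, an integrated jitter number like this is obtained by integrating the single-sideband phase noise L(f) over the offset range and converting the resulting rms phase to time. The sketch below shows the arithmetic only; it is not how the instrument computes it. The function name is hypothetical, the integration is a simple trapezoid rule on linear power values (which overestimates if the points are sparse on a steep slope), and the flat -150dBc/Hz input is made up.

```python
import math


def integrated_jitter_s(freqs_hz, l_dbc_hz, f_carrier_hz):
    """RMS jitter in seconds from single-sideband phase noise L(f) in dBc/Hz."""
    linear = [10 ** (level / 10) for level in l_dbc_hz]
    area = 0.0
    for k in range(len(freqs_hz) - 1):
        # Trapezoid rule between adjacent offset frequencies.
        area += 0.5 * (linear[k] + linear[k + 1]) * (freqs_hz[k + 1] - freqs_hz[k])
    phase_rms_rad = math.sqrt(2 * area)  # factor 2 accounts for both sidebands
    return phase_rms_rad / (2 * math.pi * f_carrier_hz)


# Made-up example: flat -150 dBc/Hz from 1 Hz to 100 MHz on a 125 MHz carrier.
jitter = integrated_jitter_s([1.0, 1e8], [-150.0, -150.0], 125e6)
print(f"{jitter * 1e15:.0f} fs rms")
```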

gkasprow commented 5 years ago

That looks very good! I'm just curious where the 170Hz peak comes from. The next peak probably comes from the SMPS.

hartytp commented 5 years ago

I'm just curious where the 170Hz peak comes from. The next peak probably comes from the SMPS.

@weidazhang

FWIW this setup is a mess of eval boards with a large loop area for pickup and somewhat poor grounding. I expect some of those spurs to vanish once we put it all on a pcb...

WeiDaZhang commented 5 years ago

It seems the 170Hz peak is largely induced by the beating setup (amp + mixer + filter), or more likely its cables and wires. If we remove them and leave the 2 x (KC705 + DDMTD + DCXO) running alone, the 170Hz peak disappears.

[Image: 20181008_154740953_ios]

WeiDaZhang commented 5 years ago

Well, some other spikes come in.

hartytp commented 5 years ago

@gkasprow so, here is the Sinara-WR proposal in full:

Main DCXO. Layout should support Si570, Si549 and LMK61e07. These are all footprint compatible, but not pin compatible. From a quick look, it seems that a few 0R jumpers should allow the same pads to be used for all DCXO choices. NB these are the only DCXOs we've found with close to the correct performance; Si570 is cheapest but lowest performance option; Si549 should be default population option; we need to chase the TI engineers to see why the LMK chip doesn't meet data sheet spec (see above link); if we get the LMK IC to work as expected then that would be the top choice due to better availability and potentially lower noise.
Helper DCXO (clock for the DDMTD DFFs). Any of the above DCXOs, use whichever is cheapest (is it more cost effective to save a BOM line by using a Si549 as well or is it cheaper to use a Si570?)
Main DCXO feeds a 2-input mux, whose other input is the Si5324. This mux must have low close-in phase noise, probably an ADCLK948.
An output from this mux goes to a MGTREF to clock the transceivers
An output from this mux goes to a differential input for the DDMTD DFF
Other outputs go to the EEM connectors/BP/etc
External reference SMA goes to clock-capable differential input for the other DDMTD DFF when this is the WR master
Helper PLL DCXO goes to a cc pin. Try to keep the helper PLL clock and both DDMTD inputs as physically close (adjacent pins) as possible to minimize routing inside the FPGA (don't want to use global clock network if we can avoid it)
Recovered clock output routed to an Si5324 input (this can be internally routed to a DDMTD DFF when WR is used)
Other Si5324 input driven by an FPGA output. This can be used, for example, to route the SMA clock to the Si5324. NB this shouldn't affect the noise since the Si5324 acts as a jitter cleaner. And, in any case, the problems with the Si5324 prevent it being used when clock quality is absolutely crucial.
Need fully independent I2C buses for both DCXOs, which must update at 1MHz clock rate
@WeiDaZhang can you double check that the Si549 can be fully shutdown (no RF output) via software? If not, let's add a FET switch on the power line
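
As a rough guide to why the DCXO buses need to be dedicated and fast, the time taken by one frequency update over I2C can be estimated as below. This is a back-of-the-envelope sketch: the function name is hypothetical, and the 3-byte tuning word and single register-address byte are assumptions for illustration that should be checked against the data sheet of the chosen DCXO. Clock stretching and bus idle time are ignored.

```python
def i2c_write_time_s(n_data_bytes, scl_hz, n_addr_bytes=1):
    """Approximate duration of one I2C register write.

    Counts 9 SCL cycles per byte (8 bits + ACK) for the device address, the
    register address byte(s) and the data bytes, plus one cycle each for the
    START and STOP conditions.
    """
    n_bytes = 1 + n_addr_bytes + n_data_bytes
    n_cycles = 9 * n_bytes + 2
    return n_cycles / scl_hz


# Assumed: 3-byte tuning word written at a 1 MHz SCL rate.
t = i2c_write_time_s(n_data_bytes=3, scl_hz=1e6)
print(f"{t * 1e6:.0f} us per update, {1 / t / 1e3:.1f} kHz maximum update rate")
```

Any other traffic on the same bus eats directly into this update rate, which is the motivation for keeping the buses fully independent.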

@gkasprow @sbourdeauducq how does that sound? Any questions? Do we have enough pins to implement this on Kasli?

hartytp commented 5 years ago

LMK61e07 only seems to reach the specified noise performance for some fractional divider ratios. So, when used in a feedback loop the noise can be quite a bit worse than the data sheet indicates. As a result, we don't think it's suitable for WR, although we haven't conclusively ruled it out yet (need to chase TI to get them to confirm that the numbers on the data sheet are not representative).

WeiDaZhang commented 5 years ago

Regarding item 12 of the Sinara-WR proposal by @hartytp: the Si549 has an OE pin on most of its package types (indicated by the suffix), including all the package types we can buy from Mouser.

gkasprow commented 5 years ago

OK, I planned to release Sayma and Metlino schematics for review today, but I need a few more days to implement these changes. I simply added the WR oscillators as an assembly option. Is that OK @jbqubit? @hartytp I will look at it tomorrow.

hartytp commented 5 years ago

OK, I planned to release Sayma

Wow! Are all issues on Sayma fixed already?

hartytp commented 5 years ago

Is that OK @jbqubit? @hartytp I will look at it tomorrow.

Sorry, what's the question here? Is what okay?

gkasprow commented 5 years ago

@hartytp We agreed with Joe that I will release schematics today :)

gkasprow commented 5 years ago

@hartytp do you care that much about oscillator output type? The Si549 with CMOS output (549CBAC000112ABG) is 3x cheaper than one with LVDS/LVPECL output (549BACB001937ABG). We don't care about initial stability, do we?

hartytp commented 5 years ago

@hartytp do you care that much about oscillator output type? The SI549 with CMOS output ( 549CBAC000112ABG)

@gkasprow hmm...in the data sheet they only give the phase noise for the LVDS and LVPECL options. The jitter specification for the LVCMOS output is about twice as bad, so I guess it's something like 6dB on the phase noise. Whether that's worth the money is really up to what users want.

Let's make sure we support both LVCMOS and LVPECL as population options. I'd be happy using the cheaper LVCMOS option as the default variant for now. For testing, we can try both options.

We don't care about initial stability, do we?

No, initial stability isn't important.
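
The "about 6dB" estimate above follows from the usual amplitude-ratio conversion, assuming the phase noise spectrum keeps the same shape and only scales. A one-line check (the function name is hypothetical):

```python
import math


def jitter_ratio_to_db(ratio):
    """Phase-noise change in dB for a given ratio of rms jitter (same noise shape)."""
    return 20 * math.log10(ratio)


print(f"{jitter_ratio_to_db(2):.2f} dB")
```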

hartytp commented 5 years ago

Couple of other comments about this:

If we really need to save pins then it's probably okay to only route the DCXO to the MGTREF input and not to a separate DIFF input. The MGTREF can be routed to a DFF data input internally to the FPGA. We'd prefer to avoid doing that if possible since it requires routing a critical signal through a longer path inside the FPGA than going directly to an IOB DFF. However, we've done some initial testing with a partially loaded FPGA, which did not show a significant difference (not more than 3dB) between the two paths. Still, since we're building prototypes at the moment, it makes sense to keep all options open, which is why I'd like the DIFF input as well if possible.
The "default" configuration will be to run WR at 125MHz, with the master reference (from the SMA) also at 125MHz. We've also done some tests using a PLL inside the FPGA to allow a 100MHz reference to be used with a 125MHz WR frequency. This seems to work with no significant performance degradation (both in terms of long-term stability and phase noise). However, again, we only have limited data for this, so can't guarantee this will be okay in all conditions/FPGA loadings.

gkasprow commented 5 years ago

@hartytp why do we need double connection between FPGA and input of Si5324? Here is the clock recovery schematic.

[Attachment: CLK_recovery.pdf]

[Image: FPGA clock connections]

[Image: Control signals]

hartytp commented 5 years ago

@hartytp why do we need double connection between FPGA and input of Si5324?

You're right, we probably don't need both Si5324 connections. So long as we can route the recovered clock to the Si5324 as well as the SMA input clock then it's fine.

NB the SMA clock must go to the DDMTD without passing through the Si5324. So, we have two options: (1) connect the SMA clock directly to the FPGA and then use the FPGA to route it to the Si5324 when needed (this is currently my preferred option) (2) add a fanout buffer.

gkasprow commented 5 years ago

On Sayma we have one SMA connector for clock which is used as output of clean clock for Sayma RTM. We have also 2 general purpose SMA IOs, but they are not that fast and have high jitter.

hartytp commented 5 years ago

On Sayma we have one SMA connector for clock which is used as output of clean clock for Sayma RTM.

For the DRTIO master one usually needs a high-quality external clock.

As you say, the GPIO are not suitable due to the clock buffers used.

So, we have two obvious choices: do not use Sayma as master with external clock; connect one SMA to a clock buffer.

@WeiDaZhang @sbourdeauducq Please can you review the schematics Greg posted?

@gkasprow I'll try to review tomorrow. One point for now: are you sure you want to use the low-noise 3V3 rail for the I2C pull-ups? Isn't there too much risk of coupling noise?

gkasprow commented 5 years ago

I was thinking about it. I'm not sure if DCXOs are resistant to noise that enters via their I2C interface if we connect it to 3V3. To mitigate it I can connect them via RC filter to P3V3.

gkasprow commented 5 years ago

We will have Metlino that will function as DRTIO master.

gkasprow commented 5 years ago

on Sayma we have UFLs at the input of Silabs and FPGA so can connect external SMA with splitter when needed, using pigtail.

hartytp commented 5 years ago

I was thinking about it. I'm not sure if DCXOs are resistant to noise that enters via their I2C interface if we connect it to 3V3. To mitigate it I can connect them via RC filter to P3V3.

They do have internal LDOs and power filtering, so should be quite robust, but it's still best to be careful about this. Extra RC filtering sounds good.

hartytp commented 5 years ago

We will have Metlino that will function as DRTIO master.

Yes, I do not expect Sayma to be used as the DRTIO master often (we also have Kasli for that), but it might be useful during testing.

on Sayma we have UFLs at the input of Silabs and FPGA so can connect external SMA with splitter when needed, using pigtail.

Sounds good. Although, I'd be tempted to use MMCX for this instead of UFLs. UFL is not robust and doesn't endure many mating cycles, so it's fine for the occasional debugging, but not for regular use. We can always DNP the MMCXs later on to save cost.

gkasprow commented 5 years ago

OK, will use MMCX instead. Do you think it is necessary to mount active splitter or use passive one?

hartytp commented 5 years ago

Where do you need a splitter?

gkasprow commented 5 years ago

to split clock between SI5324 chip and FPGA.

hartytp commented 5 years ago

Do we need that? What about just routing the external clock to the FPGA? Then, if required, the FPGA can route that clock to the Si5324. The performance shouldn't be affected since the Si5324 is very good at removing added jitter (and, anyway, the SI5324 cannot be used in applications which require very good clocks).

gkasprow commented 5 years ago

OK, but take into account that the jitter is high, on the order of tens of ps.

hartytp commented 5 years ago

Yes, but that's no different from when the SI5324 is used with CDR. And, anyway, almost all of that jitter is removed by the SI5324 so I don't think it's a problem. Others should feel free to comment on this design decision.

gkasprow commented 5 years ago

That's true :) I forgot that we are already using the FPGA to deliver a dirty clock :D

hartytp commented 5 years ago

Comments:

let's connect the external clock input to the FPGA via a suitable balun like TCM2-43X+ that goes down to 10MHz, but also works well for clocks with fast rising edges
Why is the XTAL net labelled SI5328?
What is the XTAL PN? Should be annotated on the schematic and should match Kasli
let's add the usual test points on power, I2C etc
on Kasli we didn't bother connecting SI5324 INT_C1B. Do we need to connect this?
can the FPGA handle the LVPECL signals directly, or do we need to add some attenuation?
where are you sourcing the SI549 from? And, just double checking, but are you sure that the LVCMOS model is really that much cheaper than the LVDS/CML/LVPECL models? Also, are you able to find this in stock from somewhere?
NB the temperature stability of the chip is important, since it allows air currents to lead to phase drifts in the DCXO output. Since the WR loop BW isn't that high, we want to make sure the temp co is reasonably low. You've chosen the B-grade (10ppm) SI549, which is a good choice IMHO: the 7ppm C-grade only offers a small improvement.
AFAICT, the Si570 can be used as a replacement for the Si549 in this design. Is that correct? If so, please add an annotation to that effect. The only thing that needs to change for the SI570 AFAICT is that the OE is pin 2, so let's add 0R resistors that are used to connect OE to either pin 1 or 2.
IIRC the Si570 is cheaper than the Si549 and is suitable for the helper DCXO. Did you decide that it's more important to minimise the number of BOM lines than to use cheaper components in this design? (I don't mind, I'm just curious.)
let's have a pair of UFL connectors on the helper PLL to aid debugging
Do you need separate footprints for the two DCXOs? Aren't they footprint compatible (@gkasprow please check and confirm this), so a single footprint can be used for both?
why is there a 0R resistor to bypass the ac coupling cap on the helper DCXO output, but not for the main DCXO? Is this really necessary for either?
@gkasprow the schematic symbol you have there is only compatible with the LMK61E2, not with the LMK61E07 AFAICT. This should be noted on the schematic. @WeiDaZhang are you happy with this oscillator, or do you prefer the LMK61E07?
@gkasprow @sbourdeauducq @WeiDaZhang what are your thoughts about which FPGA pins to use for the WR PLL? Ideally, I think we should keep the DCXO inputs as close to the transceiver clock recovery circuitry as possible to keep the two DDMTD FFs close together.
@gkasprow the schematics you posted don't show the FPGA MGTREF clocks. For Sayma and Metlino, someone must check that we drive the required MGTREFs so that all transceivers can be clocked correctly (IIRC there are some quite tight constraints about how the transceivers must be clocked).
schematic cosmetics need some work.

Anyway, other than that, all looks good to me!

hartytp commented 5 years ago

@gkasprow one other question: does anything else share the I2C bus with the SI5324? If not then we should connect the helper DCXO to the same I2C bus as the SI5324 (but make sure there is not an address clash). We will never use the Si5324 and WR in the same design, so this I2C bus can be shared to save some pins.

If there is anything else on the same I2C bus as the Si5324 then this won't work, since the helper DCXO cannot share a bus with any active components as it needs to update continuously at maximum rate.

gkasprow commented 5 years ago

Si5324 is connected via I2C switch. both DCXOs have dedicated I2C buses.

hartytp commented 5 years ago

Si5324 is connected via I2C switch.

Okay, that's what I wasn't sure about. In that case, let's keep it as it is: fully separate I2C busses for both DCXOs.

sinara-hw / meta

Clock recovery in Sinara/ARTIQ #15