sinara-hw / Sayma_RTM

RTM board with 8-channel GS/s DAC, 125MS/s ADC and flexible clock circuit

ADF4356 testing writeup #28

Closed hartytp closed 5 years ago

hartytp commented 5 years ago

Test setup:

To check for phase determinism, I'm:

hartytp commented 5 years ago

another check:

hartytp commented 5 years ago

One comment about this: the PLL prescalers mean that the minimum N divider is 23. As a result, for feedback through the output divider, the minimum total divide ratio is 23*output divider. E.g. for 600MHz the output divider is 8, so the minimum divide ratio is 184. Given that the VCO runs at 4.8GHz, this limits the PFD frequency to a maximum of about 26MHz. That's okay, but will result in worse noise (usually 3dB per factor of 2 in PFD frequency).
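
The arithmetic in this comment can be checked with a short sketch. The function names are illustrative only, the 125MHz comparison point is the `max_pfd_freq` limit used in the driver code posted later in this thread, and the 3dB-per-factor-of-2 figure is the rule of thumb quoted above, not a measured value.

```python
from math import log2

N_MIN = 23  # minimum N divider quoted in the comment above


def max_pfd_freq(f_out, n_min=N_MIN):
    """Highest usable PFD frequency (Hz) when the feedback is taken after
    the output divider, so that f_out = N * f_pfd."""
    return f_out / n_min


def in_loop_noise_penalty_db(f_pfd, f_pfd_ref):
    """Rule-of-thumb in-loop noise penalty (dB) for running the PFD at
    f_pfd instead of f_pfd_ref, assuming 3dB per factor of 2."""
    return 3.0 * log2(f_pfd_ref / f_pfd)


f_vco = 4.8e9                     # Hz
out_div = 8
f_out = f_vco / out_div           # 600 MHz
total_divide = N_MIN * out_div    # 184
f_pfd_max = max_pfd_freq(f_out)   # same as f_vco / total_divide
penalty = in_loop_noise_penalty_db(f_pfd_max, 125e6)

print(f"max f_pfd = {f_pfd_max / 1e6:.1f} MHz")
print(f"penalty relative to a 125 MHz PFD = {penalty:.1f} dB")
```

Under these assumptions the limit comes out just above 26MHz, and the rule of thumb puts the in-loop noise a little under 7dB worse than at a 125MHz PFD frequency.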

In the first instance, I suggest that we just live with slightly worse in-loop phase noise, as it will still be fine (I'll post a noise model later on).

Longer-term we have two resolutions:

  1. Run at higher DAC clocks using interpolation (assuming this doesn't hit a bug in the DACs)
  2. Do not do the feedback through the output divider. Attempt to use the DAC as a phase detector to determine the PLL output phase. Then, reset the PLL until it starts with the desired phase. This should work, but will require some development. This will be easy to play with on the new Sayma board assuming all DAC/PLL control is done from kernels...
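
The second option amounts to a retry loop. A minimal sketch, assuming hypothetical `reset_pll` and `measure_phase` callables (stand-ins for the real PLL re-initialisation and the DAC-based phase measurement, neither of which exists yet) and a toy PLL model whose output divider starts in a random state:

```python
import random


def reset_until_phase(reset_pll, measure_phase, target, tolerance,
                      max_tries=100):
    """Reset the PLL until the measured phase is within `tolerance` of
    `target`. Returns the number of resets used.

    Phase wrap-around is ignored for brevity.
    """
    for attempt in range(1, max_tries + 1):
        reset_pll()
        if abs(measure_phase() - target) <= tolerance:
            return attempt
    raise RuntimeError("PLL did not start with the desired phase")


class FakePLL:
    """Toy model (an assumption, not the real chip): each reset leaves the
    output in one of `n_states` equally spaced phases, chosen at random."""

    def __init__(self, n_states, period, seed=0):
        self.n_states = n_states
        self.period = period
        self.rng = random.Random(seed)
        self.phase = 0.0

    def reset(self):
        state = self.rng.randrange(self.n_states)
        self.phase = state * self.period / self.n_states

    def measure(self):
        return self.phase


# 600 MHz output from a divide-by-8: eight start-up phases, one VCO cycle apart
pll = FakePLL(n_states=8, period=1 / 600e6)
tries = reset_until_phase(pll.reset, pll.measure, target=0.0, tolerance=1e-12)
print(f"desired phase reached after {tries} reset(s)")
```

With eight equally likely states the expected number of resets is eight, so the real cost depends mostly on how long one reset plus one phase measurement takes.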

hartytp commented 5 years ago

image

https://www.analog.com/en/technical-articles/phase-alignment-and-control-on-the-adf4356-5356-devices.html#

NB PLL phase temp co is approx 2.5ps/C, which is not negligible...
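
To put that number in context, the time offset can be converted to phase. This is only a unit conversion; the 600MHz figure is the example output frequency from earlier in the thread and the helper name is illustrative.

```python
def time_to_phase_deg(dt, f):
    """Convert a time offset dt (s) to phase (degrees) at frequency f (Hz)."""
    return 360.0 * f * dt


tempco = 2.5e-12   # s per degree C, from the note above
f_out = 600e6      # Hz, example output frequency used earlier in the thread
deg_per_degC = time_to_phase_deg(tempco, f_out)
print(f"{deg_per_degC:.2f} degrees of phase per degree C at 600 MHz")
```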

hartytp commented 5 years ago

uurgh...looking closer, there is something funny going on with the phase. It seems to be non-reproducible at the 50ps level.

Potentially relevant, from the above article:

Given that the ADF4356 PLL and VCO contain 1024 differing VCO bands, it is important that this uncertainty is eliminated by using the manual calibration override procedure.

hartytp commented 5 years ago

(although, given that the VCO is in a feedback loop and the article is about feedback after the output divider, it's not clear to me what the VCO band has to do with anything).

hartytp commented 5 years ago

looks like this is what we're hitting: https://ez.analog.com/rf/f/q-a/71780/adf5355-integer-n-phase-synchronization. As the top post says, there is clearly more to these chips than documented or shown on the BDs.

I'll come back to this on Monday. The only thing I can think of is disabling the VCO auto cal. No idea whatsoever why that would make a difference, but some of the comments indicate it could be relevant.

If that doesn't work then I think the conclusion is going to have to be that this PLL is completely unacceptable for our use cases. At least with the HMC830 operating in integer-N mode AFAICT the phase jumps were an integer number of VCO cycles, so it should be possible to use a phase detector to take it out.

I think the plan needs to be this: I'll focus on this on Monday. If by the end of Monday I'm still baffled, we ditch this PLL and focus on using the HMC830.

hartytp commented 5 years ago

Here is the code I used for these tests.

""" RTIO driver for the Analog Devices Inc. ADF4356 PLLs.

Output event replacement is not supported; issuing commands at the same
time is an error.
"""

from numpy import int32

from artiq.language.core import kernel, portable, delay
from artiq.language.units import ns, us, ms, MHz
from artiq.coredevice import spi2 as spi

# to do: output n and r div to see duty cycle and also reset of dividers...

SPI_ADF4356_CONFIG = (0*spi.SPI_OFFLINE | 1*spi.SPI_END |
                      0*spi.SPI_INPUT | 0*spi.SPI_CS_POLARITY |
                      0*spi.SPI_CLK_POLARITY | 0*spi.SPI_CLK_PHASE |
                      0*spi.SPI_LSB_FIRST | 0*spi.SPI_HALF_DUPLEX)

@portable
def ceil(x):
    return int32(x) if (float(int32(x)) == x) else (int32(x) + 1)

class ADF4356:
    """ Driver for Analog Devices Inc. ADF4356 PLLs.

    :param spi_device: SPI bus device name
    :param le_device: SPI load enable channel name
    :param muxout_device: PLL muxout device channel name (used for lock detect)
    :param ref_div2: enable the reference divide by 2 block
    :param ref_doubler: enable the reference doubler
    :param ref_cnt: reference divider
    :param f_ref: reference clock frequency (Hz, default: 125e6)
    :param chip_select: value to drive on SPI chip select lines during
      transactions (default: 1)
    :param div_write: SPI clock divider for write operations (default: 8,
      50MHz max SPI clock with {t_high, t_low} >=10ns)
    """
    kernel_invariants = {"bus", "load", "muxout", "ref_cnt", "ref_div2",
                         "ref_doubler", "f_ref", "f_pfd", "adc_clk_div" ,
                         "adc_t_sample", "chip_select", "div_write", "core"}

    min_ref_freq = 10e6
    max_ref_freq = 250e6  # SE ref, no doubler
    min_vco_freq = 3.4e9
    max_vco_freq = 6.8e9
    max_pfd_freq = 125e6
    max_out_div = 64
    max_ref_cnt = 1023
    min_n_45 = 0x17
    max_n_45 = 0x7fff
    min_n_89 = 0x4b
    max_n_89 = 0xffff
    max_band_sel_timeout = 0x3ff
    mod1 = 0x1000000  # fixed primary modulus (2**24)
    idn = 0xad  # magic PLL identifier

    # muxout options
    muxout_three_state = 0x0
    muxout_dvdd = 0x1
    muxout_gnd = 0x2
    muxout_analog_ld = 0x5
    muxout_digital_ld = 0x6

    reg3 = 1 << 25  # phase sync
    reg5 = 0x80002  # magic!
    reg8 = 0x1559656  # magic!
    reg11 = 0x61200 | (0 << 20)  # VCO band hold off
    reg12 = 0x5f | (1048575 << 8)  # phase resync

    Icp = 0.9  # charge pump current (mA)

    def __init__(self, dmgr, spi_device, le_device, muxout_device,
                 ref_cnt, ref_div2, ref_doubler, chip_select=1, div_write=8,
                 f_ref=125e6, core="core"):
        self.bus = dmgr.get(spi_device)
        self.load = dmgr.get(le_device)
        self.muxout = dmgr.get(muxout_device)
        self.f_ref = f_ref
        self.chip_select = chip_select
        self.div_write = div_write
        self.core = dmgr.get(core)

        if not (self.min_ref_freq <= f_ref <= self.max_ref_freq):
            raise ValueError("Unsupported reference frequency")

        self.ref_cnt = ref_cnt
        self.ref_div2 = ref_div2
        self.ref_doubler = ref_doubler
        self.f_pfd = f_ref*(1+ref_doubler)/(ref_cnt*(1+ref_div2))
        self.adc_clk_div = min(255, int32(ceil(((self.f_pfd/100000.)-2.)/4.)))
        self.adc_t_sample = (self.adc_clk_div*4+2)/self.f_pfd

        self.t_le = max(20*ns, 2/self.f_pfd) + 20*ns  # t7 = max(20ns, 2/f_pfd)

        self.rf_div = 0
        self.n_int = 0
        self.frac1 = 0
        self.frac2 = 0x0
        self.mod2 = 0x2
        self.prescaler_en = 0

    @portable
    def reg0(self, autocal):
        """ Integer-N divider and autocal.

        Toggling autocal is required when changing frequency.
        """
        return ((self.n_int & 0xffff) |
                ((self.prescaler_en & 0x1) << 16) |
                ((autocal & 0x1) << 17))

    @portable
    def reg1(self):
        """ Main fractional value """
        return self.frac1 & 0xffffff

    @portable
    def reg2(self):
        """ Auxiliary frac/mod LSB """
        return (self.mod2 & 0x3fff) | ((self.frac2 & 0x3fff) << 14)

    @portable
    def reg4(self, muxout, ref_cnt):
        """ Returns the data for PLL register 4

        :param muxout: controls the PLL muxout pin, one of ADF4356::muxout_*
        :param ref_cnt: r counter value to use
        """
        Icp_mu = int32(round(self.Icp/0.3)) - 1
        return ((0x1 << 3) |  # phase detector polarity
                (0x1 << 4) |  # 3V3 muxout logic levels
                (0x0 << 5) |  # single-ended reference input
                ((Icp_mu & 0xf) << 6) |
                (0x0 << 10) |  # don't double buffer output divider
                (ref_cnt << 11) |
                (self.ref_div2 << 21) |
                (self.ref_doubler << 22) |
                ((muxout & 0x7) << 23))

    @portable
    def reg6(self):
        bleed_current = 0
        if self.frac1 != 0 and self.f_pfd <= 100*MHz:
            bleed_current = int32(24*self.f_pfd/(61.44*MHz)*(self.Icp/0.9))
            if bleed_current > 255:
                raise ValueError("Invalid bleed current")

        self.core.break_realtime()

        return (0x3 |  # RFA output power to +5dBm
                (0x1 << 2) |  # enable RFA output
                (0x3 << 3) |  # RFB output power to +5dBm
                (0x0 << 5) |  # disable RFB output
                (0x1 << 7) |  # mute outputs until locked
                ((bleed_current & 0xff) << 9) |  # charge pump bleed current
                ((self.rf_div & 0x7) << 17) |  # output divider
                (0x0 << 20) |  # feedback after output divider
                (0x0 << 21) |  # RF B is a copy of RF A
                (0x5 << 22) |  # magic!
                ((0x0 if bleed_current == 0 else 0x1) << 22) |  # bleed enable
                (0x0 << 23) |  # do not gate bleed currents to speed up lock
                (0x0 << 24))  # bleed current polarity set to negative

    @portable
    def reg7(self):
        """ Lock detect register """
        return ((1 if self.frac1 == 0 else 0) |  # lock detect mode
                (0x3 << 1) |  # 12ns LD precision, used with bleed currents
                (0x1 << 3) |  # LOL configuration in case REF_IN may drop out
                (0x3 << 3) |  # frac-N lock detect cycle count
                (0x1 << 21) |  # internally re-register LE from ref clock
                (0x1 << 22) |  # magic!
                (0x0 << 23))  # re-register LE with falling edge of ref clock

    @portable
    def reg9(self):
        """ Lock time """
        alc_wait_timeout = 30  # see data sheet
        synth_lock_timeout = 12  # see data sheet
        band_sel_timeout = int32(ceil(50*us*self.f_pfd/alc_wait_timeout))
        if band_sel_timeout > self.max_band_sel_timeout:
            raise ValueError("Invalid band select timeout")

        vco_band_div_clk = int32(ceil(self.f_pfd/1.6e6))
        if vco_band_div_clk > 0xff:
            raise ValueError("Invalid VCO band division clock")

        return ((synth_lock_timeout & 0x1f) |
                ((alc_wait_timeout & 0x1f) << 5) |
                ((band_sel_timeout & 0x3ff) << 10) |
                ((vco_band_div_clk & 0xff) << 20))

    @portable
    def reg10(self):
        """ Calibration ADC """
        return (0x1 |  # ADC enable
                (0x1 << 1) |  # ADC conversion after write to reg10
                (self.adc_clk_div << 2) |
                (0x300 << 10))  # magic!

    @portable
    def reg13(self):
        """ Auxiliary frac/mod MSB """
        return ((self.mod2 & 0xfffc000) >> 14) | ((self.frac2 & 0xfffc000))

    @kernel
    def write(self, addr, data):
        """ Writes a 28-bit data word to a PLL register """
        self.load.off()
        delay(50*ns)
        self.bus.write((data << 4) | (addr & 0xf))
        delay(50*ns)
        self.load.on()
        delay(self.t_le)

    @kernel
    def init(self, blind=False):
        """ Initialise the SPI bus and check for the PLL's presence.

        This method must be called before any other method at start-up or if
        the SPI bus has been accessed by another device.

        :param blind: If ``True``, do not attempt to identify the PLL.
        """
        self.bus.set_config_mu(SPI_ADF4356_CONFIG, 32, self.div_write,
                               self.chip_select)
        self.load.on()
        delay(self.t_le)

        if not blind:
            high = self.reg4(self.muxout_dvdd, self.ref_cnt)
            low = self.reg4(self.muxout_gnd, self.ref_cnt)
            delay(1*ms)
            for idx in range(8):
                magic = (self.idn >> idx) & 0x1
                self.write(4, high if magic != 0 else low)
                delay(10*us)
                if self.muxout.sample_get_nonrt() != magic:
                    raise ValueError("Unable to identify PLL")

    @kernel
    def set_frequency(self, frequency):
        """ Update the PLL frequency and wait for it to relock.

        For simplicity, we completely reinitialise the PLL for each update
        using the recommended init sequence (see data sheet).

        :returns: the actual PLL output frequency
        """

        # determine output divider and VCO frequency
        min_f_out = self.min_vco_freq/self.max_out_div
        if not (min_f_out <= frequency <= self.max_vco_freq):
            raise ValueError("Unsupported output frequency")

        self.rf_div = 0
        f_vco = frequency
        while f_vco < self.min_vco_freq:
            self.rf_div += 1
            f_vco *= 2

        n = f_vco/(self.f_pfd*(1<<self.rf_div))
        self.n_int = int32(n)
        # self.n_int = 0
        self.frac1 = int32((n - self.n_int)*self.mod1)
        self.frac2 = 0  # not implemented yet
        self.mod2 = 0x2  # not implemented yet

        if self.n_int > self.max_n_45:
            self.prescaler_en = 1
            if self.n_int > self.max_n_89:
                raise ValueError("Unsupported n divider value")
        else:
            self.prescaler_en = 0
            if not (self.min_n_45 <= self.n_int <= self.max_n_45):
                raise ValueError("Unsupported n divider value")

        if self.f_pfd > 75*MHz:
            raise ValueError("PFD frequencies above 75MHz not supported yet")

        delay(10*ms)  # core device maths is slow!

        self.write(13, self.reg13())
        self.write(12, self.reg12)
        self.write(11, self.reg11)
        self.write(10, self.reg10())
        self.write(9, self.reg9())
        self.write(8, self.reg8)
        self.write(7, self.reg7())
        self.write(6, self.reg6())
        self.write(5, self.reg5)
        self.write(4, self.reg4(self.muxout_digital_ld, self.ref_cnt))
        self.write(3, self.reg3)
        self.write(2, self.reg2())
        self.write(1, self.reg1())
        delay(16*self.adc_t_sample + 10*us)
        self.write(0, self.reg0(1))

        for _ in range(1000):
            if self.get_locked() != 0:
                delay(1*ms)
                self.bus.write(0)
                self.load.off()
                return
            delay(1*ms)

        self.bus.write(0)
        self.load.off()

        raise ValueError("PLL lock timeout")

    @portable
    def get_frequency(self):
        """ Returns the current PLL frequency (Hz) """
        return self.f_pfd*(
            self.n_int + (self.frac1 + self.frac2/self.mod2)/self.mod1)

    @kernel
    def get_locked(self):
        self.muxout.sample_input()
        return self.muxout.sample_get()
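
The divider arithmetic in `set_frequency` can be checked on the host without any hardware. The sketch below mirrors that calculation using plain Python integers instead of `int32`. The function name and the 25MHz PFD frequency (for example, a 125MHz reference with `ref_cnt=5`) are assumptions made for illustration.

```python
MIN_VCO_FREQ = 3.4e9   # Hz, as in the driver above
MOD1 = 1 << 24         # fixed primary modulus, as in the driver above
MIN_N_45 = 23          # minimum N with the 4/5 prescaler


def compute_dividers(frequency, f_pfd):
    """Mirror the divider arithmetic of set_frequency() on the host.

    Returns (rf_div, n_int, frac1); the output divider is 2**rf_div.
    """
    rf_div = 0
    f_vco = frequency
    while f_vco < MIN_VCO_FREQ:
        rf_div += 1
        f_vco *= 2
    # feedback is taken after the output divider, so N = f_out / f_pfd
    n = f_vco / (f_pfd * (1 << rf_div))
    n_int = int(n)
    frac1 = int((n - n_int) * MOD1)
    return rf_div, n_int, frac1


# integer-N case: 600 MHz output from an assumed 25 MHz PFD
rf_div, n_int, frac1 = compute_dividers(600e6, 25e6)
print(rf_div, n_int, frac1)

# fractional case: 612.5 MHz output from the same PFD frequency
rf_div_f, n_int_f, frac1_f = compute_dividers(612.5e6, 25e6)
print(rf_div_f, n_int_f, frac1_f)
```

In the first case N is 24, only just above the minimum of 23, which is why the PFD frequency cannot be raised much further while the feedback goes through the output divider.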

hartytp commented 5 years ago

Observation that I'm at a loss to explain:

hartytp commented 5 years ago

OK, I have a theory...I suspect that the VCO has a non-negligible leakage current that causes CP offsets. c.f. https://ez.analog.com/members/icollins

That seems pretty nasty and might also explain why the phase-temp co of this chip is relatively large (see above). I'll have a go at sticking an OpAmp into the feedback loop...

hartytp commented 5 years ago

Some comments about prioritization/planning:

  1. Using manual VCO calibration does seem to fix this issue. So it must be something like VCO leakage current etc. I would guess that any wide-band PLL we might use (HMC830, ADF4356, HMC7044) will have this issue to a certain degree. The only way to know which one is best would be to characterize each of them, which would be time consuming.
  2. Even with the large glitches removed, the VCO temp co still seems quite large (see plot above). Using an active loop filter may well help (removes VCO leakage by buffering the charge pump output). But that needs testing.
  3. Based on this data, phase drifts in the PLL look like they could be a real killer. We should try to characterize this and choose the most stable PLL possible. Of course if the DAC turns out to be worse, this is a non-issue
  4. It's a shame that we have to use relatively low PFD frequencies with the ADF4356 to take advantage of the synchronised output dividers. That degrades the in-loop phase noise which is quite a bit worse than the HMC830. Probably not an issue for ion trap applications, but worth knowing
  5. We don't need the output dividers if we can use the DAC as a phase detector to reset the PLLs until they start with the right phase. I'll aim to demonstrate that once @sbourdeauducq is able to help me port JESD init/reset to kernels. The PLL reset technique will work better with the HMC830 because the VCO operates at a lower frequency, meaning the phase is quantized in larger steps.
  6. If the PLL reset technique works reliably then I don't have a strong preference between the two PLLs. At the moment, I'd tend to opt for the HMC830, since it's simpler and lower noise. The power cycler should take care of the SPI issues.
  7. Whichever PLL we choose to use, I think it would be good to keep a "plan B" up our sleeves. I'd suggest the following:
    • we will need a fanout buffer after the PLL. Let's make this a mux with the unused input connected to a pair of MMCX/UFL connectors
    • let's add an IDC with some DIO and power on it, as well as some mounting holes near the PLL
    • if we need to, then we can hack in a small PCB with an alternate PLL on it and hook it up to the Sayma RTM via ribbon cable (power + digital) and coax (RF). This would be a reimagining of the previous clock mezzanine, but only as a hack-of-last-resort...
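
Point 5 above says the reset technique works better when the VCO runs at a lower frequency, because the possible start-up phases are spaced further apart. A small sketch of that spacing, assuming the output phase is defined only up to a whole number of VCO cycles. The 4.8GHz value is the ADF4356 VCO frequency quoted earlier in the thread; the 2.4GHz value is a hypothetical lower-frequency VCO chosen for comparison, not a statement about how the HMC830 would be configured.

```python
def phase_step_ps(f_vco):
    """Spacing (ps) between possible start-up phases when the output phase
    is only defined up to a whole number of VCO cycles."""
    return 1e12 / f_vco


step_high = phase_step_ps(4.8e9)   # VCO frequency quoted earlier in the thread
step_low = phase_step_ps(2.4e9)    # hypothetical lower-frequency VCO
print(f"{step_high:.0f} ps vs {step_low:.0f} ps between start-up phases")
```

Halving the VCO frequency doubles the spacing, which relaxes the resolution needed from the phase detector by the same factor.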

gkasprow commented 5 years ago

what about the LMX2594? We love ADI chips, but TI does a good job as well. Xilinx is using these PLLs to drive ADCs and DACs directly on RFSoC devkit. Here is their block schematic

image

jbqubit commented 5 years ago

Thanks for the hard work exploring the ADF4356 @hartytp. The ADF4356 is clearly not a panacea.

How did you measure the 50 ps jitter given the 100 ps jitter of your scope? Could there be start-up phases more finely spaced than 50 ps that might be obscured by jitter?

I think it would be good to keep a "plan B" up our sleeves.

Agreed. At this point shifting focus back to HMC830 and planning for a simple clock mezzanine fall-back is appealing. I'm interested in what others think.

hartytp commented 5 years ago

How did you measure the 50 ps jitter given the 100 ps jitter of your scope?

It's not jitter, but rather a phase offset. I measure it by averaging.
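
As a rough illustration of why averaging resolves an offset smaller than the scope jitter: if the jitter is uncorrelated between acquisitions (an assumption), the uncertainty on the mean falls as 1/sqrt(N). The helper names and the numbers of averages below are illustrative, not the settings actually used.

```python
from math import ceil, sqrt


def averaged_uncertainty(sigma, n_avg):
    """Standard error of the mean of n_avg uncorrelated measurements."""
    return sigma / sqrt(n_avg)


def averages_needed(sigma, target):
    """Number of averages needed to bring the standard error down to target."""
    return ceil((sigma / target) ** 2)


scope_jitter_ps = 100.0                                      # from the question above
resolution_ps = averaged_uncertainty(scope_jitter_ps, 100)   # 100 averages
n_for_5ps = averages_needed(scope_jitter_ps, 5.0)
print(f"{resolution_ps:.0f} ps after 100 averages; "
      f"{n_for_5ps} averages needed for 5 ps")
```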

Could there be start-up phases more finely spaced than 50 ps that might be obscured by jitter?

Yes, one can only rule out what one can see. I believe that we've understood and can eliminate the mechanism behind the issue I was seeing (and that it will occur on any similar PLL to an extent) but that doesn't rule out a smaller effect with a different origin.

what about the LMX2594? We love ADI chips, but TI does a good job as well. Xilinx is using these PLLs to drive ADCs and DACs directly on RFSoC devkit. Here is their block schematic

It's another similar part. Ultimately, there are lots of ways of skinning a cat and we have to pick one and make it work. All options have benefits and weaknesses. At the level of phase control we want this isn't a simple problem whichever approach we take.

gkasprow commented 5 years ago

I will get the RFSoC devkit in roughly 8 weeks. It's probably too late to measure it and decide...

hartytp commented 5 years ago

@gkasprow to do that you'd need to write a full artiq driver. Measure phase noise, check phase synchronisation, check temp co, check it locks reliably on each version with the artiq driver, etc. It's not a quick thing to test. Let's pick one approach and focus on making it really good, rather than picking a new part for each iteration.

hartytp commented 5 years ago

One other point about this: I'm running the PFD around 20MHz, with the standard loop filter on the eval board and 0.9mA Icp (relatively low, but what the data sheet recommended). i.e. none of that has been optimized. I would expect that with an optimized loop filter and higher f_pfd (can't use feedback after the divider) these glitches would be much much smaller...
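
The expected scaling can be sketched with the simplest charge-pump model. This assumes the leakage theory from earlier in the thread is right and that the loop settles where the charge pump delivers exactly the charge the leakage removes each PFD cycle, i.e. Icp × Δt = I_leak / f_pfd. The leakage value below is only what this model implies if the whole 50ps offset were due to leakage; it is not a measurement, and the 100MHz comparison point is hypothetical.

```python
def static_offset(i_leak, i_cp, f_pfd):
    """Static time offset (s) at the PFD for a leakage current i_leak (A),
    charge pump current i_cp (A) and PFD frequency f_pfd (Hz)."""
    return (i_leak / i_cp) / f_pfd


def leakage_for_offset(dt, i_cp, f_pfd):
    """Leakage current (A) that would produce a static offset dt (s)."""
    return dt * f_pfd * i_cp


i_cp = 0.9e-3   # A, from the comment above
i_leak = leakage_for_offset(50e-12, i_cp, 20e6)   # 50 ps at a 20 MHz PFD
offset_100mhz = static_offset(i_leak, i_cp, 100e6)

print(f"implied leakage: {i_leak * 1e6:.2f} uA")
print(f"same leakage at a 100 MHz PFD: {offset_100mhz * 1e12:.0f} ps")
```

In this model the offset scales as 1/(Icp × f_pfd), so raising either the charge pump current or the PFD frequency reduces it proportionally.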

gkasprow commented 5 years ago

I assume that devkit comes with software support. So I'd run it in the default configuration and look at the phase relationship between input and output clocks, then reboot and look once again.

hartytp commented 5 years ago

Since I've got this PLL set up already, I'll make a quick phase temperature measurement tomorrow with a hot air gun and check how the temp co depends on Icp and f_pfd. If we see a strong dependence of the phase stability on the loop gain (Icp/f_pfd) then it tells us that the phase stability is dominated by the loop's ability to drive the PFD error signal to zero. In that case, we should implement an active (3rd order) loop on Sayma (same as we do for WR). We can take this design from @WeiDaZhang's clock mezzanine design.

However, we don't have the bandwidth/resources to exhaustively characterise multiple PLL chips. So, after that, we have to make a decision about which PLL chip we want to use before we can move forward.

If the HMC830 is our choice then I think the order of priorities needs to be:

  1. using an eval board, measure the size of the HMC830 output band-select glitches when used in fundamental mode (no output divider). Verify that these go away when the PLL auto calibration is disabled.
  2. using an eval board make a quick phase temperature stability measurement (passive loop filter)
  3. Verify that we can use the DAC as a phase comparator to allow us to make the HMC830 output phase deterministic when the output divider is used. I'll need help from @sbourdeauducq to do this, since it requires porting rust code to kernels
  4. (can be done in parallel with 3) prototype the active loop filter using an eval board

If we don't go for the HMC830 then we need to decide on the work package for the ADF PLL.

Before tackling this, I'll write up the clocking plans so that M-Labs can sign off on them...

hartytp commented 5 years ago

@sbourdeauducq asked

have you tested the HMC830 for the same phase instability issues that the ADF chip has? also, what is the level of those instabilities? I'm doing some tests with DAC synch now and getting interesting results, so, if there are issues with the HMC stability, what level of precision should I be looking at?

On the ADF4356 I observed variations in the output phase across power cycles that were generally around 50ps, but pk-pk over 10 or so power cycles was (from memory) closer to 200ps.

My understanding of the origins of these glitches is as follows: