Closed hartytp closed 5 years ago
@hartytp No more changes. Let @gkasprow focus on keeping track of bugs found during the design review. Please hold these suggestions until after Sayma v2.0.
@jbqubit
This is not a new addition to the project plan, it's finishing a job we've already agreed to do and have done most of the work for. There is no point at all having WR on the AMC if it's not also on the RTM, since that's where we actually need a good clock. The most time consuming part was designing WR for the AMC and reviewing that (already done), copying it to the RTM will be relatively quick.
The point of a structured review process is to avoid pitfalls including feature creep. The time for lobbying to add new features to the design has passed. WR was added to the AMC as a secondary clock source to permit prototyping of DDMTD on Sinara hardware. It is not among my requirements in Sayma v2.0 and I'm not funding its development. As this change relates to a "long term" aspiration for Sayma let's discuss after Sayma v2.0 is working. It may be that it can be added to production hardware.
@jbqubit Well, technically I don't think that both m-labs and @hartytp signed of on HT1.5 in table. Also, I wouldn't call it a feature creep, this was discussed and, as @hartytp said, reviewed before. In schematics this change should be just replacing Si5324 sheet with one from AMC + adding I2C to FPGA. (@gkasprow correct me if I'm wrong)
TBH I'm not sure if there's a point in having WR only on AMC (I was surprised, that it wasn't present on RTM), after all we'd like to synchronize DAC output on Saymas, not just FPGA clocks. WR support on RTM could be of interest to other groups, which already use WR with MicroTCA.
Reviewers for clock schematics are me and @hartytp, and I just started checking issues in clock schematics. We can easily wait few more days and check something else in the meantime.
WR was added to the AMC as a secondary clock source to permit prototyping of DDMTD on Sinara hardware
No, that's not why we added it here. WR will be prototyped on Kasli v1.2 not Sayma. I would not use a design as complex as Sayma to prototype this.
The point of adding WR on Sayma was to take the WR system we develop on Kasli and test its performance for clocking high-speed data converters. The clocking on Sayma, which will require high-stability reference clocks distributed to each RTM, is liable to turn into a bit of a mess without WR, which is why I'm keen to have a go at implementing it (and, indeed @dtcallcock an others have indicated that they would also value WR for this). We can't test this without WR on both the RTM and AMC. There is almost no point I can see in having WR on the AMC alone.
It may be that it can be added to production hardware.
The whole point of prototyping is to prototype the system you want. Not to leave important sections (I would argue that in the long run WR will be important) out and then stick them on without testing for production.
The point of a structured review process is to avoid pitfalls including feature creep. The time for lobbying to add new features to the design has passed.
This isn't adding a new feature, it's clarifying the scope of a feature already on the project plan. As far as I'm concerned, implementing WR on Sayma at all means implementing it on both AMC and RTM, otherwise it's just a waste of time.
WR is a block of hw that can be copied from one design to another relatively easily. The bulk of the time is spent designing that block, not copying it from the AMC to RTM. Putting it on the RTM is a small amount of extra work that allows us to make use of the work we've already done.
@marmeladapk I think that WR on AMC is useful or even required for other users of Sayma_AMC.
@hartytp No more changes.
@jbqubit What is the point of asking for reviews if you dismiss important comments like this one? WR on RTM is needed to clock the DACs. Not having it there defeats its purpose entirely.
I am a bit worried by the level of testing and integration that this (new) WR implementation will have seen (both in hardware and gateware/software) before we paste it on three boards roughly simultaneously. If we can't arrive at a trialed and tested design with integration at the PCB-level and the gateware/software level early enough, then we should at least invest significantly more time and resources into cleaning up, documenting, testing, reviewing, and integrating the current prototype.
@jordens that's fair.
The hardware has been quite extensively tested using the KC705 and DCXO evaluation boards. Phase noise and stability plots have already been posted. We've verified that it reliably acquires lock and maintains if over several days without issues (scope on infinite persist triggered from reference and monitoring recovered clock). I can provide additional documentation of the hardware configuration, for example by giving a list of FPGA connections used to implement WR if that helps.
The gateware/software for this was straight verilog + xilinx cores e.g. for I2C. A migen/misoc port is underway at the moment. FWIW, I can't think of any issues with Artiq integration that could be major showstoppers, beyond the usual requirements for transceiver/fabric clocking, but there is always a risk that we've missed something.
The gateware is mainly just a straightforward implementation of the CERN WR cores. The changes are:
One thing that does need doing is pulling all the data scattered across various places into a write up. We'll do that, but might not manage it before the Sayma deadline which is quite tight.
The pitiful results of some DDMTD attempts on the Ultrascale FPGA makes me think that we have to be careful about how we time signals with FPGAs. We cannot expect to route time-critical signals without precautions all around a large, noisy FPGA and get good performance.
So we should make sure that all time-critical paths stay in the same I/O bank and also signals that go to MMCMs, BUFGs, etc. are using the correct GC pins.
Some more radical options:
@sbourdeauducq thanks for the updates, that's helpful.
DDMTD on Ultrascale with one clock from the GTH and one clock from an I/O bank exhibits high peak-to-peak jitter and unexplained intermittent skews that are not corrected by averaging, when the FPGA is loaded by integrating SAWG. This renders it pretty much unusable in this configuration, unless workarounds are found (discussing with Xilinx).
@WeiDaZhang correct me if I'm wrong, but that's a bit concerning isn't it, since this is the configuration that we have to use for clock recovery on Sayma (since one clock originates in the transceiver).
We did a lot of testing on the Kintex 7 and didn't see effects like this, but have no experience with ultrascale FPGAs.
So we should make sure that all time-critical paths stay in the same I/O bank and also signals that go to MMCMs, BUFGs, etc. are using the correct GC pins.
I think the proposed implementation of DDMTD-based clock recovery on Sayma does this as far as possible (although if you see room for improvement then let me know!).
connect a AMC backplane transceiver pair to the RTM GTP (through the AMC and RTM connector), so we can bypass the Ultrascale FPGA for all time-critical functions if needed (i.e. run DRTIO directly from Metlino to Sayma RTM).
add a small FPGA to Metlino that handles time-critical functions, like managing the WR PLL.
The idea of adding timing FPGA on Metlino is potentially quite a nice one, might be worth trying to keep that option open as a future direction. I'd really like to build a version of Kasli with DDMTD + DCXO asap so we can develop the concept and give it a proper test in a simpler system before we make too many decisions about what to do on Metlino.
What about systems where we don't use Metlino, and control Sayma via SFP (this will be our primary use case for the foreseeable future)? In that case, I don't see an obvious remedy to avoid the ultra scale FPGA.
@WeiDaZhang correct me if I'm wrong, but that's a bit concerning isn't it, since this is the configuration that we have to use for clock recovery on Sayma (since one clock originates in the transceiver).
FWIW, Ultrascale DDMTD between the raw RX recovered clock and the RTIO clock seems to be stable (I used it to look at the Sayma "180 degree" bug, and also to validate the new siphaser code). DDMTD between the RTIO clock and IBUFDS_GTE3 ODIV2 also seems stable. Might be something about clock routing from the transceiver to the IOB.
Anyway, please test DRTIO between TTLs and then we will see what the system performance is like with Ultrascale.
One thing Xilinx suggested is to disable clocks to most of the design while measurements are being made; we can use that for SYSREF alignment but not the WR PLL.
I added WR to the RTM side. Anything else do to in this issue?
Nope
@gkasprow would it be possible to add the same WR circuitry to the RTM that we have on the AMC? My long term aim it to be able to generate a high-stability DAC clock from the WR clock (via the AMC FPGA).