sinara-hw / sinara

Sayma AMC/RTM issue tracker

Sayma v2 Project Plan #601

Open jbqubit opened 5 years ago

jbqubit commented 5 years ago

Here's a coherent project plan for Sayma v2. It is informed by extensive discussion on the github Issue system over the past year and recent offline discussion with the Sayma v1 developers. This is posted here so everyone knows what the contracts call for in development of Sayma v2. This plan will be updated if contracts differ from the present plan. Please use existing or new Issues for further discussion of technical topics.

This plan defines concrete performance goals, specifies which parties are responsible for specific deliverables and defines a project management structure. The aim is to provide sufficient specification and structure at the outset that all parties feel comfortable with their responsibilities, and to minimize the role of ARL/UMD in project management.

The plan includes design choices that simplify Sayma and lower costs. The github Sayma Issue tracker contains a long list of bug fixes and modifications to v1 (https://github.com/sinara-hw/sinara). I call out explicit specifications below in cases where there is not a clear github consensus on how to proceed with Sayma v2. The plan is not meant to substitute for the Issue system except where which-path clarity is required, in which case this plan is authoritative. I will annotate relevant github Issues with a link to this Issue so there is clarity about which-path.

Definitions

Let TS-MTCA be a micro TCA crate configured as follows:

Let TS1 be a Sayma test setup consisting of the following

Let TS2 be a Sayma test setup consisting of the following

Let TS7 be a Sayma test setup consisting of the following

Sayma v2 Diffs

When diff also applies for Metlino it is noted.

RTM Synthesizer

AMC clock recovery

Discussed in meta/issues/15. Also applies to Metlino.

DAC-FPGA Synchronization

Deterministic phase alignment between DACs on separate Sayma PCBs between power cycles is an important design criterion and should be considered at all stages of design. This implies a) SYSREF is aligned with RTIO clock b) DAC clocks are synchronized to local SYSREF edges.
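One way to make the "deterministic between power cycles" criterion testable is to record the relative phase between two DAC outputs after each power cycle and check that the spread stays within a tolerance. The sketch below is only an illustration under assumptions: the function names, the tolerance, and the sample phases are hypothetical and are not taken from the plan or from any measurement.

```python
import cmath
import math


def circular_spread_deg(phases_deg):
    """Largest deviation (in degrees) of any phase from the circular mean.

    Phases wrap at 360 degrees, so 359 and 1 are treated as 2 degrees apart.
    """
    vectors = [cmath.exp(1j * math.radians(p)) for p in phases_deg]
    resultant = sum(vectors)
    if abs(resultant) < 1e-12:
        raise ValueError("circular mean is undefined for these phases")
    # Rotate every vector so that the circular mean sits at 0 degrees.
    rotation = cmath.exp(-1j * cmath.phase(resultant))
    return max(abs(math.degrees(cmath.phase(v * rotation))) for v in vectors)


def is_repeatable(phases_deg, tolerance_deg):
    """True if all power-cycle phases lie within tolerance_deg of their mean."""
    return circular_spread_deg(phases_deg) <= tolerance_deg


if __name__ == "__main__":
    # Made-up example values (degrees), one entry per power cycle.
    measured = [359.0, 0.0, 1.0]
    # Hypothetical tolerance; the plan's actual requirement is not shown here.
    print(is_repeatable(measured, tolerance_deg=2.0))
```

Using the circular mean rather than a plain average avoids a false failure when the measured phases straddle the 0/360 degree wrap.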

DAC Performance

DAC Phase Stability

Phase stability between DACs on separate Sayma PCBs is an important design criterion and should be considered at all stages of the design. This assumes that DAC-FPGA synchronization is working.

DAC Noise

Low cross talk between adjacent analog channels on same AFE is an important design criterion and should be considered at all stages of the design (particularly layout).

Phase noise at 400 MHz (3 dB margin on AD9154 specification):

Reduce complexity of Sayma v2

AFE Interface to RTM

BaseMod AFE v2

Smart Agile Waveform Generator

Use SAWG v1 with minimal modification to demonstrate Sayma v2 hardware. SAWG v2 will be pursued once Sayma v2 hardware is working.

SAWG v1

SAWG v2

Ethernet


It is anticipated that there will be three developers working together. Responsibilities and interfaces between developers are discussed in this section. UMD/ARL will serve as Technical Point of Contact (TPOC).

Hardware Developer

The Hardware Developer is responsible for hardware design and manufacturing. Testing responsibilities of the Hardware Developer are those detailed in HT3. All deliverables include bi-weekly written progress reports. The Hardware Developer has five deliverables, HT1 to HT5.

HT1 high level design

High-level schematics for all PCBs to include component part numbers, high-level component interconnection, power budget, high-level layout of all PCBs including location of ICs, board-to-board headers, RF shielding, heat sinks, mechanical support and front panel design. Post design files to github and tag. Participate in design review on github.

Layout-related considerations:

HT1 delivery is complete when the Software and Gateware Developer, Integration and Timing Developer and UMD/ARL sign off on the drawings.

HT2 detailed design

Detailed schematics for all PCBs, finished component interconnections, signal integrity analysis, thermal analysis, power analysis, final PCB routing. Post design files to github and tag. Participate in final design review, accept delegated responsibilities from Software and Gateware Developer Design. Output is layout ready for manufacture.

Hardware Developer shall lead HT2 design review. This design review is in collaboration with scientists and other developers. The review will not be comprehensive. It will focus on differences between v1 and v2 of the hardware. Shall include validation of FPGA pinout.

HT2 delivery is complete when Software and Gateware Developer and Integration and Timing Developer sign off on the drawings.

HT3 manufacturing, baseline testing

Fabricate PCBs in the following quantities. TPOC provides sign off prior to commencement of manufacturing.

Expectations for testing of all stuffed PCBs.

Distribute hardware to system integrators:

HT3 delivery is complete when the Software and Gateware Developer and Integration and Timing Developer confirm receipt of hardware.

HT4 documentation and MMC

HT4 delivery is complete when documentation and source is committed and accepted by TPOC.

HT5 delivery to UMD

Build demonstration system that is fully assembled and tested. The system shall consist of the following components.

HT5 delivery is complete when UMD confirms receipt of the demonstration system /and/ TPOC successfully runs test code. TPOC has 7 business days to conduct tests.

HT6 round up

Integration and Timing Developer

An Integration and Timing Developer is responsible for hardware integration and testing. This developer has one deliverable, OT1, which includes bi-weekly written progress reports.

OT1 integration and timing

OT1 delivery is complete when the Integration and Timing Developer completes and documents all tests O11 to O18.

Software and Gateware Developer

The Software and Gateware Developer is responsible for gateware and software needed to provide ARTIQ support and to implement SAWG. All software and gateware development items include:

The developer’s work consists of three deliverables, MT1 to MT3. An asterisk (*) indicates previously funded work. All deliverables include bi-weekly written progress reports.

MT1 Sayma v2 Planning

MT2 Sayma v2 ARTIQ Support

MT3 Sayma v2 Extension and Other


Project Organization

ARL/UMD will fund and manage contracts with @jbqubit serving as technical point of contact (TPOC). In the event of a dispute about interpretation of a contract or about contract execution, the TPOC will have project-level decision making. Note that the legal language in the officially executed contract documents has final authority.

New github repositories were set up for each component of the Sayma system.

Github Issue tracking will be the authoritative location for test results and bugs. Keep discussion via other channels (email, IRC) to a minimum but when it happens post a summary to the relevant github Issue. Link back to the old sinara-hw/sinara Issues as needed but don’t conduct new discussions in sinara-hw/sinara.

Weekly discussions will be held via Google Hangout during the period of performance. Attendance of at least one team member from each collaborator (Software and Gateware Developer, Hardware Developer, and Integration and Timing Developer) is expected. Discussion is limited to 30 minutes per week. UMD/ARL involvement if requested by MC. Responsibilities:

hartytp commented 5 years ago

problem: unsynchronised dividers, fix: Make FPGA measure the output phase and reset the HMC830 divider until synchronised.

In the first instance, the fix should be to not use the HMC830 output dividers, and just run the DAC at > 1.5 GSPS using interpolation.

Using the FPGA to synchronise the PLL output dividers is mainly a risk mitigation strategy in case there is some issue with interpolation (e.g. DAC synchronisation doesn't work at max clock rate due to a silicon bug).
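For anyone unfamiliar with the fallback being discussed, the "measure the phase and reset until synchronised" idea can be sketched as a simple retry loop. This is a software simulation only, under the assumption that each divider reset lands on one of `ratio` possible output phases at random; `reset_divider` and `synchronise_divider` are hypothetical names, not existing gateware or firmware functions.

```python
import random


def reset_divider(ratio, rng):
    """Hypothetical stand-in for a hardware divider reset followed by a phase
    measurement: the output lands on one of `ratio` phases, chosen at random."""
    return rng.randrange(ratio)


def synchronise_divider(ratio, target_phase=0, max_attempts=1000, rng=None):
    """Reset the divider until the measured phase equals target_phase.

    Returns the number of resets used. Raises RuntimeError if the divider is
    still unsynchronised after max_attempts resets.
    """
    if rng is None:
        rng = random.Random()
    for attempt in range(1, max_attempts + 1):
        phase = reset_divider(ratio, rng)  # in hardware: reset, then measure
        if phase == target_phase:
            return attempt
    raise RuntimeError("divider did not synchronise")


if __name__ == "__main__":
    attempts = synchronise_divider(ratio=2, rng=random.Random(0))
    print(f"synchronised after {attempts} reset(s)")
```

Under this random-phase assumption, each reset succeeds with probability 1/ratio, so a divide-by-2 output needs two resets on average. The attempt limit keeps a stuck divider from hanging the loop.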

hartytp commented 5 years ago

f_dac_clk = 2.0 GHz (4X interpolation) with following expected implications:

Typo, should be 2x.
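The arithmetic behind this correction, as a minimal sketch: the input data rate is the DAC clock divided by the interpolation factor, so a 2.0 GHz DAC clock corresponds to a 1 GSPS data rate only at 2x. The helper name `data_rate_sps` is made up for illustration.

```python
def data_rate_sps(f_dac_clk_hz, interpolation):
    """Input data rate (samples/s) for a given DAC clock and interpolation factor."""
    if interpolation < 1:
        raise ValueError("interpolation factor must be at least 1")
    return f_dac_clk_hz / interpolation


if __name__ == "__main__":
    print(data_rate_sps(2.0e9, 2) / 1e9)  # 1.0 (GSPS) with 2x interpolation
    print(data_rate_sps(2.0e9, 4) / 1e9)  # 0.5 (GSPS) with 4x interpolation
```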

hartytp commented 5 years ago

BaseMod v2: print 20 PCBs, stuff 12

We should aim to distribute test AFEs as well which just run the DAC outputs to SMAs via baluns. This gives a path to testing DAC issues without having to configure RF switches, attenuators etc, which might be helpful in initial debugging.

hartytp commented 5 years ago

M32 support Si549 as stand-in replacement for Si5324 (no White Rabbit time synchronization in this contract)

What's this one about? The Si549 isn't a replacement for Si5324 without WR.

sbourdeauducq commented 5 years ago

Using the FPGA to synchronise the PLL output dividers is mainly a risk mitigation strategy in case there is some issue with interpolation

I would prioritize getting the alternative chip to work over this.

hartytp commented 5 years ago

TBH I'm with you on that. I'm not the one paying the bill however, so that's of limited importance...

hartytp commented 5 years ago

@jbqubit I'm confident that with, say, 1-1.5 days of time I could write and test the firmware required to use the ADF4356 on Sayma. If it works off the bat then it has the potential to save a significant amount of time spent mucking around with the HMC830. If after that amount of time it's still not working then we stick with the HMC830.

I don't know how SB feels about this, but I'd suggest that from a project management pov it makes most sense for me to do this firmware development. The code is fairly simple (no more complex than the HMC830 code I wrote), and a lot of the time is spent looking at PLL outputs with T&M equipment (which I have easy access to).

jbqubit commented 5 years ago

@hartytp Project Plan addition: added to HT3 TestMod v2: print 6 PCBs, stuff 6 (DAC outputs to SMAs via baluns)

@hartytp Project Plan addition: O18 Integration and Timing Developer responsible for writing and testing ADF4356 driver.

M32 support Si549 as stand-in replacement for Si5324 (no White Rabbit time synchronization in this contract)

What's this one about? The Si549 isn't a replacement for Si5324 without WR.

Project Plan modification: M32 is now crossed off since WR support is not covered by the plan.

jbqubit commented 5 years ago

Project plan modifications:

M12 design study to support 1 GSPS data rate on Sayma v2 ---> M12 design study to support 1 GSPS data rate on Sayma v2 (SAWG v1)

M13 design study to investigate SAWG v2 ---> M13 design study to investigate SAWG v2 (eg limit f_f0 resolution, reduce RTIO channel number, Moninj)

M15 lead HT1 design review ---> M15 participate in HT1 design review

M24 test AFE BaseMod v2 (excluding ADC) ---> M24 test AFE BaseMod v2 (including RF switches and attenuators, excluding ADC)

Addition of new task: M28 extend RTIO support to FPGA on RTM

M31 ---> M31 support code developed in O18

M37 support reduction of data rate for FTDI chip ---> removed

Addition of new task to HT3:


HT2 design review modifications

jbqubit commented 5 years ago

Minutes for last two discussions:

jbqubit commented 5 years ago

Modifications to Project Plan in light of email from Creotech.

TS-MTCA modifications

HT3 modification

HT4 modification of formal hardware documentation maintained in github

Comments on question in Creotech email.

Should the ARTIQ Python code tests belonging to HT5 really belong to the tasks of the Hardware Developer?

Yes. It is crucial that the Hardware Developer demonstrate expertise in using ARTIQ and Sinara together. Sayma v1 testing was handicapped by not having such expertise in Warsaw. In particular, there were hardware bugs that didn't emerge until the full ARTIQ system was running on the Sayma v1 hardware.

It seems that these tests depend more on the status of GW/SW than the hardware. If problems with GW/SW arise, how should they be addressed?

I understand that GW/SW problems may cause problems for delivery of HT5. Indeed, if hardware problems arise it becomes difficult for the Integration and Timing and Software/Firmware Developers to deliver certain contract items. For Sayma v1 there was no contract item that made Warsaw dependent on GW/SW Developers. Now it is explicitly built in to reflect the co-dependencies of the project. In contract bids, I anticipate that each of HT1 to HT6 will be a separate line item. Each line item can be billed for independently. So, if HT5 is slowed due to GW/SW problems, the remaining HTx could still be billed.

jbqubit commented 5 years ago

@hartytp asked

who is going to lead the design review? By "lead", I mean: drawing up a list of things that need review (e.g. based on the CERN check list); organising who will be responsible for reviewing each aspect of the design; chasing people up to ensure that they perform their parts of the design review and sign off on the design; having the final sign-off on the project before it goes to manufacture. This needs to be done by someone who is following the technical aspects of the project closely.

Creotech will manage the design review. As a starting point @marmeladapk and @jbqubit are taking a list of v1 to v2 diffs supplied by @gkasprow and generating a list of specific areas that need review. We are also making an explicit list of checks based on CERN checklist. Will be sent out as RFQ tomorrow. I'll add @hartytp to the list of editors now.

hartytp commented 5 years ago

@gkasprow thanks for posting the new Sayma schematics! Great work.

@marmeladapk apologies if you're already doing this, but just to be clear: before we start doing individual design reviews, someone should check that all the open Sayma issues have been resolved. Do you want to do this or should I?

marmeladapk commented 5 years ago

@hartytp If you check the design changes table, which Joe shared with you, you should see that all open issues from the milestone have been added there. I'm now going through all open issues with no milestone and adding them if necessary. Then I guess I should also check closed issues, though we could share the workload if you'd be willing to.

how do you want to handle these issues? Do you want to leave them open until someone has done an independent design review and signed off that the changes were correctly implemented?

I think the person who has been assigned to check an issue should give the green light to close it. Another way is to ask @gkasprow to close all implemented issues that are in the Sayma v2.0 milestone and rely solely on the design review table that we are making right now.

If so, someone should also do the thankless job of checking that all the closed issues really have been implemented.

That's true, but most of them should be obsolete.

marmeladapk commented 5 years ago

I went through:

jbqubit commented 5 years ago

Thanks @marmeladapk !!

jbqubit commented 5 years ago

From https://github.com/sinara-hw/sinara/issues/602#issuecomment-446916631 I've changed the following.

ADF4351 ---> ADF4356

jbqubit commented 5 years ago

@hartytp pointed out that his synchronization patch O11 only relates to ST4. Software and Gateware Developer are best situated to handle ST1 and ST2 which depend on DRTIO. So some changes...

M23 implement and test ST2 and ST4. Software and Gateware Developer

--> M23 implement, test and document ST1 and ST2 which depend on DRTIO; add to continuous integration

O11 patch to implement DAC-FPGA synchronization

--> O11 implement DAC-FPGA synchronization with 2 GHz DAC clock (1 GSPS data rate).

O14 responsible for building, conducting and documenting tests for ST1 and ST3

--> O14 Based on O11, demonstrate ST4 for a pair of AD9154 DACs on a single Sayma v2 board in configuration TS1.

jbqubit commented 5 years ago

O15 measure DAC phase stability O16 measure DAC phase noise

O15 and O16 moved out of Project Plan and assigned to individuals in github. https://github.com/sinara-hw/Sayma_RTM/issues/1 https://github.com/sinara-hw/Sayma_RTM/issues/2

jbqubit commented 5 years ago

Edit to make explicit a change that will prevent https://github.com/sinara-hw/sinara/issues/571. Heads up @marmeladapk.

Add to HT3:

jbqubit commented 5 years ago

Modification to MT1 based on email with @sbourdeauducq.

hartytp commented 5 years ago

RTM DRTIO -- already funded, can be prototyped on Sayma v1 or Kasli

What does prototyping this involve? IIRC, it's already prototyped in the sense that DRTIO between Sayma AMC and Kasli already works, but we can't directly prototype DRTIO with the RTM FPGA on v1.0 HW since there are no AMC<->RTM transceiver links...

sbourdeauducq commented 5 years ago

Use a SATA cable. I'll work on that after I get back from Xmas holidays.

hartytp commented 5 years ago

Ace, thanks!

sbourdeauducq commented 5 years ago

Working fine except for https://github.com/m-labs/artiq/issues/1230

sbourdeauducq commented 5 years ago

Edit to make explicit a change that will prevent #571. Heads up @marmeladapk.

Add to HT3:

* For TS-MTCA with NATIVE-R5, publish MCH configuration file used for these tests.

That's a misunderstanding of the major issue behind #571 and what is required to prevent it. I am running with the default configuration files. The minor MCH configuration issue did not cause the initial loss of power with the RTM.

jbqubit commented 5 years ago

Everyone, please review a draft update to the project plan at the following URL.

https://github.com/jbqubit/saymapp

The number of changes has gotten numerous enough that git is useful for tracking. The present revision reflects feedback from all prospective Developers. The jbqubit repository is temporary; once everyone agrees upon changes I'll update this Issue.

hartytp commented 5 years ago

@jbqubit do we still need to keep this issue open, or can we close it?

hartytp commented 5 years ago

@gkasprow can you let me know when the amc and rtm schematics are ready for review?

gkasprow commented 5 years ago

Yesterday I published a new release for both of them.

hartytp commented 5 years ago

Okay, thanks.

I didn't start reviewing yet as there are still some outstanding issues. If you confirm that all schematic issues (apart from finalizing the HMC830 loop filter OpAmp) are resolved then I will begin my review.

hartytp commented 5 years ago

@jbqubit @marmeladapk can you confirm how you want us to structure the review/who is responsible for what?

gkasprow commented 5 years ago

Yes, I confirm.

hartytp commented 5 years ago

Thanks Greg.

@jbqubit please can you confirm how you want us to do the review. AFAICT the google spreadsheet you circulated before is now out of date. Is there anything in particular you need me to do? Otherwise I'll review the bits I feel qualified to look at and leave the rest.

hartytp commented 5 years ago

I've completed my review of Sayma AMC. I didn't find anything major. I'll aim to finish the RTM tomorrow.

jbqubit commented 5 years ago

@hartytp Thanks for doing the review of AMC. It will help others to know exactly what you reviewed. Please update the spreadsheet to reflect what you checked. Make modifications to the columns if there's something that warranted review but wasn't foreseen in the review structure.

@sbourdeauducq Is your AMC stub port complete? Can you articulate what aspects were covered by that port using the google spreadsheet? If not please suggest an alternate format for communicating what aspects of the design you checked.

hartytp commented 5 years ago

@hartytp Thanks for doing the review of AMC. It will help others to know exactly what you reviewed.

Yes, but it would have helped me if you'd responded to any of the times when I asked you to confirm what you actually wanted me to do before I completed the review...