sinara-hw / meta

Meta-Project for Sinara: Wiki, inter-board design, incubator for new projects

DIOT #84

Open jordens opened 6 months ago

jordens commented 6 months ago

Since the topic comes up frequently and I couldn't find a place to tack this onto, here is an unsorted collection of things we noticed while considering moving away from EEM towards DIOT crates.

Problems we're facing with the current legacy Sinara EEM style:

DIOT downsides:

The downsides drive and reinforce the need for:

marmeladapk commented 6 months ago

Debugging/deployment/direct JTAG-flashing a card requires an adapter/extender (Kasli, Kasli-SoC)

Deployment and flashing can be done by a controller when it asserts SERVMOD for a peripheral slot. Debugging is trickier; perhaps it could be done with BSCANE2 and by routing the peripheral JTAG signals inside the FPGA?

gkasprow commented 6 months ago

@jordens do you think it would be valuable to equip each board with an I2C power monitor? Or should we just read the load from the power supply?

jordens commented 6 months ago

For development, deployment, debugging, etc., maybe the incremental total supply load is sufficient to infer per-board data.

gkasprow commented 6 months ago

But this assumes a per-board power-on mechanism controlled by Kasli. Currently, the power-on moment depends on the slot number.
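
The incremental-load idea from this exchange can be sketched as follows. This is a minimal illustration, assuming boards can be enabled one at a time (the per-board power-on control mentioned above, which does not exist today) and that the total supply current is read after each board has settled. The function name, slot numbers, and current values are hypothetical.

```python
def per_board_load(baseline_a, totals_after_enable):
    """Infer each board's load from cumulative total-supply readings.

    baseline_a: total supply current (A) with no peripheral enabled.
    totals_after_enable: (slot, total_a) pairs in the order the boards
        were enabled, each read after that board has settled.
    """
    loads = {}
    previous = baseline_a
    for slot, total in totals_after_enable:
        # The step in total current is attributed to the board just enabled.
        loads[slot] = total - previous
        previous = total
    return loads


# Hypothetical readings, not measured data.
readings = [(1, 1.5), (2, 2.25), (3, 2.5)]
loads = per_board_load(0.5, readings)
print(loads)
```

The approach only attributes a step to one board if exactly one board changes state between readings, which is why it depends on controllable per-slot power-on.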

gkasprow commented 6 months ago

More expensive (backplane, connectors, receptacles vs IDC cables): would conceivably increase initial cost of a typical crate by 10% (? TBC)

This needs confirmation because one has to take into account existing overhead with wiring and debugging EEM

Only 8 peripherals per crate (compared to max 12 in theory per Kasli with EEM)

If we develop consolidated HW that doesn't waste slots, that shouldn't be an issue. We can also use higher density, 16-channel DIOs
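
As a rough check of the slot-count argument, the arithmetic can be written out. This sketch assumes every peripheral position holds a DIO board, with 8 channels per board in the EEM case and 16 in the DIOT case. Real crates mix board types, so the numbers are only illustrative.

```python
EEM_MAX_PERIPHERALS = 12   # theoretical maximum per Kasli with EEM
DIOT_PERIPHERAL_SLOTS = 8  # peripherals per DIOT crate


def total_channels(boards, channels_per_board):
    """Total digital IO channels if every position holds a DIO board."""
    return boards * channels_per_board


# Assumption: 8-channel DIO boards on EEM, 16-channel DIO boards in DIOT.
eem_channels = total_channels(EEM_MAX_PERIPHERALS, 8)
diot_channels = total_channels(DIOT_PERIPHERAL_SLOTS, 16)
print(eem_channels, diot_channels)
```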

Fixed board length (220 mm): no short DIO-BNC/SMA boards

The only difference is board cost but that's negligible in most cases

Fixed board width (6 HP): no 8x BNC DIO or Sampler-BNC, no 4+4 carrier+mezzanine

The only downside is the lack of support for 8 BNCs on the front panel. This can be fixed by adapters. In most cases, the mezzanine can be fit into a 6HP area; use lower connectors.

Debugging/deployment/direct JTAG-flashing a card requires an adapter/extender (Kasli, Kasli-SoC)

As Pawel mentioned, that can be solved using the DIOT SERVMOD mechanism. We also developed DIOT debug adapters/riser cards

Pounder/Driver mezzanines would need to become (a) RTMs, or (b) be consolidated with Stabilizer or (c) made thinner to fit into 6 HP (like DIOT/cPCI-S/FMC mezzanines): otherwise, waste a slot

I wouldn't go for RTMs. 6HP is quite a lot of space; I think we will go for Kirdy instead of the Driver; in the case of Pounder, we can replace connectors and will fit. Are there plans to support Pounder in DIOT?

Thermostat controlling Zotino DAC would need to be an RTM (cabling), or become a thinner mezzanine to fit 6 HP, or be merged

We have more panel area, we can expose the TEC connector

Almazny redesign to fit 6 HP

Just replace SMA connectors with edge-mounted ones

Clocker reevaluation to (a) use empty space in crate, (b) merge into Kasli, (c) make more powerful to sensibly occupy a slot, (d) become stand-alone

We already integrated it with CERN DIOT System Board; new Kasli will have same approach

HVAMP32/8 would need to be (a) an RTM, (b) external, or (c) consolidated with their driver, or (d) CPCIs/DIOT/FMC style mezzanine

We have more board area, so we can add DAC to the board. We also have a bigger panel so we can expose the input connector on it.

IDC-SMA/BNC/MCX would need to become RTMs or external VHDCI/HD68 boards (otherwise waste slots)

We already have an HD68 breakout board.

ad-hoc "mods" (e.g. Urukul/Mirny tunable VCO as PLLs) require more planning
provokes instinctive NIH+FUD reaction

Just expose the VCO tuning connector on the panel using the SMA pigtail.

The downsides drive and reinforce the need for:

External break-outs (due to connector density), see [existing](https://github.com/sinara-hw/HD68_BNC_Breakout/wiki) [external breakout](https://github.com/sinara-hw/Banker/wiki) architectures: counter fixed slot positions/reduce slot waste
High-density digital IO from a single peripheral slot to replace the various 8x/16x DIO boards (see Banker, VHDCI): counter peripheral count decrease

As I mentioned above, we have isolated/non-isolated MCX. We can also make HD68 version of Banker

Consolidation of mezzanines onto their carrier (https://github.com/sinara-hw/Pounder/issues/112, Almazny etc), or redesign mezzanine architecture (more FMC like), or into RTMs, or into CPCIs/DIOT-style mezzanines: to counter fixed slot positions/reduce slot waste

All the mezzanines share the same design flaw - the ground loop. Consolidation is the natural step, especially when we can skip the Ethernet/PoE and free some board area. What's the advantage of using Ethernet in DIOT?

Well-defined cooling architecture/requirements

This was already decided and is part of the DIOT fan tray

Well-defined identification/flashing/JTAG architecture

FPGAs are already covered by remote JTAG (SERVMOD). The same mechanism can be used for microcontrollers. This, of course, needs implementation in Kasli.

Well-defined powering/sequencing/health and status monitoring architecture

The main question is whether the controller should interfere with existing DIOT power sequencing based on geographical addressing.

jordens commented 6 months ago

@gkasprow I think you may have misread my list. I wanted to list the changes that are incurred by DIOT that still require varying levels of work to resolve.

Debugging/deployment/direct JTAG-flashing a card requires an adapter/extender (Kasli, Kasli-SoC)

As Pawel mentioned, that can be solved using the DIOT SERVMOD mechanism. We also developed DIOT debug adapters/riser cards

Let's see how that pans out. E.g. the most current DIOT spec I can find lists those as single ended while Pawel's KU peripheral does LVDS. I would also assume LVDS. But the entire machinery does add quite a bit of logic periphery on PBs and gateware in the SB. I'm aware of the adapters and risers.

I wouldn't go for RTMs. 6HP is quite a lot of space; I think we will go for Kirdy instead of the Driver; in the case of Pounder, we can replace connectors and will fit. Are there plans to support Pounder in DIOT?

On the contrary, IMO assuming Stabilizer and mezzanines/Thermostat etc connect via EEM or DIOT appears to be more of an anti-feature. I listed several reasons above.

We have more panel area, we can expose the TEC connector

IIRC given the tempco data, this also appears to be an anti-feature overall.

As I mentioned above, we have isolated/non-isolated MCX.

I am fully aware. I was stressing that these are the important modules and the problematic ones need to be deprecated.

We can also make HD68 version of Banker

How does that help?

All the mezzanines share the same design flaw - the ground loop.

I'm not sure I understand exactly which ground loop you mean and where that matters (compared to e.g. the ground loop that already exists through panels/crate and backplane/ribbon cables).

Consolidation is the natural step, especially when we can skip the Ethernet/PoE and free some board area. What's the advantage of using Ethernet in DIOT?

The initial idea was to use EEM/DIOT for synchronization and RT comms, and Ethernet for wide-bandwidth comms with non-RT stuff. But in the absence of a significant use case and proper specs, that connectivity will just linger around, diverge, and cost/confuse.

This was already decided and is part of the DIOT fan tray

Fully aware. That's what's needed.

Well-defined identification/flashing/JTAG architecture

FPGAs are already covered by remote JTAG (SERVMOD). The same mechanism can be used for microcontrollers. This, of course, needs implementation in Kasli.

I'm not sure whether it's practical to do SWD over that channel the way we are used to doing it with a probe. IMO CPUs will either go stand-alone (thus making debugging easy), or we'll use a riser, or we'll just plug it into the crate with the debugger attached. Field deployment will go DFU over USB.

Well-defined powering/sequencing/health and status monitoring architecture

The main question is whether the controller should interfere with existing DIOT power sequencing based on geographical addressing.

I couldn't find a reference or description for this. How does that work? Pawel's DIO MCX and KU PFC connect directly to P12V0, and on the backplane P12V0 is common to all slots and not sequenced. In any case, if it's the peripheral's job to sequence its power based on a GA delay, then it should also measure its own power (if desired).