sinara-hw / Kasli-SOC

Xilinx Zynq® version of the Kasli FPGA controller.

Zynq Kasli #1

Closed · dtcallcock closed this 4 years ago

dtcallcock commented 5 years ago

A new Kasli based on the forthcoming v1.2 design, but with a Zynq-7000 series FPGA (2x ARM CPU) instead of an Artix-7 (no hard CPU).

This will allow the proposed Zynq version of Artiq to utilise the EEM hardware.

The proposed FPGA is the XC7Z030-3FFG676I. The reasons for choosing this FPGA (as laid out by @hartytp) are:

* SDRAM would be routed to the ARM subsystem only. The gateware will still have DMA access, based on tests carried out by @cjbe, where he found that:

I have tested DMA from PL to PS-SDRAM via the HP ports, and indeed it works without any problems, and one gets the advertised bandwidth.
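
For context, the advertised figure is easy to reproduce from the interface parameters. A back-of-the-envelope sketch, assuming the usual Zynq-7000 numbers (64-bit AXI3 HP ports, ~150 MHz PL-side clock); these are assumptions for illustration, not details of @cjbe's setup:

```python
# Theoretical HP-port throughput on a Zynq-7000, one direction.
# Interface width and clock are assumed typical values, not measured.
AXI_HP_WIDTH_BITS = 64     # each of the four HP ports is 64-bit AXI3
PL_CLOCK_HZ = 150e6        # a common PL-side clock for the HP interface

bytes_per_s = AXI_HP_WIDTH_BITS // 8 * PL_CLOCK_HZ
print(f"{bytes_per_s / 1e9:.1f} GB/s per port")   # -> 1.2 GB/s
# Real transfers land somewhat below this once AXI and DDR
# controller overheads are included.
```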

I am working on getting this board funded and would appreciate any input.

Also, is 'Zynq Kasli' a confusing name? Do we need to dust off our Russian maps and come up with something more unique?

gkasprow commented 5 years ago

Maybe this time a name that says something? Something like ZynTroller or ZynQer

sbourdeauducq commented 5 years ago

"Kazli" is probably too confusing. Maybe "Fastli"?

gkasprow commented 5 years ago

or "Fastler" like fast controller

hartytp commented 5 years ago

Personally, I think that names like "fastino" and "fastli" make an already somewhat confusing naming situation somewhat worse. Maybe let's try to come up with something that's not just a mutilation of "Kasli"?

sbourdeauducq commented 5 years ago

Then we can call it Kutashi, after a lake near Kasli :)

gkasprow commented 5 years ago

It doesn't sound professional in Polish. Kutas = dick :)

sbourdeauducq commented 5 years ago

There is also Lake Kirety

hartytp commented 5 years ago

My current priority list re naming:

  1. Don't get drawn into a lengthy argument about naming
  2. Don't build more hardware with Russian names many people find hard to pronounce and remember

gkasprow commented 5 years ago

A few ideas from an acronym creator:

* SECON: Soc Eem CONtroller
* SECCo: Soc Eem COntroller
* SCOT: Soc COnTroller
* SERAC: Soc EuRocArd Carrier
* SEACOT: Soc EurocArd COnTroller
* ZECCA: Zynq EuroCard CArrier
* ZECI: Zynq Eem CarrIer

marmeladapk commented 5 years ago

I would vote for Kasli Zynq (KasliZ for short). It's clear what this board is, and the naming scheme is similar to AFC/AFCK/AFCZ.

gkasprow commented 5 years ago

Let's talk about the specification of the module.

sbourdeauducq commented 5 years ago

* we won't need RGMII and a PHY, and will use an MGT for Ethernet access, as it is now in Kasli.

@hartytp wanted to use PS Ethernet for performance reasons (it is possible to have the same performance out of a fabric Ethernet core since Ethernet frames are long and things can be pipelined, but the interfacing with the ARM core is most likely annoying). And the ZC706 will use the PS Ethernet already.

sbourdeauducq commented 5 years ago

* can we use PS GPIOs for SFP control?

I think so. Same for low-speed I2C that controls LEDs and such.

gkasprow commented 5 years ago

@sbourdeauducq I wouldn't use PS I2C. It has so many bugs that it is useless. We usually instantiate one in the PL. It does not matter whether you use MIO RGMII or EMIO GMII + PCS/PMA; the GEM is the same. But the second case requires the Xilinx PCS/PMA core, which instantiates an MGT. So an external PHY can actually make life far easier from the SW point of view.

sbourdeauducq commented 5 years ago

I wouldn't use PS I2C. It has so many bugs that it is useless.

Yeah sure, this is standard fare with Xilinx cores and I had no intention of using that. We can do bit-banged I2C on GPIO, unless that core is buggy too.
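
(Bit-banged I2C is simple enough to be worth a sketch. Below is a minimal master write path in Python, assuming hypothetical `scl`/`sda` open-drain pin wrappers around two PS GPIOs; they are placeholders for illustration, not an ARTIQ or kernel API.)

```python
import time

T = 5e-6  # half-bit delay, roughly a 100 kHz bus

# `scl` and `sda` are hypothetical open-drain pin wrappers:
# .lo() drives the line low, .hi() releases it (the external pull-up
# takes it high), .read() samples the released line.

def i2c_start(scl, sda):
    sda.hi(); scl.hi(); time.sleep(T)
    sda.lo(); time.sleep(T)      # SDA falls while SCL is high: START
    scl.lo(); time.sleep(T)

def i2c_write_byte(scl, sda, byte):
    for i in range(7, -1, -1):   # MSB first
        (sda.hi if (byte >> i) & 1 else sda.lo)()
        time.sleep(T)
        scl.hi(); time.sleep(T)  # slave samples data while SCL is high
        scl.lo(); time.sleep(T)
    sda.hi(); time.sleep(T)      # release SDA for the ACK slot
    scl.hi(); time.sleep(T)
    ack = not sda.read()         # slave pulls SDA low to acknowledge
    scl.lo(); time.sleep(T)
    return ack
```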

Are you saying that the PS can use a GT transceiver directly for Ethernet, without the Ethernet logic in the fabric?

gkasprow commented 5 years ago

This is true in UltraScale+ Zynq: the GTR can be used that way. In 7-series Zynq you have to instantiate PCS/PMA logic between the EMIO GMII and the MGT.

gkasprow commented 5 years ago

It should fit easily. [image]

gkasprow commented 5 years ago

If we want to make life easier for @sbourdeauducq (no logic between the GEM EMIO GMII and the MGT) and use an external PHY chip that talks to the SFP directly, it means that this SFP would have to be used for Ethernet only, without DRTIO. We could also use a 1000Base-T connector and save a few bucks on the SFP cage and the SFP-RJ45 transceiver.

sbourdeauducq commented 5 years ago

If the Xilinx stuff can give a GMII interface to the fabric without bugs and quirks, then using the GT solution is fine (except, of course, for the usual problems associated with the poor design of Xilinx transceivers, and in particular the GTP/GTX incompatibility - so it's not "free").

* We can add EEM protection against damage caused by too high a voltage on the EEM. But it will add some cost.

In addition to the signal degradation and component cost, this will also make the board layout and assembly more complex. Kasli systems are normally assembled in a controlled environment where ESD can be avoided, and then the metallic enclosure should protect against ESD. So I would not add those diodes.

gkasprow commented 5 years ago

That's why I developed simple adapters that ensure such protection during tests only. It is to avoid FPGA death in case LVDS is shorted to 3.3 V. Two Kaslis died during tests. And Tom asked if it makes sense to add it to all boards; in my opinion it doesn't.

jordens commented 5 years ago

IMO we should consider the first steps in the transition towards the cPCI-serial backplane here and now, i.e. move all the parts into the front 160 mm (towards the panel) and move the EEM connectors towards the back already now. Then going to the cPCI-serial backplane would amount to shortening the board and using different connectors.

hartytp commented 5 years ago

And Tom asked if it makes sense to add it to all boards; in my opinion it doesn't.

Don't bother with it.

hartytp commented 5 years ago

IMO we should consider the first steps in the transition towards the cPCI-serial backplane here and now.

@jordens how would you want to use one of those backplanes? We've talked about this a few times, and I don't see any particularly good way of hooking Kasli/EEMs up to a BP. (The issue being how heterogeneous the EEMs are in terms of size, number of EEM connectors required etc). AFAICT, while a BP might work okay in some circumstances, in general the issues with a BP are at least as bad as the issues with using ribbon cable. Although, maybe I'm missing something?

jordens commented 5 years ago

You have to bite the bullet and:

a) make all cards the same length; but I am pretty sure that PCB area is not that costly, so this is minor;

b) figure out fixed panel widths (when doing custom backplanes) or live with 4 (?) HP (which would only affect DIO_BNC and Sampler right now). That might not be so bad, and adding BNC multipin pigtails or moving to SMA could be done;

c) figure out how to deal with the double-EEM-connector EEMs. Right now the double-EEMs would need to go to the two x8 PCIe slots and the single EEMs would go to the six x4 PCIe slots when using existing backplanes. But we should also consider redesigning these EEMs to a single EEM connector.

I don't think any of those are particularly bad or insurmountable constraints compared with the constraints of ribbon and coax cables. But this would likely be an all-or-nothing change (unless we figure out a way to build some adapter from the backplane to IDCs). In any case, moving the components to the front and the IDC connectors to the back (even in two rows) sounds like a no-brainer to me.

hartytp commented 5 years ago

@jordens yes. FWIW I really like the current flexibility in the EEM system re board sizes/shapes and IO requirements. Losing some of that flexibility is likely to be one of the costs of moving to a fixed BP.

It's been a while since I thought about this, but IIRC the PCIe BP connectivity might not be ideal for us.

But, yes, none of the objections is insurmountable. All I'm saying is that -- as someone who was originally arguing that we should try to implement an EEM/Kasli BP -- having thought about this a bit, the tradeoffs/costs involved in the BP seem to me to outweigh the benefits. Having put together quite a few systems here, I actually don't think the ribbon cables are that bad. Others may disagree.

Anyway, let's not get sucked into a long discussion about this now. Before we have that discussion, someone would need to present a fully fleshed out proposal and I don't think there is the interest or time to do that at the moment.

In any case, moving the components to the front and the IDC connectors to the back (even in two rows) sounds like a no-brainer to me.

Sure, if this doesn't cause issues with the routing/mechanics/SI/PI/thermal management/etc and if it doesn't consume an unreasonable amount of @marmeladapk's time then why not.

hartytp commented 5 years ago

The only other thing I'd add is that if we really want a BP then we should at least consider moving some of these designs to AMCs. The racks, power supplies and cooling are good and relatively inexpensive for what they are.

I know that the experiences with Sayma have been a bit depressing so far, but I do not think that's a fundamental issue with uTCA, so much as a result of various aspects of the approach taken in that project.

jordens commented 5 years ago

Ultimately the ribbon cable collection and the coax cables are a hack. A very pragmatic and convenient one in the beginning. And I think we chose wisely at that time.

Yes. I do like uTCA for what it achieves. But the racks, power supplies, and cooling are also good for cPCI. To break even with the uTCA complexity, much more powerful EEMs would be needed.

gkasprow commented 5 years ago

cPCI Serial seems to fit our needs ideally. The only drawback is card count: the controller supports only 8 slots. The controller slot has:

* 8 x PCI Express: 6 x4 links, 2 x8 links (fat pipes)
* 2 dedicated I²C high-speed buses for the fat pipes
* optional Serial RapidIO
* 8 x SATA/SAS, supported by an SGPIO bus (SFF-8485 specification) for hot swapping
* 8 x USB 2.0
* 8 x USB 3.0
* 8 x Ethernet 10GBASE-T, which gives us another 8 LVDS links

So potentially we have 16 LVDS links between the controller and every module with a standard backplane. There is also a full-mesh 4x diff connection between every pair of modules. The front panels are spaced by 4 HP, so the mechanics would stay the same. The only problem is how we solve the issue of single vs. double EEMs and 4 vs. 8 HP panels. 8 slots should not be a problem, because in most cases there are boards that consume 2 EEMs. Panel size should not be a big issue once we switch to SMA/SMB/SMC instead of BNC. I see a few options.
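
For what it's worth, here is one plausible way to arrive at the 16-link figure: count every differential pair that a standard backplane routes from the controller to a peripheral slot and treat each as a generic LVDS link. This tally is my own reading of the resource list above, not a figure taken from the CPCI-S specification:

```python
# Differential pairs per peripheral slot on a standard CPCI-Serial
# backplane, counting TX and RX pairs separately. The mapping of
# interfaces to reusable LVDS pairs is an assumption.
pairs_per_slot = {
    "PCIe x4 link": 4 * 2,  # 4 lanes, one TX and one RX pair each
    "SATA/SAS":     2,      # 1 TX + 1 RX pair
    "USB 3.0":      2,      # SuperSpeed TX + RX pairs
    "10GBASE-T":    4,      # 4 bidirectional pairs
}
print(sum(pairs_per_slot.values()))  # -> 16
```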

gkasprow commented 5 years ago

What we can do is assign dual LVDS ports to slots 2, 4, 6, 8 by default. Slots 1, 3 and 5, 7 would be connected through a bus crosspoint switch to 4 FPGA EEM ports. Such a crosspoint switch can assign 2 EEMs to slot 1, or one EEM to slot 1 and one EEM to slot 3. In this way the user can insert an 8 HP dual-EEM board into slot 4 without losing slot 3, because its signals will be redirected to slot 1. So the slot assignment could look like this:

* slot 1: single EEM + optional signals from slot 3
* slot 2: dual EEM
* slot 3: optional EEM
* slot 4: dual EEM
* slot 5: single EEM + optional signals from slot 7
* slot 6: dual EEM
* slot 7: optional EEM

In this way we minimise the number of unused EEM ports (see the sketch below). We can e.g. pack 6 Urukuls, or 2 Samplers and 4 Urukuls, and we don't lose a single EEM port.

Octal LVDS bus switches are cheap.
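
To make the steering concrete, here is a tiny hypothetical model of the scheme described above (slot numbering from the comment; an illustration only, not actual gateware or an existing configuration tool):

```python
# Hypothetical model of the crosspoint-switched EEM assignment.
# Slots 2/4/6 are hard-wired dual-EEM; pairs (1, 3) and (5, 7) share
# two FPGA EEM ports each, steered either 1+1 or 2+0 by the switch.

PAIRS = {1: 3, 5: 7}      # primary slot -> its "optional" partner
DUAL_SLOTS = {2, 4, 6}    # always two EEM ports

def assign(primary, steer_both_to_primary):
    """Return {slot: eem_count} for one switched pair of slots."""
    optional = PAIRS[primary]
    if steer_both_to_primary:
        # e.g. an 8 HP dual-EEM card in slot 4 covers slot 3's panel,
        # so slot 3's EEM signals are redirected to slot 1
        return {primary: 2, optional: 0}
    return {primary: 1, optional: 1}

# 8 HP card in slot 4 -> lend slot 3's EEM to slot 1:
print(assign(1, steer_both_to_primary=True))   # {1: 2, 3: 0}
```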

gkasprow commented 5 years ago

If we transition to cPCI, we can make simple cPCI-to-EEM adapters so existing users could use the new boards with ribbon cables. The opposite situation would be hard to achieve, because cPCI defines a board length of 160 mm.

jordens commented 5 years ago

Let's break out the cPCI backplane discussion. @gkasprow could you move https://github.com/sinara-hw/sinara/issues/233 over to sinara-hw/meta ? I don't have permissions.

gkasprow commented 5 years ago

done

gkasprow commented 5 years ago

I talked with the CERN guys and they will have very good step pricing for Zynq US+ devices. This means that every company that mentions that the chips are for this project will get a discount. They aim for the 7EV in the 1156-ball package. The price will be much lower than for the 7Z045 in the 900-ball package. Moreover, they got a message from Xilinx that within 2 years they will change the 7-series chips' status to mature and increase prices to force users to switch to newer series.

They want to fund the design of the EEM controller for their DIOT system based on this Zynq US+ device. We are already designing an open-source CPCIS backplane using KiCad. The basic configuration will have a Zynq US+ chip, one SFP cage, WR oscillators and an FMC LPC slot. We can use it or modify it to fit our needs. There is a discussion about doing the design in KiCad; the issue is that KiCad needs 2-3x more time for the same design than Altium. This could be solved with a suitable file converter.

dtcallcock commented 5 years ago

Moreover, they got a message from Xilinx that within 2 years they will change the 7-series chips' status to mature and increase prices to force users to switch to newer series.

How much is this price increase likely to affect the cost of a Kasli? Are we talking $10s, or a doubling of the board price?

Also, will we get hit by availability issues at the same time?

They aim for the 7EV

If we went with this, could the 6-core (incl. 2 real-time cores) architecture actually be useful for improving Artiq performance in any way, or is it just a ton of unnecessary complexity?

gkasprow commented 5 years ago

The only info I got was that the price will be significantly higher. In the case of Kasli it would probably be a few tens of $.

hartytp commented 5 years ago

What are US+ devices like? Are they US without the bugs? There was a big gap from 6-series to 7-series and from 7-series to US, so I wonder if the shortish gap between US and US+ implies it's more of a bug-fix family?

gkasprow commented 5 years ago

It is simply a faster version due to the smaller silicon process. Some chips are pin-compatible.

hartytp commented 5 years ago

Okay, so still the same IOB etc design that @sbourdeauducq loves...

dhslichter commented 5 years ago

If the price difference is a few tens of $ per FPGA, I would argue that we are price-insensitive and need to make the decision based purely on the quality/reliability of the parts. If the Zynq-7000 devices are superior to the Zynq-US+ in terms of our ability to use them for Sinara (bugs, I/O, etc), then I see zero reason why we would want to change to Zynq-US+ when the price shifts.

cjbe commented 5 years ago

Indeed - spending $10 extra per FPGA to avoid having to fight more Sayma-style US bugs is a price I am very happy to pay. (Especially considering the software effort/cost/morale hit of fighting these bugs can be considerable.)

dtcallcock commented 5 years ago

Sure, the price is no biggie, but how soon after these price hikes does availability become a problem (especially if we want a specific speed grade), @gkasprow?

gkasprow commented 5 years ago

Availability should not be a problem. They still sell Virtex-4 FPGAs. Usually they give more than 10 years of product life, because many producers use FPGAs to mitigate long-term availability issues.

gkasprow commented 5 years ago

What if we could get a Zynq US+ 7CG/EG in the 1517-ball package at a very good price? Such a Kasli board would be slightly more expensive than the existing one.

dhslichter commented 5 years ago

Price is basically irrelevant if functionality/bugginess is compromised - the cost of buggy hardware and/or of extensive development work to fight bugs is enormous. Therefore, until it is shown that Zynq US+ doesn't have the same problems we have seen with Kintex US, I think the Zynq-7000 series is where we should stay.

dhslichter commented 5 years ago

Other considerations: Zynq UltraScale hardware won't support 2.5 V or 3.3 V I/O standards, which makes it much more of a hassle to interface with all the hardware we typically want to talk to (DDS, DAC, ADC, etc.). We would have to put a bunch of level translators either on the Kasli (a pain, and little space) or on all the EEM peripherals (a pain, plus it breaks backward compatibility). There may be other solutions, but in general this alone is a substantial reason for me to avoid UltraScale when possible. The Kintex UltraScale has some HR banks that allow these standards, but not as many as the Kintex-7.

gkasprow commented 5 years ago

In terms of price, the goal is to have the entire controller board below $1.2k. We could simply reuse the work that CERN is doing for their CPCIS DIOT system. They will use Zynq US+ and we can use the entire solution or not. In terms of IO, we use LVDS to mitigate IO-standard issues: LVDS still has the same voltage levels in 1.8 V, 2.5 V and 3.3 V systems. Sayma uses a 1.8 V bank supply for LVDS and there is no issue with connecting EEM boards via the VHDCI interface.

sbourdeauducq commented 5 years ago

We could simply reuse the work that CERN is doing for their CPCIS DIOT system. They will use Zynq US+ and we can use the entire solution or not.

If the board is already there anyway, we can give it a try and see if there are roadblocks to running ARTIQ on it.

gkasprow commented 5 years ago

It's not ready yet. I'm working on it right now and want to make sure it will also cover our needs.