gatecat / prjoxide

Documenting Lattice's 28nm FPGA parts
ISC License

Litex SoC does not boot on CrossLink-NX-17 with certain clock divisors (works in Radiant) #10

Open danc86 opened 3 years ago

danc86 commented 3 years ago

I'm filing this against prjoxide but I guess there is a good chance the issue actually lies in nextpnr somewhere. I don't know how to narrow it down further, sorry.

I'm working on a Litex design using Vexriscv, targeting the unreleased HPS board with CrossLink-NX-17. I'm working from https://github.com/google/CFU-Playground but currently just with some hacks in a dev branch which you can find here: https://github.com/google/CFU-Playground/compare/main...danc86:spi-debug-on-hps

As of this commit: https://github.com/danc86/CFU-Playground/commit/5d873b8c2accca4c7472a08ddd0cbd73ea7ac864 the design works as expected using Radiant (Synplify), yosys+Radiant, and yosys+nextpnr+prjoxide. When it boots it shows the Litex BIOS console on the UART and you can interact with it.

However when I increase the clock divisor from 6 to 7 (that is, reduce system clock from 75MHz to 64MHz) yosys+nextpnr+prjoxide produces a design that does not seem to boot. I get no output on the UART at all. The same design works fine in Radiant and also with yosys+Radiant.

Out of curiosity I also tried some other clock divisors. Divisors lower than 6 (that is, faster system clocks) cannot meet timing in nextpnr. 7, 8, and 9 all appear not to boot, but interestingly a clock divisor of 10 also works fine.
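For reference, the divisor-to-frequency relationship quoted above can be sketched as below. This is only an illustration: the 450 MHz base clock is an assumption inferred from the report that divisor 6 gives 75 MHz, and the names `BASE_CLOCK_HZ` and `sys_clk_hz` are hypothetical, not taken from the design.

```python
# Assumption: the system clock is a fixed base clock divided by an integer.
# 450 MHz is inferred from "divisor 6 gives 75 MHz" in the report above.
BASE_CLOCK_HZ = 450_000_000


def sys_clk_hz(divisor: int) -> float:
    """Return the system clock frequency in Hz for an integer divisor."""
    if divisor < 1:
        raise ValueError("divisor must be a positive integer")
    return BASE_CLOCK_HZ / divisor


if __name__ == "__main__":
    # Print the frequencies for the divisors tried in this report.
    for divisor in range(6, 11):
        print(f"divisor {divisor}: {sys_clk_hz(divisor) / 1e6:.1f} MHz")
```

Under that assumption, divisor 7 comes out at roughly 64.3 MHz, which matches the 64MHz figure mentioned earlier.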

I'm not sure how I can narrow down the problem any further. I suppose the next thing to rule out is whether the issue is purely in the UART signal generation or if the CPU itself is not even booting. We don't have any user LED on this board but I suppose I could hook up a spare pin to a CSR and have the CPU twiddle that, to see if it ever starts executing any instructions. If you have any other ideas of how to debug, or things you would like me to test, I would be happy to.

gatecat commented 3 years ago

Thanks, this info is already useful. Please could you provide (email to gatecat@ds0.me if you don't want them public) the JSON files, FASM output from nextpnr for both the working and non working cases, and also the working bitstreams from Radiant?

Is the CPU still booting from or attempting to access SPI flash? If it is and you have access to a logic analyzer, this would be one way to determine where/how it is getting stuck.

Finally, it would be useful to know if other small changes to the design (eg constants that don't affect much else) also result in changes between working/not working, or whether it is consistently only the oscillator frequency that matters.

danc86 commented 3 years ago

Forgot to mention, I am working on soc/hps_soc.py in CFU-Playground.git but not using the actual CFU stuff, nor am I using the normal test program it loads into the bitstream. I'm not using any of the CFU-Playground Makefiles; I am building the bitstream just by invoking:

soc/hps_soc.py --toolchain oxide --build

so it's just building Litex BIOS into an embedded ROM and nothing extra.

On that branch, the SPI flash is still mapped onto Wishbone, but the CPU is not executing from it and won't access it at all (unless you tell it to using some BIOS commands over the UART).

The designs are all derived from public code so I will collect the various build outputs and attach them here.

danc86 commented 3 years ago

I've tarred up the build outputs to avoid upsetting Github: prjoxide-issue-10.tar.gz

I'll think of some unrelated perturbations to test, to see if it really is the clock divisor.

danc86 commented 3 years ago

I came up with three variations on the above design, which I tested independently with both clock divisor 6 and 7:

I am not really sure what to make of these results:

| system clock divisor | remove I2CMaster | UART_SPEED=460800 | --with-litespi |
|---|---|---|---|
| 6 | good | good | bad |
| 7 | good | bad | good |

danc86 commented 3 years ago

These unpredictable results depending on the clock divisor might explain why @tcal-x was having mysterious problems where one day his design wouldn't work and then later it would -- perhaps he was also modifying the clock divisor at the same time. We've certainly had to adjust it many times to account for how well Radiant (and Yosys) can squeeze the timing.

gatecat commented 3 years ago

I'm starting to come up with some ideas about what the problem might be; it looks like it is down to some random routing differences. The FASM files for all of the above tests would be useful, too.
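One rough way to look at such differences is to compare the set of features enabled in a working and a non-working FASM file. The sketch below is a minimal illustration, not the method used in this thread: it assumes FASM is line-oriented with `#` comments, treats each remaining line as one feature, and uses hypothetical helper names (`fasm_features`, `fasm_diff`). Two separate place-and-route runs will usually differ in many places, so the output is only a starting point for inspection.

```python
def fasm_features(text: str) -> set:
    """Return the set of non-empty FASM lines with '#' comments removed."""
    features = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line:
            features.add(line)
    return features


def fasm_diff(good_text: str, bad_text: str):
    """Return (only_in_good, only_in_bad) as sorted lists of feature lines."""
    good = fasm_features(good_text)
    bad = fasm_features(bad_text)
    return sorted(good - bad), sorted(bad - good)


if __name__ == "__main__":
    import sys

    # Usage: python fasm_diff.py good.fasm bad.fasm
    with open(sys.argv[1]) as f_good, open(sys.argv[2]) as f_bad:
        only_good, only_bad = fasm_diff(f_good.read(), f_bad.read())
    for feature in only_good:
        print(f"- {feature}")
    for feature in only_bad:
        print(f"+ {feature}")
```

Lines prefixed with `-` appear only in the first (working) file and lines prefixed with `+` only in the second.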

gatecat commented 3 years ago

Oh, one other small thing, please could you provide the PDC file and nextpnr command line to build those specimens, so I can fully reproduce your setup?

danc86 commented 3 years ago

FASM files matching the above variations:

| system clock divisor | remove I2CMaster | UART_SPEED=460800 | --with-litespi |
|---|---|---|---|
| 6 | clkdiv6-noi2c-good.fasm.txt | clkdiv6-uart460800-good.fasm.txt | clkdiv6-litespi-bad.fasm.txt |
| 7 | clkdiv7-noi2c-good.fasm.txt | clkdiv7-uart460800-bad.fasm.txt | clkdiv7-litespi-good.fasm.txt |

danc86 commented 3 years ago

Oh, oops. I should have realised you would need the PDC file and command line as well. Those are in the Litex gateware output directory too. I'll re-run it all tomorrow and collect the complete output directory.

The Litex scripts in CFU-Playground.git are pretty easy to use so you could also grab my hacked branch and run them from there to reproduce, if you wanted.

gatecat commented 3 years ago

Thanks, can you see if https://github.com/YosysHQ/nextpnr/pull/730 helps the issues at all?

danc86 commented 3 years ago

I built nextpnr-nexus from your PR YosysHQ/nextpnr#730 (commit 3d528ad) and now all the above "bad" designs produce a working bitstream. So I reckon you are onto something :-)

gatecat commented 3 years ago

Great, thanks for your help testing this!