SpinalHDL / SaxonSoc

SoC based on VexRiscv and ICE40 UP5K
MIT License

SAXON_CPU_COUNT >4 woes #64

Open soundnut opened 3 years ago

soundnut commented 3 years ago

Hi,

Great update to the readme - thanks.

Out of curiosity, I've been playing with the latest version and wanted to find the maximum I can do with an 85k Ulx3s board. Mainly I wanted to see how many harts I can fit. SAXON_CPU_COUNT=6 seems to produce a bitstream with 6 cores (at least I think it does: the build log contains references to the additional cores).

Linux is a different story. I added two more cpu definition blocks to ./buildroot-spinal-saxon/boards/common/dts/linux_cpu.dts.

Linux boots but reports CPUs 4 and 5 as failed to start:
[ 0.117618] smp: Bringing up secondary CPUs ...
[ 0.194216] CPU4: failed to start
[ 0.212434] CPU5: failed to start
[ 0.214449] smp: Brought up 1 node, 4 CPUs
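A quick way to compare a boot log against the expected core count is to scan it for the two kernel messages shown above. The following is a minimal Python sketch; the helper name `summarize_smp_bringup` and the sample string are illustrative only and are not part of SaxonSoc or buildroot.

```python
import re

# Kernel messages of the form seen in the boot log above.
FAILED_RE = re.compile(r"CPU(\d+): failed to start")
BROUGHT_UP_RE = re.compile(r"smp: Brought up (\d+) nodes?, (\d+) CPUs?")


def summarize_smp_bringup(dmesg_text):
    """Return (online_cpu_count, failed_cpu_ids) parsed from dmesg text.

    online_cpu_count is None if no "Brought up" line is present.
    """
    failed = [int(n) for n in FAILED_RE.findall(dmesg_text)]
    match = BROUGHT_UP_RE.search(dmesg_text)
    online = int(match.group(2)) if match else None
    return online, failed


# Sample copied from the log in this post.
SAMPLE = """\
[ 0.117618] smp: Bringing up secondary CPUs ...
[ 0.194216] CPU4: failed to start
[ 0.212434] CPU5: failed to start
[ 0.214449] smp: Brought up 1 node, 4 CPUs
"""

online, failed = summarize_smp_bringup(SAMPLE)
print(online, failed)  # prints: 4 [4, 5]
```

On a running board the same function could be fed the output of `dmesg` instead of the sample string.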

I'm unsure whether ./buildroot-spinal-saxon/boards/common/dts/linux_plic_link.dts needs extending too. Please advise.

Trying to dig a little deeper, I found that U-Boot only reports 4 CPUs:
=> cpu list
0: cpu@0 rv32ima
1: cpu@1 rv32ima
2: cpu@2 rv32ima
3: cpu@3 rv32ima

I found uboot.dts (./buildroot-spinal-saxon/boards/spinal-saxon/ulx3s/u-boot/uboot.dts) and tried adding 2 more CPU definitions, but there are still only 4 CPUs in Linux and U-Boot.

Poking around some more, I found this U-Boot config file (./build/uboot-smp-latest/configs/saxon_bsp_defconfig) with the default of 4 CPUs. Changing CONFIG_NR_CPUS from 4 to 6 doesn't seem to stick, though: it is overwritten on every run of saxon_buildroot. Running just saxon_buildroot_compile after the change prevents it from being overwritten, but still doesn't solve the problem.

Any idea what I'm missing?

Thanks

Dolu1990 commented 3 years ago

Maybe this : https://github.com/SpinalHDL/buildroot-spinal-saxon/blob/main/boards/spinal-saxon/arty-a7-smp/opensbi/platform.c#L23 ?

soundnut commented 3 years ago

You're the man! Great stuff - yes, that made the difference

[ 0.117988] smp: Bringing up secondary CPUs ...
[ 0.218975] smp: Brought up 1 node, 6 CPUs

root@buildroot:~# cat /proc/cpuinfo
processor : 0
hart : 4
isa : rv32ima
mmu : sv32

processor : 1
hart : 0
isa : rv32ima
mmu : sv32

processor : 2
hart : 1
isa : rv32ima
mmu : sv32

processor : 3
hart : 2
isa : rv32ima
mmu : sv32

processor : 4
hart : 3
isa : rv32ima
mmu : sv32

processor : 5
hart : 5
isa : rv32ima
mmu : sv32
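To check at a glance that every hart came up, the cpuinfo output above can be reduced to a processor-to-hart mapping. This is a minimal Python sketch under the assumption that each record carries a `processor` line followed by a `hart` line, as in the output above; the function name `parse_cpuinfo` is made up for illustration.

```python
def parse_cpuinfo(text):
    """Return a {processor: hart} mapping from /proc/cpuinfo-style text."""
    mapping = {}
    current = None  # processor number of the record being read
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between records
        key, value = key.strip(), value.strip()
        if key == "processor":
            current = int(value)
        elif key == "hart" and current is not None:
            mapping[current] = int(value)
    return mapping


# Sample copied from the output in this post.
SAMPLE = """\
processor : 0
hart : 4
isa : rv32ima
mmu : sv32

processor : 1
hart : 0
isa : rv32ima
mmu : sv32

processor : 2
hart : 1
isa : rv32ima
mmu : sv32

processor : 3
hart : 2
isa : rv32ima
mmu : sv32

processor : 4
hart : 3
isa : rv32ima
mmu : sv32

processor : 5
hart : 5
isa : rv32ima
mmu : sv32
"""

harts = parse_cpuinfo(SAMPLE)
print(harts)  # prints: {0: 4, 1: 0, 2: 1, 3: 2, 4: 3, 5: 5}
print(sorted(harts.values()) == list(range(6)))  # prints: True
```

Note that processor 0 maps to hart 4 here, so logical CPU numbers and hart IDs need not coincide; comparing the sorted hart IDs against the expected range is the more reliable check.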

Overall, 6 harts, or 4 harts plus FPU, seems to be about the maximum that's doable with this tiny board (with 32-bit).

With 6 harts I get 95% TRELLIS_SLICE utilization. There are still some LUTs left, though, at 74%.

How hard are the clocks configured? Would a smaller setup, say 2 cores plus FPU, work with a higher clock rate? The FPGA should be able to handle higher rates according to the specs.

Thanks again for your help. Cheers

Dolu1990 commented 3 years ago

Cool ^^

How hard are the clocks configured?

About 52 MHz. Is timing passing with 6 cores?

the fpga should be able to handle higher rates according to the specs.

Which spec?

soundnut commented 3 years ago

Re timing: with 6 harts I get this:
Warning: Max frequency for clock '$glbnet$clocking_pll_clkout2': 39.93 MHz (FAIL at 52.08 MHz)
Info: Max frequency for clock '$glbnet$clocking_rmii_clk$TRELLIS_IO_IN': 73.10 MHz (PASS at 50.00 MHz)
Info: Max frequency for clock '$glbnet$clocking_pll_clkout0': 170.53 MHz (PASS at 125.00 MHz)
Info: Max frequency for clock '$glbnet$clocking_pll_clkout3': 57.25 MHz (PASS at 25.00 MHz)
Info: Max frequency for clock '$glbnet$debug_jtag_tck$TRELLIS_IO_IN': 93.79 MHz (PASS at 50.00 MHz)
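When experimenting with different core counts or frequencies, it helps to pull the failing clocks out of the build log automatically. Below is a minimal Python sketch that parses report lines of the form shown above; the names `parse_timing` and `failing_clocks` are illustrative, and the regular expression assumes the log format stays exactly as in this report.

```python
import re

# Matches lines such as:
#   Max frequency for clock 'name': 39.93 MHz (FAIL at 52.08 MHz)
TIMING_RE = re.compile(
    r"Max frequency for clock '(?P<clock>[^']+)': "
    r"(?P<fmax>\d+(?:\.\d+)?) MHz "
    r"\((?P<status>PASS|FAIL) at (?P<target>\d+(?:\.\d+)?) MHz\)"
)


def parse_timing(log_text):
    """Map clock name -> (fmax_mhz, target_mhz, passed).

    If a clock is reported more than once, the last report wins.
    """
    report = {}
    for m in TIMING_RE.finditer(log_text):
        report[m.group("clock")] = (
            float(m.group("fmax")),
            float(m.group("target")),
            m.group("status") == "PASS",
        )
    return report


def failing_clocks(report):
    """Return {clock: shortfall_mhz} for clocks that miss their target."""
    return {
        clock: round(target - fmax, 2)
        for clock, (fmax, target, passed) in report.items()
        if not passed
    }


# Sample copied from the report in this post.
SAMPLE = """\
Warning: Max frequency for clock '$glbnet$clocking_pll_clkout2': 39.93 MHz (FAIL at 52.08 MHz)
Info: Max frequency for clock '$glbnet$clocking_rmii_clk$TRELLIS_IO_IN': 73.10 MHz (PASS at 50.00 MHz)
Info: Max frequency for clock '$glbnet$clocking_pll_clkout0': 170.53 MHz (PASS at 125.00 MHz)
Info: Max frequency for clock '$glbnet$clocking_pll_clkout3': 57.25 MHz (PASS at 25.00 MHz)
Info: Max frequency for clock '$glbnet$debug_jtag_tck$TRELLIS_IO_IN': 93.79 MHz (PASS at 50.00 MHz)
"""

report = parse_timing(SAMPLE)
print(len(report))             # prints: 5
print(list(failing_clocks(report)))
```

For the report above, the only failing entry is the clkout2 domain, which falls roughly 12 MHz short of its 52.08 MHz target.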

Re spec: https://www.latticesemi.com/view_document?document_id=50461 (chapter 3.19). I'm not sure, though, how far you can stretch the internal clock with a 25 MHz input.

As far as I understand, it all starts with a 25 MHz input oscillator. The cores are currently configured to run at 52 MHz and memory at around 100 MHz. Where are these ratios configured? Could we try to double up, i.e. run a single core at 100 MHz and memory at around 200 MHz?

soundnut commented 3 years ago

Can you tell me how the clock domains are being used?

Ulx3s:
clkout0 125 MHz - HDMI
clkout1 100 MHz - Memory
clkout2 50 MHz - main hart clock
clkout3 25 MHz - VGA

ArtyA7 and NexysA7 seem to have more clock domains (Arty 0-5 and Nexys 0-6).

From previous posts, I assume memory needs to run 2x faster than the main hart clock, and clkout0 and clkout3 are presumably fixed at these levels. So if I want to play with higher clock frequencies on the Ulx3s, clkout1 and clkout2 are the ones to increase, with clkout1 kept at twice clkout2. Correct?

How does this setting in Ulx3sSmp.scala play into all of this?
frequency = FixedFrequency(52 MHz),

Thanks

Dolu1990 commented 3 years ago

Where are these ratios configured

You will have to update https://github.com/SpinalHDL/SaxonSoc/blob/dev-0.3/hardware/scala/saxon/board/radiona/ulx3s/Ulx3sSmp.scala#L227

And also https://github.com/SpinalHDL/SaxonSoc/blob/dev-0.3/hardware/synthesis/radiona/ulx3s/smp/pll_linux.v via https://github.com/SpinalHDL/SaxonSoc/blob/dev-0.3/hardware/synthesis/radiona/ulx3s/smp/makefile#L57 I guess.

could we try to double up?

You can't overclock things forever; the synthesis tool already isn't happy right now: 39.93 MHz (FAIL at 52.08 MHz)

Going higher is asking for trouble XD

I assume Memory needs to run 2x faster, clkout0 and 3 are presumably fixed at these levels. Correct

Right
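Taking that confirmation at face value, a frequency experiment changes only two numbers: the core clock (clkout2) and the memory clock (clkout1) at twice that rate. The following is a minimal Python sketch of that bookkeeping. The names `plan_clocks` and `within_reported_fmax` are hypothetical, and the exact 2x factor is an assumption: the values quoted in this thread (52 MHz core, about 100 MHz memory) suggest the ratio is only approximately 2. The resulting numbers would still have to be entered by hand in Ulx3sSmp.scala and pll_linux.v.

```python
# clkout0 (HDMI) and clkout3 (VGA) stay fixed, per the discussion above.
FIXED_CLOCKS_MHZ = {"clkout0": 125.0, "clkout3": 25.0}


def plan_clocks(core_mhz, memory_ratio=2.0):
    """Return a {clkout: MHz} plan for a given core frequency.

    memory_ratio is the assumed memory-to-core clock ratio.
    """
    if core_mhz <= 0:
        raise ValueError("core_mhz must be positive")
    plan = dict(FIXED_CLOCKS_MHZ)
    plan["clkout1"] = core_mhz * memory_ratio  # memory clock
    plan["clkout2"] = core_mhz                 # main hart clock
    return plan


def within_reported_fmax(core_mhz, reported_fmax_mhz):
    """True if the requested core clock does not exceed the reported fmax."""
    return core_mhz <= reported_fmax_mhz


plan = plan_clocks(39.0)
print(plan["clkout1"], plan["clkout2"])    # prints: 78.0 39.0
print(within_reported_fmax(52.08, 39.93))  # prints: False
```

The second call uses the figures from the 6-core timing report: a 52.08 MHz target against a reported maximum of 39.93 MHz for the clkout2 domain.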

how does this setting in Ulx3sSmp.scala play into all of this? frequency = FixedFrequency(52 MHz),

Yes you only need to update that one.

soundnut commented 3 years ago

Update: 5 harts plus FPU fit on the 85k Ulx3s. Routing took over 29 hours to complete. (For comparison, the 6-hart config took slightly more than 1 hour to place.) So this seems to be the maximum number of cores possible: 5 cores plus FPU, or 6 cores without FPU.

Next I'm going to try playing with the frequency. I'll start small and build up until things start to break. FUN stuff!

Dolu1990 commented 3 years ago

FUN stuff!

Freedoom ^^ This kind of situation scares me as hell, because then I'm scared that the CPU design is buggy and I would have to spend weeeeeeks finding the bug XD

soundnut commented 3 years ago

This I can understand. My intention is not to poke holes in this, just to learn the concepts and how this is all pieced together, and maybe contribute small bits and pieces here and there. While I appreciate fixes and solutions, I'm also content if you point me in the right direction so that I can try to solve the puzzle myself. Like the input regarding frequency: knowing where to look is really helpful, and I'm happy with that.