Open stettberger opened 1 year ago
Hi,
One very likely cause of your slow boot is that the Linux kernel ticks too fast for such a slow system. By default the Linux tick rate is 250 Hz, which is already fast for a 25 MHz single core. Add to that 3 other cores generating traffic (and ticking as well), plus the slow wishbone memory, and it is very probable that the quad core spends all its time handling ticks and can't make any forward progress :)
There is a config option in the kernel that you can enable: CONFIG_HZ_100. Let me know how things go :D
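To make the tick-rate argument concrete, here is a rough back-of-the-envelope sketch of how many CPU cycles each core has between two timer ticks. The handler cost used in the example (50,000 cycles) is a made-up illustration value, not a measurement from this system.

```python
def cycles_per_tick(cpu_hz: int, tick_hz: int) -> int:
    """CPU clock cycles available between two consecutive timer ticks."""
    return cpu_hz // tick_hz


def tick_overhead(cpu_hz: int, tick_hz: int, handler_cycles: int) -> float:
    """Fraction of a core's time spent handling ticks.

    handler_cycles is a hypothetical per-tick cost (interrupt entry,
    scheduler bookkeeping, stalls on slow memory); it is an assumption.
    """
    return handler_cycles / cycles_per_tick(cpu_hz, tick_hz)


if __name__ == "__main__":
    for cpu_hz in (25_000_000, 40_000_000):
        for tick_hz in (250, 100):
            budget = cycles_per_tick(cpu_hz, tick_hz)
            share = tick_overhead(cpu_hz, tick_hz, handler_cycles=50_000)
            print(f"{cpu_hz // 1_000_000} MHz, HZ={tick_hz}: "
                  f"{budget} cycles per tick, {share:.0%} spent ticking")
```

At 25 MHz and 250 Hz a core has only 100,000 cycles per tick; with CONFIG_HZ_100 that grows to 250,000, and at 40 MHz to 400,000. The last figure appears to agree with the lpj=400000 value that the kernel derives from the timer frequency in the boot log.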
@Dolu1990 CONFIG_HZ_100 really helped a lot. I'm now able to build and boot a quad-core system (see below). Enabling 100 Hz ticks alone was not enough, though; I also had to increase the clock frequency to 40 MHz (with 45 MHz within reach):
Info: Max frequency for clock '$glbnet$crg_clkout0': 45.77 MHz (PASS at 40.00 MHz)
For this repo: should make.py decrease the clock speed for > 1 core?
I will now investigate the framebuffer.
Boot log (40 MHz, 4 cores, CONFIG_HZ_100, no framebuffer)
--============= Liftoff! ===============--
OpenSBI v1.3
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|
Platform Name : LiteX / VexRiscv-SMP
Platform Features : medeleg
Platform HART Count : 8
Platform IPI Device : aclint-mswi
Platform Timer Device : aclint-mtimer @ 100000000Hz
Platform Console Device : litex_uart
Platform HSM Device : ---
Platform PMU Device : ---
Platform Reboot Device : ---
Platform Shutdown Device : ---
Platform Suspend Device : ---
Platform CPPC Device : ---
Firmware Base : 0x40f00000
Firmware Size : 376 KB
Firmware RW Offset : 0x40000
Firmware RW Size : 120 KB
Firmware Heap Offset : 0x52000
Firmware Heap Size : 48 KB (total), 3 KB (reserved), 8 KB (used), 36 KB (free)
Firmware Scratch Size : 4096 B (total), 452 B (used), 3644 B (free)
Runtime SBI Version : 1.0
Domain0 Name : root
Domain0 Boot HART : 0
Domain0 HARTs : 0*,1*,2*,3*,4*,5*,6*,7*
Domain0 Region00 : 0xf0018000-0xf001bfff M: (I,R,W) S/U: ()
Domain0 Region01 : 0xf0010000-0xf0017fff M: (I,R,W) S/U: ()
Domain0 Region02 : 0x40f40000-0x40f5ffff M: (R,W) S/U: ()
Domain0 Region03 : 0x40f00000-0x40f3ffff M: (R,X) S/U: ()
Domain0 Region04 : 0x00000000-0xffffffff M: (R,W,X) S/U: (R,W,X)
Domain0 Next Address : 0x40000000
Domain0 Next Arg1 : 0x40ef0000
Domain0 Next Mode : S-mode
Domain0 SysReset : yes
Domain0 SysSuspend : yes
Boot HART ID : 0
Boot HART Domain : root
Boot HART Priv Version : v1.10
Boot HART Base ISA : rv32ima
Boot HART ISA Extensions : zicntr
Boot HART PMP Count : 0
Boot HART PMP Granularity : 0
Boot HART PMP Address Bits: 0
Boot HART MHPM Count : 0
Boot HART MIDELEG : 0x00000222
Boot HART MEDELEG : 0x0000b109
[ 0.000000] Linux version 6.1.0-rc2 (stettberger@obelix) (riscv32-buildroot-linux-gnu-gcc.br_real (Buildroot 2023.08-756-g3f23277c41) 11.4.0, GNU ld (GNU Binutils) 2.40) #2 SMP Fri Oct 20 09:19:19 CEST 2023
[ 0.000000] earlycon: liteuart0 at I/O port 0x0 (options '')
[ 0.000000] Malformed early option 'console'
[ 0.000000] earlycon: liteuart0 at MMIO 0xf0001000 (options '')
[ 0.000000] printk: bootconsole [liteuart0] enabled
[ 0.000000] OF: reserved mem: OVERLAP DETECTED!
[ 0.000000] mmode_resv1@40f00000 (0x40f00000--0x40f40000) overlaps with opensbi@40f00000 (0x40f00000--0x40f80000)
[ 0.000000] OF: reserved mem: OVERLAP DETECTED!
[ 0.000000] opensbi@40f00000 (0x40f00000--0x40f80000) overlaps with mmode_resv0@40f40000 (0x40f40000--0x40f60000)
[ 0.000000] Zone ranges:
[ 0.000000] Normal [mem 0x0000000040000000-0x0000000041ffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000040000000-0x0000000040efffff]
[ 0.000000] node 0: [mem 0x0000000040f00000-0x0000000040f5ffff]
[ 0.000000] node 0: [mem 0x0000000040f60000-0x0000000041ffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x0000000041ffffff]
[ 0.000000] SBI specification v1.0 detected
[ 0.000000] SBI implementation ID=0x1 Version=0x10003
[ 0.000000] SBI TIME extension detected
[ 0.000000] SBI IPI extension detected
[ 0.000000] SBI RFENCE extension detected
[ 0.000000] SBI HSM extension detected
[ 0.000000] riscv: base ISA extensions aim
[ 0.000000] riscv: ELF capabilities aim
[ 0.000000] percpu: Embedded 8 pages/cpu s11476 r0 d21292 u32768
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 8128
[ 0.000000] Kernel command line: console=liteuart earlycon=liteuart,0xf0001000 rootwait root=/dev/ram0
[ 0.000000] Dentry cache hash table entries: 4096 (order: 2, 16384 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] Memory: 15780K/32768K available (5855K kernel code, 577K rwdata, 908K rodata, 215K init, 254K bss, 16988K reserved, 0K cma-reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[ 0.000000] rcu: Hierarchical RCU implementation.
[ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=32 to nr_cpu_ids=4.
[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[ 0.000000] riscv-intc: 32 local interrupts mapped
[ 0.000000] plic: interrupt-controller@f0c00000: mapped 32 interrupts with 4 handlers for 8 contexts.
[ 0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[ 0.000000] riscv-timer: riscv_timer_init_dt: Registering clocksource cpuid [0] hartid [0]
[ 0.000000] clocksource: riscv_clocksource: mask: 0xffffffffffffffff max_cycles: 0x939a85c40, max_idle_ns: 440795202120 ns
[ 0.000111] sched_clock: 64 bits at 40MHz, resolution 25ns, wraps every 4398046511100ns
[ 0.021973] Console: colour dummy device 80x25
[ 0.028531] Calibrating delay loop (skipped), value calculated using timer frequency.. 80.00 BogoMIPS (lpj=400000)
[ 0.041532] pid_max: default: 32768 minimum: 301
[ 0.070511] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.080703] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 0.297571] ASID allocator using 9 bits (512 entries)
[ 0.322001] rcu: Hierarchical SRCU implementation.
[ 0.328738] rcu: Max phase no-delay instances is 1000.
[ 0.407713] smp: Bringing up secondary CPUs ...
[ 0.761056] smp: Brought up 1 node, 4 CPUs
[ 0.881888] devtmpfs: initialized
[ 1.502438] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[ 1.517298] futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
[ 1.903302] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[ 1.967673] DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
[ 4.400822] pps_core: LinuxPPS API ver. 1 registered
[ 4.405901] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[ 4.423883] PTP clock support registered
[ 4.463479] FPGA manager framework
[ 4.632870] clocksource: Switched to clocksource riscv_clocksource
[ 8.894151] NET: Registered PF_INET protocol family
[ 8.966209] IP idents hash table entries: 2048 (order: 2, 16384 bytes, linear)
[ 9.195962] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 4096 bytes, linear)
[ 9.216777] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes, linear)
[ 9.236527] TCP established hash table entries: 1024 (order: 0, 4096 bytes, linear)
[ 9.256715] TCP bind hash table entries: 1024 (order: 2, 16384 bytes, linear)
[ 9.283659] TCP: Hash tables configured (established 1024 bind 1024)
[ 9.336840] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
[ 9.357005] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
[ 9.645460] Unpacking initramfs...
[ 10.223651] workingset: timestamp_bits=30 max_order=12 bucket_order=0
[ 18.954019] io scheduler mq-deadline registered
[ 18.972990] io scheduler kyber registered
[ 20.033521] LiteX SoC Controller driver initialized
[ 41.796930] Initramfs unpacking failed: invalid magic at start of compressed archive
[ 49.912621] Freeing initrd memory: 8192K
[ 61.923563] f0001000.serial: ttyLXU0 at MMIO 0x0 (irq = 0, base_baud = 0) is a liteuart
[ 61.936669] printk: console [liteuart0] enabled
[ 61.936669] printk: console [liteuart0] enabled
[ 61.956959] printk: bootconsole [liteuart0] disabled
[ 61.956959] printk: bootconsole [liteuart0] disabled
[ 62.683697] i2c_dev: i2c /dev entries driver
[ 63.137182] litex-mmc f0004800.mmc: LiteX MMC controller initialized.
[ 63.804717] NET: Registered PF_INET6 protocol family
[ 64.425385] mmc0: new SDXC card at address b368
[ 64.795474] Segment Routing with IPv6
[ 64.856904] In-situ OAM (IOAM) with IPv6
[ 64.886988] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[ 65.254812] NET: Registered PF_PACKET protocol family
[ 65.605240] mmcblk0: mmc0:b368 NCard 58.2 GiB
[ 66.035890] Freeing unused kernel image (initmem) memory: 212K
[ 66.055103] Kernel memory protection not selected by kernel config.
[ 66.067736] Run /init as init process
[ 66.192412] mmcblk0: p1
Saving 256 bits of non-creditable seed for next boot
Starting syslogd: OK
Starting klogd: OK
Running sysctl: OK
Starting network: OK
Welcome to Buildroot
buildroot login: root
__ _
/ / (_)__ __ ____ __
/ /__/ / _ \/ // /\ \ /
/____/_/_//_/\_,_//_\_\
/ _ \/ _ \
__ _ __ _ _\___/_//_/ ___ _
/ / (_) /____ | |/_/__| | / /____ __ / _ \(_)__ _____ __
/ /__/ / __/ -_)> </___/ |/ / -_) \ // , _/ (_-</ __/ |/ /
/____/_/\__/\__/_/|_|____|___/\__/_\_\/_/|_/_/___/\__/|___/
/ __/ |/ / _ \
_\ \/ /|_/ / ___/
/___/_/ /_/_/
32-bit RISC-V Linux running on LiteX / VexRiscv-SMP.
login[90]: root login on 'console'
root@buildroot:~# cat /proc/cpuinfo
processor : 0
hart : 0
isa : rv32ima
mmu : sv32
mvendorid : 0x0
marchid : 0x0
mimpid : 0x0
processor : 1
hart : 1
isa : rv32ima
mmu : sv32
mvendorid : 0x0
marchid : 0x0
mimpid : 0x0
processor : 2
hart : 2
isa : rv32ima
mmu : sv32
mvendorid : 0x0
marchid : 0x0
mimpid : 0x0
processor : 3
hart : 3
isa : rv32ima
mmu : sv32
mvendorid : 0x0
marchid : 0x0
mimpid : 0x0
Nice :)
Would it make sense to always set HZ_100?
Yes, i think you are right, that would be a more portable default.
Also, if possible, avoid --with-wishbone-memory, as it really hurts the memory bandwidth between the SoC and the SDRAM.
The framebuffer is probably also being starved of data. I had that, for instance, when I was running Digilent video at 50 MHz: I just got a black screen. What is your actual resolution?
Currently, I have not changed the default framebuffer settings at all, so I cannot tell which resolution the framebuffer uses. In the end, I would be more than happy to have an 80x25 text terminal (as in former times with CGA and Hercules cards).
For the 100Hz: There is pull request #363
Ok. So when using the framebuffer, the system seems to be memory-starved on the ulx3s. However, if I enable the "video_terminal", I can synthesize a quad-core with HDMI out. And while it is working, I get a timing warning:
Warning: Max frequency for clock '$glbnet$ecp5pll1_clkout1': 97.14 MHz (FAIL at 125.00 MHz)
Info: Max frequency for clock '$glbnet$ecp5pll1_clkout0': 70.81 MHz (PASS at 25.00 MHz)
Info: Max frequency for clock '$glbnet$ecp5pll0_clkout0': 48.69 MHz (PASS at 40.00 MHz)
Info: Clock '$glbnet$ecp5pll0_clkout1' has no interior paths
Info: Max delay <async> -> posedge $glbnet$ecp5pll0_clkout0: 12.49 ns
Info: Max delay <async> -> posedge $glbnet$ecp5pll1_clkout0: 8.66 ns
Info: Max delay <async> -> posedge $glbnet$ecp5pll1_clkout1: 8.96 ns
Info: Max delay posedge $glbnet$ecp5pll0_clkout0 -> <async> : 5.25 ns
Info: Max delay posedge $glbnet$ecp5pll0_clkout0 -> posedge $glbnet$ecp5pll1_clkout0: 3.87 ns
Info: Max delay posedge $glbnet$ecp5pll0_clkout0 -> posedge $glbnet$ecp5pll1_clkout1: 4.16 ns
Info: Max delay posedge $glbnet$ecp5pll1_clkout0 -> posedge $glbnet$ecp5pll0_clkout0: 1.46 ns
Info: Max delay posedge $glbnet$ecp5pll1_clkout0 -> posedge $glbnet$ecp5pll1_clkout1: 1.24 ns
Info: Max delay posedge $glbnet$ecp5pll1_clkout1 -> posedge $glbnet$ecp5pll1_clkout0: 1.48 ns
It does not seem to be a problem; however, I'm not sure what it means that the maximum frequency of pll1_clkout1 is below the requested 125 MHz.
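As a reading aid for the report above: a FAIL line means that the maximum frequency nextpnr estimates for that clock domain after place and route is below the frequency that was requested, so the domain is not guaranteed to work at the requested speed even if it happens to work on a given board. A small sketch, assuming only the line format shown in this thread, that extracts the margin per clock:

```python
import re

# Matches the "Max frequency" lines exactly as they appear in this thread.
FMAX_RE = re.compile(
    r"^(?:Info|Warning): Max frequency for clock '(?P<clock>[^']+)': "
    r"(?P<fmax>[\d.]+) MHz \((?P<status>PASS|FAIL) at (?P<target>[\d.]+) MHz\)"
)


def parse_fmax(report: str) -> list:
    """Return one dict per clock with achieved fmax, target and margin."""
    results = []
    for line in report.splitlines():
        m = FMAX_RE.match(line.strip())
        if not m:
            continue  # skip delay lines and anything else
        fmax, target = float(m["fmax"]), float(m["target"])
        results.append({
            "clock": m["clock"],
            "fmax_mhz": fmax,
            "target_mhz": target,
            "ok": m["status"] == "PASS",
            "margin_mhz": round(fmax - target, 2),  # negative means FAIL
        })
    return results


REPORT = """\
Warning: Max frequency for clock '$glbnet$ecp5pll1_clkout1': 97.14 MHz (FAIL at 125.00 MHz)
Info: Max frequency for clock '$glbnet$ecp5pll1_clkout0': 70.81 MHz (PASS at 25.00 MHz)
Info: Max frequency for clock '$glbnet$ecp5pll0_clkout0': 48.69 MHz (PASS at 40.00 MHz)
Info: Clock '$glbnet$ecp5pll0_clkout1' has no interior paths
"""

if __name__ == "__main__":
    for entry in parse_fmax(REPORT):
        state = "ok" if entry["ok"] else "VIOLATED"
        print(f"{entry['clock']}: {entry['margin_mhz']:+.2f} MHz ({state})")
```

For the report above this shows a shortfall of about 28 MHz on pll1_clkout1, while the 40 MHz system clock keeps a margin of roughly 8.7 MHz.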
Further, the video_terminal is not too useful with Linux using the uart for its login shell.
My next step: I also think that it should be doable to modify "VideoTerminal" so that it (a) does not sniff the uart sink for its data, but instead (b) exposes the video and the font memory via wishbone. That way, I should be able to emulate a good old text-mode graphics card.
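To illustrate what software would see with such a memory-mapped text mode, here is a minimal sketch of an 80x25 character buffer. The two-bytes-per-cell layout (character code plus attribute) mirrors classic CGA text mode and is an assumption for illustration; it is not the memory layout of the existing VideoTerminal.

```python
COLS, ROWS = 80, 25
BYTES_PER_CELL = 2  # character code + attribute byte (CGA-style, assumed)


def cell_offset(col: int, row: int) -> int:
    """Byte offset of a character cell inside the text buffer."""
    if not (0 <= col < COLS and 0 <= row < ROWS):
        raise ValueError(f"cell ({col}, {row}) is outside {COLS}x{ROWS}")
    return (row * COLS + col) * BYTES_PER_CELL


def put_text(vram: bytearray, col: int, row: int, text: str,
             attr: int = 0x07) -> None:
    """Write text into the buffer, one cell per character."""
    for i, ch in enumerate(text):
        off = cell_offset(col + i, row)
        vram[off] = ord(ch) & 0xFF   # character code
        vram[off + 1] = attr & 0xFF  # colour attribute (0x07: grey on black)


# A bytearray stands in for the wishbone-mapped video memory.
vram = bytearray(COLS * ROWS * BYTES_PER_CELL)
put_text(vram, 0, 0, "login:")
```

With this layout the whole screen fits in 4000 bytes, which is tiny compared to a pixel framebuffer and should put far less pressure on memory bandwidth.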
@Dolu1990 Would it make sense (in the multi-core mode) to (a) Disable framebuffer for the ULX3S Board (b) reduce the clock frequency?
Hi,
pll1_clkout
I think that is to drive the 640x480@75Hz
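One possible reading of the two pll1 clocks in the timing report, stated as an assumption rather than a confirmed fact about the board's gateware: the 125 MHz clock is exactly five times the 25 MHz clock, which is the ratio one would expect between a pixel clock and a double-data-rate TMDS serializer clock (TMDS sends 10 bits per pixel per lane, and DDR outputs 2 bits per clock cycle).

```python
TMDS_BITS_PER_PIXEL = 10  # TMDS encodes each 8-bit value as a 10-bit symbol


def serializer_clock_hz(pixel_clock_hz: int, ddr: bool = True) -> int:
    """Clock needed to shift out one TMDS symbol per pixel clock.

    With DDR outputs two bits leave the pin per clock cycle, so the
    serializer only needs 5x the pixel clock instead of 10x.
    """
    bits_per_cycle = 2 if ddr else 1
    return pixel_clock_hz * TMDS_BITS_PER_PIXEL // bits_per_cycle


if __name__ == "__main__":
    print(serializer_clock_hz(25_000_000))  # pixel clock taken from the report
```

If that reading is right, the FAIL at 125 MHz concerns only the small serializer domain, not the CPU or the memory system.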
Further, the video_terminal is not too useful with Linux using the uart for its login shell.
See https://github.com/SpinalHDL/NaxSoftware/tree/main/debian_litex#boot-console
So console=tty1 in the linux bootcmd may fix it ?
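As a sketch of that suggestion: Linux accepts several console= options on the command line; kernel messages go to all of them, and the last one listed becomes /dev/console. The helper below is hypothetical and only shows how the command line from the boot log would look with tty1 appended. Whether tty1 actually shows anything depends on the kernel having a working display console driver, which is not verified here.

```python
def add_console(cmdline: str, console: str) -> str:
    """Append console=<console> to a kernel command line if it is missing.

    The new entry goes last, so it is the one that becomes /dev/console.
    """
    args = cmdline.split()
    wanted = f"console={console}"
    if wanted not in args:
        args.append(wanted)
    return " ".join(args)


# Command line as printed in the boot log of this thread.
BOOT_ARGS = ("console=liteuart earlycon=liteuart,0xf0001000 "
             "rootwait root=/dev/ram0")

if __name__ == "__main__":
    print(add_console(BOOT_ARGS, "tty1"))
```

Keeping console=liteuart in place means the serial console still receives kernel messages, which is handy while the video path is being debugged.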
If you want, you can try to get the USB host working (there is an OHCI controller in litex).
Would it make sense (in the multi-core mode) to (a) Disable framebuffer for the ULX3S Board (b) reduce the clock frequency?
I'm not sure. There are so many possibilities / use cases. I would say people need to customize for their use case.
TL;DR: Multi-Core System does not boot Linux. Framebuffer makes everything worse
Hello! My goal is to build a quad-core RISC-V system that I load onto a ULX3S board (ECP-85F). In the long term, I want to use this setup to give a university-level lecture on multi-core operating system construction.
At this point, I want to boot a Linux system on a synthesized quad-core to test the gateware before I start porting my own software to it and end up hunting gateware-level bugs. Also: I'm aware of other softcores (Rocket, NaxRiscv, PULP), but my current goal is to get it running with the VexRiscv.
Problem
I am not able to get a multi-core system with > 2 cores running as it hangs before the user-land fully starts. The last line on the litex_terminal is:
Suspicion: I suspect two problems here: There is some multicore problem and the litex framebuffer yields unfavorable logic for the ulx3s.
Overview
I got a dual-core system to boot with --board=ulx3s --cpu-count=2 --with-wishbone-memory --device LFE5U-85F. However, I had to reduce the system clock to 25 MHz and disable the "framebuffer". Besides that, I tried the following configurations:
[bad] 4 cores, no timing warning, boot hangs at "FPGA manager framework", 25 MHz
Also, with the framebuffer enabled, I had tremendously long nextpnr runtime (> 1.5 h).
Detail for 4 cores
As an example, I will provide some logs for my last attempt (4 cores, 25 MHz, no framebuffer, no wishbone memory).
For this test, I built the buildroot myself, but I also tried the downloaded/pre-built buildroot before, and it showed the same problem.
Version Info:
Building it:
Loading it:
Boot Log
Working Variant
For comparison, I also attached the boot log from the working 2-core machine (wishbone memory, 25 MHz, no framebuffer):