AlanVek opened 1 year ago
Hi,
There was a recent update related to the FPU; it may be related.
Can you send me your OpenSBI / Linux / Buildroot images? Also, which versions of the Linux kernel / OpenSBI are you using?
What git hash do you have for pythondata-cpu-vexriscv_smp? Can you try with https://github.com/litex-hub/pythondata-cpu-vexriscv_smp/commit/e8ce95bbff2742226e838a37a88e4153bd04178a ?
Thanks :D
Hmm, also, if you could send me your Linux ELF file, that would be great :D
Keep in mind that I'm not using the standard memory map. This is my boot.json:
{
    "Image":       "0xB0000000",
    "rv32.dtb":    "0xB0ef0000",
    "rootfs.cpio": "0xB1000000",
    "opensbi.bin": "0xB0f00000"
}
With RAM being from 0xb0000000 to 0xd0000000.
The Linux kernel version is the latest one that comes from cloning http://github.com/buildroot/buildroot, and I'm using master for pythondata-cpu-vexriscv_smp.
For the Linux ELF file, give me a little bit of time, because I need to regenerate it. In the meantime, I'll try the commit you suggested.
Thanks for the quick response!
I still get the same errors with the suggested commit.
This is my .elf: linux.elf.zip
If it helps, this is my linux.config:
CONFIG_SECTION_MISMATCH_WARN_ONLY=y
# Architecture
CONFIG_ARCH_DEFCONFIG="arch/riscv/configs/defconfig"
CONFIG_NONPORTABLE=y
CONFIG_ARCH_RV32I=y
CONFIG_RISCV_ISA_M=y
CONFIG_RISCV_ISA_A=y
CONFIG_RISCV_ISA_C=y
CONFIG_SIFIVE_PLIC=y
CONFIG_FPU=y
CONFIG_SMP=y
CONFIG_STRICT_KERNEL_RWX=n
CONFIG_EFI=n
CONFIG_HVC_RISCV_SBI=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y
# FPGA / SoC
CONFIG_FPGA=y
CONFIG_FPGA_MGR_LITEX=y
CONFIG_LITEX_SOC_CONTROLLER=y
CONFIG_LITEX_SUBREG_SIZE=4
# Time
CONFIG_PRINTK_TIME=y
# Clocking
CONFIG_COMMON_CLK=y
CONFIG_COMMON_CLK_LITEX=y
# Interrupts
CONFIG_IRQCHIP=y
CONFIG_OF_IRQ=y
CONFIG_HANDLE_DOMAIN_IRQ=y
CONFIG_LITEX_VEXRISCV_INTC=y
# Ethernet
CONFIG_NET=n
CONFIG_PACKET=n
CONFIG_PACKET_DIAG=n
CONFIG_INET=n
CONFIG_NETDEVICES=n
CONFIG_NET_VENDOR_LITEX=n
CONFIG_LITEX_LITEETH=n
# Serial
CONFIG_SERIAL_EARLYCON_RISCV_SBI=y
CONFIG_SERIAL_LITEUART=y
CONFIG_SERIAL_LITEUART_CONSOLE=y
# GPIO
CONFIG_GPIO_SYSFS=y
CONFIG_GPIOLIB=y
CONFIG_GPIO_LITEX=y
# PWM
CONFIG_PWM=y
CONFIG_PWM_LITEX=y
# SPI
CONFIG_SPI=y
CONFIG_SPI_LITESPI=y
CONFIG_SPI_SPIDEV=y
# I2C
CONFIG_I2C=y
CONFIG_I2C_LITEX=y
CONFIG_I2C_CHARDEV=y
# Hardware monitoring
CONFIG_HWMON=y
CONFIG_SENSORS_LITEX_HWMON=y
# Framebuffer
CONFIG_FB=y
CONFIG_FB_SIMPLE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
CONFIG_LOGO=y
CONFIG_DRM=y
CONFIG_DRM_LITEVIDEO=y
# Flash
CONFIG_MTD=y
CONFIG_MTD_SPI_NOR=y
CONFIG_SPI_FLASH_LITEX=y
# MMC
CONFIG_MMC=y
CONFIG_MMC_SPI=y
CONFIG_MMC_LITEX=y
CONFIG_EXT2_FS=y
CONFIG_EXT3_FS=y
CONFIG_EXT4_FS=y
# .config in kernel
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# Filesystem
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_MSDOS_PARTITION=y
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
CONFIG_NCPFS_SMALLDOS=y
CONFIG_NLS=y
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=y
CONFIG_TMPFS=y
CONFIG_HZ_100=y
CONFIG_RISCV_ISA_F=y
CONFIG_RISCV_ISA_D=y
And this is my litex_vexriscv_defconfig:
# Target options
BR2_riscv=y
BR2_RISCV_32=y
# Instruction Set Extensions
BR2_riscv_custom=y
BR2_RISCV_ISA_CUSTOM_RVM=y
BR2_RISCV_ISA_CUSTOM_RVA=y
BR2_RISCV_ISA_CUSTOM_RVC=y
BR2_RISCV_ISA_CUSTOM_RVF=y
BR2_RISCV_ISA_CUSTOM_RVD=y
BR2_RISCV_ABI_ILP32=y
# Patches
BR2_GLOBAL_PATCH_DIR="$(BR2_EXTERNAL_LITEX_VEXRISCV_PATH)/patches"
# GCC
BR2_GCC_VERSION_11_X=y
# System
BR2_TARGET_GENERIC_GETTY=y
BR2_TARGET_GENERIC_GETTY_PORT="console"
# Filesystem
BR2_TARGET_ROOTFS_CPIO=y
# Kernel (litex-rebase branch)
BR2_LINUX_KERNEL=y
BR2_LINUX_KERNEL_CUSTOM_GIT=y
BR2_LINUX_KERNEL_CUSTOM_REPO_URL="https://github.com/Dolu1990/litex-linux.git"
BR2_LINUX_KERNEL_CUSTOM_REPO_VERSION="ae80e67c6b48bbedcd13db753237a25b3dec8301"
BR2_LINUX_KERNEL_USE_CUSTOM_CONFIG=y
BR2_LINUX_KERNEL_CUSTOM_CONFIG_FILE="$(BR2_EXTERNAL_LITEX_VEXRISCV_PATH)/board/litex_vexriscv/linux.config"
BR2_LINUX_KERNEL_IMAGE=y
# Rootfs customisation
BR2_ROOTFS_OVERLAY="$(BR2_EXTERNAL_LITEX_VEXRISCV_PATH)/board/litex_vexriscv/rootfs_overlay"
BR2_GLOBAL_PATCH_DIR="$(BR2_EXTERNAL_LITEX_VEXRISCV_PATH)/patches"
#BR2_PACKAGE_HOST_LINUX_HEADERS_CUSTOM_5_18=y
# Extra packages
#BR2_PACKAGE_DHRYSTONE_OPT=y
#BR2_PACKAGE_MICROPYTHON=y
#BR2_PACKAGE_SPIDEV_TEST=y
#BR2_PACKAGE_MTD=y
#BR2_PACKAGE_MTD_JFFS_UTILS=y
# Crypto
#BR2_PACKAGE_LIBATOMIC_OPS_ARCH_SUPPORTS=y
#BR2_PACKAGE_LIBATOMIC_OPS=y
#BR2_PACKAGE_OPENSSL=y
#BR2_PACKAGE_LIBRESSL=y
#BR2_PACKAGE_LIBRESSL_BIN=y
#BR2_PACKAGE_HAVEGED=y
#BR2_PACKAGE_VEXRISCV_AES=y # Uncomment to enable hardware AES
BR2_RISCV_ABI_ILP32D=y
BR2_RISCV_ISA_CUSTOM_RVC=y
I still get the same errors with the suggested commit.
Hooo, hmmm, I'm now trying to reproduce this in simulation (not yet successful at hitting the bug).
Is this the first bug:
[ 0.263653] futex hash table entries: 256 (order: , 16384 bytes, linear)
[ 0.295184] NET: Registered PF_NETLINK/PF_ROUT protocol family
[ 0.767033] ------------[ cut here ]------------
[ 0.767761] WARNING: CPU: X PID: 2 at kernel/rcu/tree.c:279 rcu_core+0x428/0x
[ 0.768925] CPU: 0 PID: p Comm: kthreaddNot tainted5.18.0-rc7 #5
[ 0.780034] epc : rcu_core+0x428/0x468
Does it appear each time you run the system, or does it sometimes change? This is to know how reproducible and localized the issue is, and how random things are.
Also, thanks for all the files / info ^^
It appears each time. The other ones sometimes change depending on the compilation flags (FPU=y/n, RVC=y/n).
So currently I have simulations running with linux-on-litex-vexriscv: ./sim.py --cpu-count 1 --dcache-width 64 --icache-width 64 --dcache-ways 1 --icache-ways 1 --without-coherent-dma --dtlb-size 4 --itlb-size 4 --dcache-size 4096 --icache-size 4096 --with-fpu --with-wishbone-memory --with-rvc, with your kernel image. Seems fine so far.
One thing about litex-hub/pythondata-cpu-vexriscv_smp@e8ce95b: you need to manually delete all the VexRiscvLitexSmpCluster_xxxx.v files in https://github.com/litex-hub/pythondata-cpu-vexriscv_smp/tree/master/pythondata_cpu_vexriscv_smp/verilog in order to force their regeneration (sorry, I forgot to mention it).
Also, were you testing with a fresh LiteX install? (To have an idea of the setup ^^)
Can you send me your dts ? (device tree)
I'm generating the .v outside of linux-on-litex-vexriscv, so no problem there! Yes, I have a fresh LiteX install (maybe one day old).
Here is my dts: dts.zip
I also tried simulating the system with Renode and everything worked fine; I was able to boot Linux, so I think the software stack is correctly generated.
Hmmm, so yeah, probably some hardware issue. The only thing I can think of to track down the issue / regression cause would be to test with litex-hub/pythondata-cpu-vexriscv_smp@e8ce95b (and deleting the VexRiscvLitexSmpCluster_xxxx.v files in pythondata_cpu_vexriscv_smp/verilog). Thing is, I don't have any SmartFusion2 hardware, and as it is a quite specific platform, the issue may be related to that. When you tried with litex-hub/pythondata-cpu-vexriscv_smp@e8ce95b, did you delete the VexRiscvLitexSmpCluster_xxxx.v files? If not, can you try? Thanks :D Simulations are still running on my side (no bug so far).
So I found something: your config maps main memory above 0x80000000. Thing is, currently VexRiscv is configured to consider everything above 0x80000000 as a non-cached memory region. So I will restart a simulation with an uncached memory region only :)
Here is the hardcoded uncached memory range specification: https://github.com/SpinalHDL/VexRiscv/blob/master/src/main/scala/vexriscv/demo/smp/VexRiscvSmpLitexCluster.scala
In your case it could be: ioRange = address => address(31 downto 28) < 0xB || address(31 downto 28) >= 0xD,
Sorry, I forgot to mention that I have the following diff in pythondata-cpu-vexriscv_smp:
diff --git a/src/main/scala/vexriscv/demo/smp/VexRiscvSmpLitexCluster.scala b/src/main/scala/vexriscv/demo/smp/VexRiscvSmpLitexCluster.scala
index 3454577..a58aca4 100644
--- a/src/main/scala/vexriscv/demo/smp/VexRiscvSmpLitexCluster.scala
+++ b/src/main/scala/vexriscv/demo/smp/VexRiscvSmpLitexCluster.scala
@@ -151,8 +151,8 @@ object VexRiscvLitexSmpClusterCmdGen extends App {
cpuConfigs = List.tabulate(cpuCount) { hartId => {
val c = vexRiscvConfig(
hartId = hartId,
- ioRange = address => address.msb,
- resetVector = 0,
+ ioRange = address => (address(31 downto 28) === 0x3),//0x3),
+ resetVector = 0xA0000000l,//0xA0000000l,
iBusWidth = iBusWidth,
dBusWidth = dBusWidth,
iCacheSize = iCacheSize,
@@ -321,4 +321,4 @@ object VexRiscvLitexSmpClusterOpenSbi extends App{
}
}
}
-}
\ No newline at end of file
+}
I may have found a potential problem. I have the following example code:
#include <stdint.h>

int main(){
    volatile uint32_t* v = (volatile uint32_t*)0x30012000;
    while (1){
        *v = *((volatile uint32_t*)0x30012004);
    }
    return 0;
}
What I'm seeing in my simulation is that when data_width is 32 bits, this code translates to what it should: a 32-bit read from address 0x30012004, followed by a 32-bit write to address 0x30012000. But when data_width is 64 bits, I see a 64-bit read from address 0x30012000, followed by a 32-bit write to address 0x30012000. This means that if address 0x30012000 holds a register with a read side effect, the extra read could change the expected behavior. That would explain why, when data_width is 32 bits, I'm seeing:
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 65024
[ 0.000000] Kernel command line: console=liteuart earlycon=liteuart,0x30012000 rootwait root=/dev/ram0
[ 9.982495] printk: console [liteuart0] enabled
[ 9.982495] printk: console [liteuart0] enabled
[ 9.990746] printk: bootconsole [liteuart0] disabled
[ 9.990746] printk: bootconsole [liteuart0] disabled
And when data_width is 64 bits, I'm seeing:
[ 0.000000] Built zonelists, mobility grouping o. Total pages: 65024
[ 0.0`000] Kernel command line: cons
[ 0.0`000] Unknown kernel command line parameters "con|", will be passed to user space.
[ 0.013150] printk: console [ty0] enabled
0 0.016269] printk: bootconsole [liteuart0] disabled
That would explain the weird UART behavior, and could also explain why the kernel fails to boot if something like that is happening anywhere else.
Hoo nice :) Hmm, maybe adding --wishbone-force-32b to the arguments list would help?
The simulation with the basic example goes back to working as expected; I'm now synthesizing with fpu=True to check whether it works in hardware. Is this 64-bit bus access "bug" a real bug, or is it a feature? I just saw a read transaction that shouldn't be there, but there may be other stuff like that which eventually caused Linux to fail at boot.
Thanks again for the help :)
So, I think it's mostly some mismatch between the Wishbone config on the SpinalHDL side and the LiteX side. So mostly, a configuration bug.
Probably we should just enforce the Wishbone bus to always be 32 bits, no matter what.
Thanks too ^^
With this configuration, I can't even boot the bios.bin. And I think it's got something to do with the fact that I'm now seeing some transactions with SEL=0 in the simulation.
Like this:
Or like this:
Currently, my architecture is just ignoring those accesses and returning ACK immediately without forwarding the transaction to the actual memory, but I don't know if that's what the CPU expects, or whether that may be what's causing the problems.
I got it to work. But when I got to Linux, I got the same errors as in the beginning: Linux starts to boot, but then it hits a lot of errors and ends with a kernel panic. Seems to be something specific to the FPU.
> And I think it's got something to do with the fact that I'm now seeing some transactions with SEL=0 in the simulation.
Yes, this can happen when a SC (store conditional) is rejected.
> Currently, my architecture is just ignoring those accesses and returning ACK immediately without forwarding that transaction to the actual memory, but I don't know if that's what the cpu expects and if that may be what's causing the problems.
That should be good.
So if I understand well, all the memory requests are going through the Wishbone bus in your case? No LiteDRAM involved, right? That's something I didn't try in simulation; I will try to reproduce it.
Yes, that's correct. Everything's going through the Wishbone bus and I'm not using LiteDRAM. Along with all the binaries, I could send you the Verilog I'm using if it helps reproduce the problem.
Sure that could help :)
Here is everything:
Hi, sorry for the delay :/
So I tried on my side on some hardware, and things seem to work fine in my tests (Linux 5.10 and 6.2 are OK with rv32imafdc). Which Linux kernel version was Buildroot using exactly? (Which git hash?)
Hello, no worries.
The git hash of the buildroot repo is 61ba55e9cce6884295e47fdf33554e6877bd0747 and the git hash for the linux repo inside buildroot is ae80e67c6b48bbedcd13db753237a25b3dec8301.
Hi, I'm not successfully reproducing the issue XD That's a pain XD
Do you have a way to run a simulation of your system until the crash happens?
I successfully booted Linux with FPU disabled. When I enable FPU, I get the following errors:
After that:
After that, some more warnings/errors, and finally:
This is all using the same bios.bin / opensbi.bin / Image / rootfs.cpio / rv32.dtb as the setup that works when the bitfile doesn't have the FPU (compiled with abi=ilp32 arch=rv32i).
I get the same errors if I specifically compile Linux/Buildroot using the FPU/ISA_F/ISA_D flags, and using abi=ilp32d and arch=rv32imafdc.
The configuration I'm using is:
--cpu-count 1 --dcache-width 64 --icache-width 64 --dcache-ways 1 --icache-ways 1 --without-coherent-dma --dtlb-size 4 --itlb-size 4 --dcache-size 4096 --icache-size 4096 --with-fpu --with-wishbone-memory --with-rvc
The board is a SmartFusion2 running at 80MHz without timing failures.
Please let me know if I can provide any more information.