sld-columbia / esp

Embedded Scalable Platforms: Heterogeneous SoC architecture and IP integration made easy

Ariane Cache #61

Closed jctullos closed 3 years ago

jctullos commented 3 years ago

Hello,

I saw on the website that caches aren't implemented yet for Ariane. Is that still the case? Are there any branches that use a cache that I could test?

Thank you!

paulmnt commented 3 years ago

Hi @Kendidi. I think the problem is that you have installed Vivado Lab Edition. You can use it to program the FPGA, but the executable is vivado_lab and not vivado, so ESP doesn't find it. Please follow these steps to load the prebuilt image.

First source /home/aie/xilinx/tools/Xilinx/Vivado_Lab/2019.2/settings64.sh

Then copy the prebuilt image files into esp/socs/xilinx-vcu128-xcvu37p. The files you need are prom.bin, linux.bin and top.bit.

Next, create the generic programming script for Vivado by copying the following into esp/socs/xilinx-vcu128-xcvu37p/vivado/program.tcl (make vivado/program.tcl should generate the exact same script).

set fpga_host [lindex $argv 0]
set port [lindex $argv 1]
set part [lindex $argv 2]
set bit [lindex $argv 3]

open_hw
connect_hw_server -url $fpga_host:$port
puts "Connected to $fpga_host"
puts "Searching for $part..."

foreach cable [get_hw_targets ] {
    open_hw_target $cable
    set dev [get_hw_devices]
    if [string match -nocase "$part*" $dev] {
        puts "Programming $part ..."
        set_property PROGRAM.FILE $bit $dev
        program_hw_devices $dev
        close_hw_target
        disconnect_hw_server
        close_hw
        exit
    }
    close_hw_target
}

disconnect_hw_server
close_hw
error "ERROR: $part not found at host $fpga_host"

Now, compile esplink with make esplink

Finally, save the following script as esp/socs/xilinx-vcu128-xcvu37p/runme.sh. In it, I've replaced vivado with vivado_lab.

#!/bin/bash

vivado_lab -mode batch -quiet -notrace -source vivado/program.tcl -tclargs localhost 3121 xcvu37p top.bit
sleep 5
./esplink --reset
./esplink --brom -i prom.bin
./esplink --dram -i linux.bin
./esplink --reset

Make runme.sh executable with chmod +x runme.sh, then run it with ./runme.sh
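If the same script has to work on machines with either the full Vivado or the Lab Edition, one option is to pick whichever executable is on the PATH instead of hard-coding it. This is only a minimal sketch: the helper name find_first_cmd is made up for illustration and is not part of ESP.

```bash
#!/bin/bash
# Hypothetical helper (not part of ESP): print the name of the first
# command in the argument list that is found on PATH.
# Returns non-zero if none of the candidates is available.
find_first_cmd() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Possible use inside runme.sh (left commented out so that sourcing
# this file has no side effects):
# VIVADO_BIN=$(find_first_cmd vivado vivado_lab) || { echo "no Vivado executable on PATH" >&2; exit 1; }
# "$VIVADO_BIN" -mode batch -quiet -notrace -source vivado/program.tcl -tclargs localhost 3121 xcvu37p top.bit
```

Listing vivado first keeps the behavior unchanged on machines that have the full edition installed.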

Please let us know if this works. Thank you!

paulmnt commented 3 years ago

> @paulmnt
>
> Alright, so an update after building for the VC707:
>
> I'm able to boot a working PMP build on it. So the issue might be something specific to the VCU118 and the ESP/OpenPiton builds. I used the Ariane SDK build (which also uses Linux 5.1). The Ariane commit I checked out for a working build that successfully uses initramfs: openhwgroup/cva6@eef5ff6
>
> When I took that same build, added the apbuart to riscv-pk (with no additional changes to their Linux/Buildroot/Busybox config), and booted it on the ESP VCU118, I reached the same point as before: stuck at "Run /init as init process". That tells me there's nothing wrong with your Linux/Buildroot/Busybox config; the problem is hardware related and specific to the ESP build.

Thank you so much @jctullos for investigating this and checking that our Linux build is not the issue. I have tried the ESP VC707 and I get the same kernel panic when the init script runs, so I don't think the problem is related to the board selection. OpenPiton and ESP are fairly different, but they both run an older version of Ariane. If the behavior is similar, I think we need to look into the AXI fields in Ariane and try to understand what changes when PMP is enabled.

jctullos commented 3 years ago

Not a problem! Does the init script actually run though? The key question is whether it loads the initramfs. One thing I saw on the VC707: when I disabled the MMC driver in the Linux config but accidentally kept the SPI and Xilinx SPI drivers, the kernel hit RCU panics while the init script was running. But it at least got as far as running the init script, whereas the VCU118 build just hangs.

This is a picture from the VC707 build when I had the SPI/Xilinx SPI drivers enabled but not the MMC driver:

[image: console output from the VC707 build]

You can see that the initramfs was loaded successfully because the busybox logging script starts running. I'd be interested to know whether you see the same thing. If you do, that might point to an issue with the DDR4 memory on the VCU118 instead, which would narrow down your scope.

paulmnt commented 3 years ago

No, I did not see the printout "Starting logging: OK". It panics before that on both the VC707 and the VCU118, with and without the ESP cache hierarchy.

This is what I see:

[   17.813970] Segment Routing with IPv6
[   17.828840] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[   17.873323] NET: Registered protocol family 17
[   17.917126] Key type dns_resolver registered
[   18.072255] Freeing unused kernel memory: 4156K
[   18.086851] This architecture does not have kernel memory protection.
[   18.106928] Run /init as init process
[   18.135219] init[1]: unhandled signal 11 code 0x2 at 0x0000001555556db0 in ld-2.26.so[1555556000+17000]
[   18.169138] CPU: 0 PID: 1 Comm: init Not tainted 5.1.0-00008-g77bea43 #4
[   18.191720] sepc: 0000001555556db0 ra : 0000000000000000 sp : 0000003fffc30e50
[   18.216298]  gp : ffffffe000952c00 tp : 0000000000000000 t0 : 0000000000000000
[   18.238906]  t1 : 0000000000000000 t2 : 0000000000000000 s0 : 0000000000000000
[   18.261309]  s1 : 0000000000000000 a0 : 0000000000000000 a1 : 0000000000000000
[   18.283693]  a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
[   18.306194]  a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000
[   18.328551]  s2 : 0000000000000000 s3 : 0000000000000000 s4 : 0000000000000000
[   18.350905]  s5 : 0000000000000000 s6 : 0000000000000000 s7 : 0000000000000000
[   18.373271]  s8 : 0000000000000000 s9 : 0000000000000000 s10: 0000000000000000
[   18.395824]  s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000
[   18.418174]  t5 : 0000000000000000 t6 : 0000000000000000
[   18.434666] sstatus: 0000000200006020 sbadaddr: 0000001555556db0 scause: 0000000000000001
[   18.468175] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[   18.491388] CPU: 0 PID: 1 Comm: init Not tainted 5.1.0-00008-g77bea43 #4
[   18.511406] Call Trace:
[   18.518923] [<ffffffe000411270>] walk_stackframe+0x0/0xa0
[   18.535158] [<ffffffe00041146c>] show_stack+0x2a/0x34
[   18.550430] [<ffffffe0007db8ce>] dump_stack+0x6c/0x8a
[   18.565678] [<ffffffe000415808>] panic+0xe8/0x226
[   18.579820] [<ffffffe00041776e>] do_exit+0x72c/0x74a
[   18.594709] [<ffffffe00041810c>] do_group_exit+0x2a/0x82
[   18.610735] [<ffffffe00041f742>] get_signal+0xaa/0x54a
[   18.626298] [<ffffffe000410bae>] do_notify_resume+0x58/0x2ea
[   18.643261] [<ffffffe0004101fc>] ret_from_exception+0x0/0xc
[   18.660205] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

jctullos commented 3 years ago

Yup! I get the same sbadaddr and scause values on the ESP VCU118 build.
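For reference, the scause value in the trace can be decoded with the exception codes from the RISC-V privileged specification. In the log above scause is 1 (instruction access fault) and sbadaddr equals sepc, so the fault was raised by the instruction fetch itself. PMP violations are reported as access faults rather than page faults, so this is at least consistent with the PMP suspicion, although it does not prove it. The sketch below is a small lookup helper; the function name decode_scause is made up for illustration, and it only covers synchronous exceptions (values with the interrupt bit set are not handled).

```bash
#!/bin/bash
# Hypothetical helper: map a hexadecimal scause value, as printed by the
# kernel, to the synchronous exception name from the RISC-V privileged spec.
# Assumption: the interrupt bit (MSB) is not set.
decode_scause() {
    local hex="${1#0x}"          # accept the value with or without a 0x prefix
    local code=$(( 16#$hex ))    # the kernel prints scause in hexadecimal
    case "$code" in
        0)  echo "instruction address misaligned" ;;
        1)  echo "instruction access fault" ;;
        2)  echo "illegal instruction" ;;
        3)  echo "breakpoint" ;;
        4)  echo "load address misaligned" ;;
        5)  echo "load access fault" ;;
        6)  echo "store/AMO address misaligned" ;;
        7)  echo "store/AMO access fault" ;;
        8)  echo "environment call from U-mode" ;;
        9)  echo "environment call from S-mode" ;;
        12) echo "instruction page fault" ;;
        13) echo "load page fault" ;;
        15) echo "store/AMO page fault" ;;
        *)  echo "not decoded by this sketch (code $code)" ;;
    esac
}

# Example (commented out so that sourcing this file prints nothing):
# decode_scause 0000000000000001
```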

paulmnt commented 3 years ago

I am looking into the PMP implementation in Ariane. The goal is to track what new information (if any) is exposed to the bus to "tag" address accesses or, at least, to see whether any of the recent changes is likely to trigger the fault.

Kendidi commented 3 years ago

Thank you very much @paulmnt for your help!! It works. The FPGA was programmed, the binary files were loaded, and the kernel booted all the way to the "esp login" prompt.

Are the pre-built files using the latest source code? If I build the files from scratch, will it still boot, or will I encounter the PMP issue? Thanks.

davide-giri commented 3 years ago

That's great news @Kendidi ! The prebuilt for VCU128 uses the latest commit on the master branch of ESP. You won't encounter the PMP issue, because at the moment ESP doesn't point to the latest commit of the Ariane (aka cva6) repository, so the PMP is not included.

Kendidi commented 3 years ago

Cool! So which Ariane repository is ESP pointing to? Thanks.

paulmnt commented 3 years ago

The repo is the same, but it's pointing to an older commit: 465bb209a "wt_cache_subsystem: Fix spelling mistakes (#453)"
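To check which commit a local ESP checkout pins, the output of git submodule status can be filtered for the Ariane submodule. This is a rough sketch: the helper name pinned_commit is made up, and the submodule path is matched by substring because the exact location inside the ESP tree is not stated in this thread.

```bash
#!/bin/bash
# Hypothetical helper: read `git submodule status` lines on stdin and print
# "<sha> <path>" for every submodule whose path contains the given pattern.
# git may prefix the SHA with '-', '+' or 'U' (uninitialized, different from
# the recorded commit, merge conflict); that marker is stripped here.
pinned_commit() {
    local pattern="$1" sha path rest
    while read -r sha path rest; do
        case "$path" in
            *"$pattern"*) printf '%s %s\n' "${sha#[-+U]}" "$path" ;;
        esac
    done
}

# Possible use from the top of an ESP checkout (commented out so that
# sourcing this file has no side effects):
# git submodule status --recursive | pinned_commit ariane
```

The printed SHA can then be compared against the 465bb209a prefix mentioned above.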

Kendidi commented 3 years ago

Thanks @paulmnt !

As an experiment, I tried building from the existing ESP source code on my system to see whether the generated files work. I issued "make linux" and saw that prom.bin and linux.bin were updated. I then ran "runme.sh" and got the following panic. Any idea what the issue may be? Thanks.

bbl loader                                                                      
[    0.000000] OF: fdt: Ignoring memory range 0x80000000 - 0x80200000           
[    0.000000] Linux version 5.1.0-g77bea430730a-dirty (aie@aie-Machine) (gcc v0
[    0.000000] earlycon: sbi0 at I/O port 0x0 (options '')                      
[    0.000000] printk: bootconsole [sbi0] enabled                               
[    0.000000] initrd not found or empty - disabling initrd                     
[    0.000000] Reserved memory: created DMA memory pool at 0x00000000a0000000, B
[    0.000000] OF: reserved mem: initialized node buffer@A0000000, compatible il
[    0.000000] Zone ranges:                                                     
[    0.000000]   DMA32    [mem 0x0000000080200000-0x000000009fffffff]           
[    0.000000]   Normal   empty                                                 
[    0.000000] Movable zone start for each node                                 
[    0.000000] Early memory node ranges                                         
[    0.000000]   node   0: [mem 0x0000000080200000-0x000000009fffffff]          
[    0.000000] Initmem setup node 0 [mem 0x0000000080200000-0x000000009fffffff] 
[    0.000000] software IO TLB: mapped [mem 0x9b8fd000-0x9f8fd000] (64MB)       
[    0.000000] elf_hwcap is 0x112d                                              
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 128775    
[    0.000000] Kernel command line: earlyprintk console=hvc earlycon=sbi        
[    0.000000] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)  
[    0.000000] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)   
[    0.000000] Sorting __ex_table...                                            
[    0.000000] Memory: 441732K/522240K available (3904K kernel code, 233K rwdat)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1       
[    0.000000] rcu: Preemptible hierarchical RCU implementation.                
[    0.000000] rcu:     RCU event tracing is enabled.                           
[    0.000000]  Tasks RCU enabled.                                              
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 ji.
[    0.000000] NR_IRQS: 0, nr_irqs: 0, preallocated irqs: 0                     
[    0.000000] plic: mapped 16 interrupts with 1 handlers for 2 contexts.       
[    0.000000] riscv_timer_init_dt: Registering clocksource cpuid [0] hartid [0]
[    0.000000] clocksource: riscv_clocksource: mask: 0xffffffffffffffff max_cycs
[    0.000186] sched_clock: 64 bits at 37MHz, resolution 26ns, wraps every 2199s
[    0.026225] Calibrating delay loop (skipped), value calculated using timer f)
[    0.064851] pid_max: default: 32768 minimum: 301                             
[    0.087278] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes)      
[    0.107545] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes) 
[    0.145088] *** VALIDATE proc ***                                            
[    0.173814] rcu: Hierarchical SRCU implementation.                           
[    0.206259] devtmpfs: initialized                                            
[    0.241034] random: get_random_bytes called from setup_net+0x32/0x150 with c0
[    0.271356] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, ms
[    0.301697] futex hash table entries: 256 (order: 0, 6144 bytes)             
[    0.330616] NET: Registered protocol family 16                               
[    0.535651] usbcore: registered new interface driver usbfs                   
[    0.553781] usbcore: registered new interface driver hub                     
[    0.571738] usbcore: registered new device driver usb                        
[    0.604563] clocksource: Switched to clocksource riscv_clocksource           
[    0.655155] NET: Registered protocol family 2                                
[    0.690728] tcp_listen_portaddr_hash hash table entries: 256 (order: 0, 4096)
[    0.715424] TCP established hash table entries: 4096 (order: 3, 32768 bytes) 
[    0.739335] TCP bind hash table entries: 4096 (order: 3, 32768 bytes)        
[    0.761385] TCP: Hash tables configured (established 4096 bind 4096)         
[    0.783127] UDP hash table entries: 256 (order: 1, 8192 bytes)               
[    0.801400] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)          
[    0.825383] NET: Registered protocol family 1                                
[    0.850128] RPC: Registered named UNIX socket transport module.              
[    0.868804] RPC: Registered udp transport module.                            
[    0.883043] RPC: Registered tcp transport module.                            
[    0.897252] RPC: Registered tcp NFSv4.1 backchannel transport module.        
[    2.242773] workingset: timestamp_bits=62 max_order=17 bucket_order=0        
[    2.540842] NFS: Registering the id_resolver key type                        
[    2.556731] Key type id_resolver registered                                  
[    2.569418] Key type id_legacy registered                                    
[    2.581776] nfs4filelayout_init: NFSv4 File Layout Driver Registering...     
[    2.793278] io scheduler mq-deadline registered                              
[    2.808436] io scheduler kyber registered                                    
[    4.145684] Serial: GRLIB APBUART driver                                     
[    4.167553] 60000100.uart: ttyS0 at MMIO 0x60000100 (irq = 3, base_baud = 46T
[    4.199767] grlib-apbuart at 0x60000100, irq 3                               
[    4.225054] libphy: Fixed MDIO Bus: probed                                   
[    4.257928] libphy: greth-mdio: probed                                       
[    5.919794] grlib-greth 60080000.greth: assigned reserved memory node buffer0
[    5.956958] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver       
[    5.987725] usbcore: registered new interface driver usbhid                  
[    6.004809] usbhid: USB HID core driver                                      
[    6.040784] NET: Registered protocol family 10                               
[    6.081677] Segment Routing with IPv6                                        
[    6.096058] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver              
[    6.130435] NET: Registered protocol family 17                               
[    6.148563] Key type dns_resolver registered                                 
[    6.178768] Warning: unable to open an initial console.                      
[    6.197641] Failed to create /dev/root: -2                                   
[    6.211951] VFS: Cannot open root device "(null)" or unknown-block(0,0): err2
[    6.235532] Please append a correct "root=" boot option; here are the availa:
[    6.261619] Kernel panic - not syncing: VFS: Unable to mount root fs on unkn)
[    6.286336] CPU: 0 PID: 1 Comm: swapper Not tainted 5.1.0-g77bea430730a-dirt5
[    6.308415] Call Trace:                                                      
[    6.316041] [<ffffffe00007c06a>] walk_stackframe+0x0/0xa0                    
[    6.332081] [<ffffffe00007c266>] show_stack+0x2a/0x34                        
[    6.347203] [<ffffffe0004323e8>] dump_stack+0x20/0x28                        
[    6.362398] [<ffffffe00007fff2>] panic+0xe2/0x218                            
[    6.376502] [<ffffffe000000e96>] mount_block_root+0x1e2/0x27a                
[    6.393712] [<ffffffe000000fbc>] mount_root+0x8e/0x98                        
[    6.408886] [<ffffffe0000010e0>] prepare_namespace+0x11a/0x164               
[    6.426395] [<ffffffe000000b10>] kernel_init_freeable+0x184/0x1a0            
[    6.444738] [<ffffffe000445dea>] kernel_init+0x12/0xf0                       
[    6.460120] [<ffffffe00007aff0>] ret_from_exception+0x0/0xc                  
[    6.476894] ---[ end Kernel panic - not syncing: VFS: Unable to mount root f-

jctullos commented 3 years ago

@Kendidi

This happened to me earlier. It's caused by not running the build_riscv_toolchain script; make sure to run it from the utils/scripts directory. It builds the RISC-V toolchain and Buildroot, which in turn builds the initramfs.
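A quick way to tell this case apart from the other panics in this thread is to look at how far the boot log gets. The sketch below assumes the console output was saved to a file; the function name classify_boot_log and its labels are made up for illustration. It deliberately does not use the "initrd not found or empty" line as a signal, because that line can also appear when the initramfs is built into the kernel image.

```bash
#!/bin/bash
# Hypothetical helper: classify a saved boot log by how far the kernel got.
#   init-started : the kernel found /init in the initramfs and executed it
#   no-rootfs    : no usable initramfs, and no root device to fall back on
#   unknown      : neither marker is present in the log
classify_boot_log() {
    local log="$1"
    if grep -q 'Run /init as init process' "$log"; then
        echo "init-started"
    elif grep -q 'VFS: Cannot open root device' "$log"; then
        echo "no-rootfs"
    else
        echo "unknown"
    fi
}

# Example (commented out so that sourcing this file has no side effects):
# classify_boot_log boot.log
```

A "no-rootfs" result matches the log above and suggests the initramfs was never built into linux.bin; "init-started" followed by a panic points at something later in the boot.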

Kendidi commented 3 years ago

Ohh. I see. I will run it. Thanks @jctullos !

davide-giri commented 3 years ago

Correct, that step is part of the setup guide: https://www.esp.cs.columbia.edu/docs/setup/setup-guide/#software-toolchain.

Kendidi commented 3 years ago

I followed the steps from the beginning but had trouble installing the toolchain. The folder "/opt/riscv/bin" was not created; there is only a "sysroot" folder under "/opt/riscv/". Please advise.

......
......
/usr/bin/install -m 0644 support/misc/target-dir-warning.txt /home/aie/esp/20200929/esp/_riscv_build/buildroot/output/target/THIS_IS_NOT_YOUR_ROOT_FILESYSTEM
>>> skeleton-init-sysv  Extracting
>>> skeleton-init-sysv  Patching
>>> skeleton-init-sysv  Configuring
>>> skeleton-init-sysv  Building
>>> skeleton-init-sysv  Installing to target
rsync -a --ignore-times --exclude .svn --exclude .git --exclude .hg --exclude .bzr --exclude CVS --chmod=u=rwX,go=rX --exclude .empty --exclude '*~' package/skeleton-init-sysv//skeleton/ /home/aie/esp/20200929/esp/_riscv_build/buildroot/output/target/
>>> skeleton  Extracting
>>> skeleton  Patching
>>> skeleton  Configuring
>>> skeleton  Building
>>> skeleton  Installing to target
>>> toolchain-external-custom  Extracting
>>> toolchain-external-custom  Patching
>>> toolchain-external-custom  Configuring
Cannot execute cross-compiler '/opt/riscv/bin/riscv64-unknown-linux-gnu-gcc'
package/pkg-generic.mk:219: recipe for target '/home/aie/esp/20200929/esp/_riscv_build/buildroot/output/build/toolchain-external-custom/.stamp_configured' failed
make: *** [/home/aie/esp/20200929/esp/_riscv_build/buildroot/output/build/toolchain-external-custom/.stamp_configured] Error 1

=== Use the following to load RISCV environment ===
  export PATH=/opt/riscv/bin:$PATH
  export RISCV=/opt/riscv

*** Successfully installed RISCV toolchain to /opt/riscv ***

davide-giri commented 3 years ago

The error you are reporting is a consequence of an earlier error. What's the first error you get?
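One way to find that first error is to capture the whole build output in a file and search it from the top. This is a minimal sketch: the function name first_error and the log file name are made up, and the patterns are only a guess at common failure messages, so they may need adjusting.

```bash
#!/bin/bash
# Hypothetical helper: print the first line (with its line number) of a
# build log that looks like an error. The patterns are heuristics.
first_error() {
    grep -n -m 1 -E 'command not found|[Ee]rror[: ]|No such file or directory' "$1"
}

# Possible workflow (commented out so that sourcing this file has no side effects):
#   <run the toolchain script> 2>&1 | tee toolchain.log
#   first_error toolchain.log
```

Everything after the reported line number is usually fallout from that first failure.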

Kendidi commented 3 years ago

OK. Backtracking through the toolchain installation process showed that the package "bison" was missing. Now it appears something is being installed, because the system responds very slowly and the disk drive LED keeps flashing. Thanks!

Kendidi commented 3 years ago

It went a lot further and then encountered a lot of "g++: internal compiler error: Killed (program cc1plus)".

......
......
../.././riscv-gcc/gcc/c/c-typeck.c: In function ‘void error_init(location_t, const char*)’:
../.././riscv-gcc/gcc/c/c-typeck.c:6182:24: warning: format not a string literal and no format arguments [-Wformat-security]
   error_at (loc, gmsgid);
                        ^
../.././riscv-gcc/gcc/c/c-typeck.c: In function ‘void warning_init(location_t, int, const char*)’:
../.././riscv-gcc/gcc/c/c-typeck.c:6228:43: warning: format not a string literal and no format arguments [-Wformat-security]
   warned = warning_at (exploc, opt, gmsgid);
                                           ^
/bin/bash: line 3: 20195 Killed                  makeinfo --split-size=5000000 --no-split -I . -I ../.././riscv-gcc/gcc/doc -I ../.././riscv-gcc/gcc/doc/include -o doc/gcc.info ../.././riscv-gcc/gcc/doc/gcc.texi
g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.
../.././riscv-gcc/gcc/collect-utils.c: In function ‘pex_obj* collect_execute(const char*, char**, const char*, const char*, int, bool)’:
../.././riscv-gcc/gcc/collect-utils.c:195:37: warning: format not a string literal and no format arguments [-Wformat-security]
  fatal_error (input_location, errmsg);
                                     ^
g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.
g++: internal compiler error: Killed (program cc1plus)
g++: internal compiler error: Killed (program cc1plus)
......
......

davide-giri commented 3 years ago

It's possible your system ran out of memory. If you run dmesg right after the errors, there may be an "out of memory" message that confirms this hypothesis. What did you specify when the script asked how many threads to use? If my guess is right, running with fewer threads (like 1 or 2) should solve the issue.

Kendidi commented 3 years ago

Yup, Out of Memory.

I did not specify. I just pressed "ENTER". Any recommended number?

dmesg:

......
......
[47401.477906] [  21846]     0 21846    31645    16687   266240      896             0 cc1plus
[47401.477907] [  21847]     0 21847    27215    10339   233472     1295             0 cc1plus
[47401.477909] [  21849]     0 21849    30955    13893   266240      858             0 cc1plus
[47401.477910] [  21850]     0 21850    28725    13446   241664     1290             0 cc1plus
[47401.477911] [  21855]     0 21855    33902    16320   294912     1265             0 cc1plus
[47401.477912] [  21859]     0 21859    32858    16364   282624      810             0 cc1plus
[47401.477913] [  21862]     0 21862    32170    14521   278528     1886             0 cc1plus
[47401.477915] [  21866]     0 21866    34404    16985   299008     1123             0 cc1plus
[47401.477916] [  21871]     0 21871    30132    13097   266240     1851             0 cc1plus
[47401.477917] [  21872]     0 21872    26377     9385   229376     1105             0 cc1plus
[47401.477918] [  21873]     0 21873    31013    13099   270336     1692             0 cc1plus
[47401.477920] [  21886]  1000 21886    62056      218   417792        2             0 gnome-shell
[47401.477921] [  21941]     0 21941     7613     1333    94208        0             0 as
[47401.477923] [  21951]     0 21951     4190      170    69632        0             0 as
[47401.477924] [  21952]     0 21952     4190      254    73728        0             0 as
[47401.477925] [  21953]     0 21953      311        3    28672        0             0 as
[47401.477926] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=cc1plus,pid=21782,uid=0
[47401.477932] Out of memory: Killed process 21782 (cc1plus) total-vm:145996kB, anon-rss:78640kB, file-rss:8592kB, shmem-rss:0kB, UID:0 pgtables:300kB oom_score_adj:0
[47401.481110] oom_reaper: reaped process 21782 (cc1plus), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[47404.302775] systemd-journald[359]: /dev/kmsg buffer overrun, some messages lost.

davide-giri commented 3 years ago

So my guess was right, there is an "out of memory".

Pressing "ENTER" selects the default option, which is always listed between brackets. Specify 1 this time; it will be slow, but it should avoid the "out of memory" issue.
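If 1 turns out to be too slow, a thread count can be estimated from the available memory instead. The sketch below is only a rule of thumb: the function name suggest_jobs is made up, and the default budget of 2 GiB per compiler job is an assumption, not a measured requirement of the toolchain build.

```bash
#!/bin/bash
# Hypothetical helper: suggest a number of parallel build jobs.
#   $1 = available memory in kB
#   $2 = number of CPUs
#   $3 = memory budget per job in kB (assumed default: 2 GiB)
# The result is the smaller of the memory-based and CPU-based limits,
# and never less than 1.
suggest_jobs() {
    local mem_kb="$1" cpus="$2" per_job_kb="${3:-2097152}"
    local by_mem=$(( mem_kb / per_job_kb ))
    local jobs=$(( by_mem < cpus ? by_mem : cpus ))
    if (( jobs < 1 )); then
        jobs=1
    fi
    echo "$jobs"
}

# Possible use on Linux (commented out so that sourcing this file has no side effects):
# suggest_jobs "$(awk '/MemAvailable/ {print $2}' /proc/meminfo)" "$(nproc)"
# dmesg | grep -i 'out of memory'   # confirms whether the OOM killer ran
```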

Kendidi commented 3 years ago

@davide-giri

Yup. You are right. I tried '4' and everything seems to have built OK.

In my case (for a freshly installed Ubuntu 18.04 platform), I needed to run the following as well.

sudo apt-get install gawk
sudo apt-get install bison

Thanks a lot for your help!!

Kendidi commented 3 years ago

I ran the bbl+kernel built with ESP today against a bit file built from recent Ariane code and encountered the following.

bbl loader
[    0.000000] OF: fdt: Ignoring memory range 0x80000000 - 0x80200000
[    0.000000] Linux version 5.1.0-g77bea430730a-dirty (aie@aie-ROG-STRIX-Z390-E-GAMING) (gcc vers0
[    0.000000] earlycon: sbi0 at I/O port 0x0 (options '')
[    0.000000] printk: bootconsole [sbi0] enabled
[    0.000000] initrd not found or empty - disabling initrd   
[    0.000000] Zone ranges:                                   
[    0.000000]   DMA32    [mem 0x0000000080200000-0x00000000bfffffff]
[    0.000000]   Normal   empty                               
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000080200000-0x00000000bfffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000080200000-0x00000000bfffffff]
[    0.000000] software IO TLB: mapped [mem 0xbb1fd000-0xbf1fd000] (64MB)
[    0.000000] elf_hwcap is 0x112d
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 258055
[    0.000000] Kernel command line: earlyprintk console=hvc earlycon=sbi
[    0.000000] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
[    0.000000] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
[    0.000000] Sorting __ex_table...
[    0.000000] BUG: Bad page state in process swapper  pfn:80c01
[    0.000000] page:ffffffe03f02a038 count:1 mapcount:0 mapping:0000000000000000 index:0x0
[    0.000000] flags: 0x0()
[    0.000000] raw: 0000000000000000 0000000000000100 0000000000000200 0000000000000000
[    0.000000] raw: 0000000000000000 0000000000000000 00000001ffffffff
[    0.000000] page dumped because: nonzero _refcount
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.1.0-g77bea430730a-dirty #7
[    0.000000] Call Trace:
[    0.000000] [<ffffffe000409066>] walk_stackframe+0x0/0xa0
[    0.000000] [<ffffffe000409262>] show_stack+0x2a/0x34
[    0.000000] [<ffffffe0007bf522>] dump_stack+0x20/0x28
[    0.000000] [<ffffffe00045f6dc>] bad_page+0xe0/0xfe
[    0.000000] [<ffffffe00045f732>] free_pages_check_bad+0x38/0x7a
[    0.000000] [<ffffffe00045fd48>] free_pcppages_bulk+0x106/0x38e
[    0.000000] [<ffffffe0004603a4>] free_unref_page_commit.isra.27+0x86/0x8e
[    0.000000] [<ffffffe000460daa>] free_unref_page+0x40/0x54
[    0.000000] [<ffffffe000460dcc>] __free_pages.part.4+0xe/0x22
[    0.000000] [<ffffffe000460ff4>] __free_pages_core+0x94/0xa0
[    0.000000] [<ffffffe000005db6>] memblock_free_pages+0x12/0x1a
[    0.000000] [<ffffffe000008b4e>] memblock_free_all+0x190/0x1f4
[    0.000000] [<ffffffe0000023ee>] mem_init+0x2a/0x38
[    0.000000] [<ffffffe000000848>] start_kernel+0x1cc/0x360
[    0.000000] [<ffffffe000000076>] clear_bss_done+0x3a/0x3e
[    0.000000] Disabling lock debugging due to kernel taint
[    0.000000] BUG: Bad page state in process swapper  pfn:80c02
[    0.000000] page:ffffffe03f02a070 count:1 mapcount:0 mapping:0000000000000000 index:0x0
......
......

The bbl+kernel built yesterday (w/o initramfs system built) did not encounter this. Could this be PMP related or something else? Thanks.

Kendidi commented 3 years ago

The following is the result of using the pre-built top.bit, prom.bin and systest.bin downloaded today, plus the linux.bin that I built today with "make linux". Please advise whether I have missed any steps that may have caused the panic. Thanks a lot in advance!

bbl loader
[    0.000000] OF: fdt: Ignoring memory range 0x80000000 - 0x80200000
[    0.000000] Linux version 5.1.0-g77bea430730a-dirty (aie@aie-ROG-STRIX-Z390-E-GAMING) (gcc version 8.3.0 (GCC)) #4 PREEMPT Wed Sep 30 17:35:52 PDT 0
[    0.000000] earlycon: sbi0 at I/O port 0x0 (options '')
[    0.000000] printk: bootconsole [sbi0] enabled
[    0.000000] initrd not found or empty - disabling initrd
[    0.000000] Reserved memory: created DMA memory pool at 0x00000000a0000000, size 1 MiB
[    0.000000] OF: reserved mem: initialized node buffer@A0000000, compatible id shared-dma-pool
[    0.000000] Zone ranges:
[    0.000000]   DMA32    [mem 0x0000000080200000-0x000000009fffffff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000080200000-0x000000009fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000080200000-0x000000009fffffff]
[    0.000000] software IO TLB: mapped [mem 0x9b8fd000-0x9f8fd000] (64MB)
[    0.000000] elf_hwcap is 0x112d
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 128775
[    0.000000] Kernel command line: earlyprintk console=hvc earlycon=sbi
[    0.000000] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
[    0.000000] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
[    0.000000] Sorting __ex_table...
[    0.000000] Memory: 438092K/522240K available (3904K kernel code, 236K rwdata, 1161K rodata, 4124K init, 795K bss, 84148K reserved, 0K cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] rcu: Preemptible hierarchical RCU implementation.
[    0.000000] rcu:     RCU event tracing is enabled.
[    0.000000]  Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.000000] NR_IRQS: 0, nr_irqs: 0, preallocated irqs: 0
[    0.000000] plic: mapped 16 interrupts with 1 handlers for 2 contexts.
[    0.000000] riscv_timer_init_dt: Registering clocksource cpuid [0] hartid [0]
[    0.000000] clocksource: riscv_clocksource: mask: 0xffffffffffffffff max_cycles: 0x8a60dd6a9, max_idle_ns: 440795204056 ns
[    0.000178] sched_clock: 64 bits at 37MHz, resolution 26ns, wraps every 2199023255540ns
[    0.026255] Calibrating delay loop (skipped), value calculated using timer frequency.. 75.00 BogoMIPS (lpj=375000)
[    0.064848] pid_max: default: 32768 minimum: 301
[    0.087359] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes)
[    0.107624] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[    0.145007] *** VALIDATE proc ***
[    0.173853] rcu: Hierarchical SRCU implementation.
[    0.203860] devtmpfs: initialized
[    0.239443] random: get_random_bytes called from setup_net+0x32/0x150 with crng_init=0
[    0.269934] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    0.300275] futex hash table entries: 256 (order: 0, 6144 bytes)
[    0.329137] NET: Registered protocol family 16
[    0.537003] usbcore: registered new interface driver usbfs
[    0.555743] usbcore: registered new interface driver hub
[    0.573741] usbcore: registered new device driver usb
[    0.606923] clocksource: Switched to clocksource riscv_clocksource
[    0.658108] NET: Registered protocol family 2
[    0.693631] tcp_listen_portaddr_hash hash table entries: 256 (order: 0, 4096 bytes)
[    0.718143] TCP established hash table entries: 4096 (order: 3, 32768 bytes)
[    0.742071] TCP bind hash table entries: 4096 (order: 3, 32768 bytes)
[    0.764134] TCP: Hash tables configured (established 4096 bind 4096)
[    0.785864] UDP hash table entries: 256 (order: 1, 8192 bytes)
[    0.804178] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
[    0.828071] NET: Registered protocol family 1
[    0.852802] RPC: Registered named UNIX socket transport module.
[    0.871502] RPC: Registered udp transport module.
[    0.885732] RPC: Registered tcp transport module.
[    0.899967] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    8.989749] workingset: timestamp_bits=62 max_order=17 bucket_order=0
[    9.295323] NFS: Registering the id_resolver key type
[    9.311314] Key type id_resolver registered
[    9.323990] Key type id_legacy registered
[    9.336344] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
[    9.549426] io scheduler mq-deadline registered
[    9.564460] io scheduler kyber registered
[   10.924237] Serial: GRLIB APBUART driver
[   10.946223] 60000100.uart: ttyS0 at MMIO 0x60000100 (irq = 3, base_baud = 4687500) is a GRLIB/APBUART
[   10.979224] grlib-apbuart at 0x60000100, irq 3
[   11.004214] libphy: Fixed MDIO Bus: probed
[   11.038504] libphy: greth-mdio: probed
[   12.718269] grlib-greth 60080000.greth: assigned reserved memory node buffer@A0000000
[   12.755499] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[   12.786623] usbcore: registered new interface driver usbhid
[   12.803701] usbhid: USB HID core driver
[   12.840331] NET: Registered protocol family 10
[   12.881028] Segment Routing with IPv6
[   12.894957] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[   12.929866] NET: Registered protocol family 17
[   12.948112] Key type dns_resolver registered
[   12.979023] Warning: unable to open an initial console.
[   13.055678] Freeing unused kernel memory: 4124K
[   13.069504] This architecture does not have kernel memory protection.
[   13.088952] Run /init as init process
[   13.290321] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
[   13.313279] CPU: 0 PID: 1 Comm: init Not tainted 5.1.0-g77bea430730a-dirty #4
[   13.334544] Call Trace:
[   13.342158] [<ffffffe000409066>] walk_stackframe+0x0/0xa0
[   13.358206] [<ffffffe000409262>] show_stack+0x2a/0x34
[   13.373324] [<ffffffe0007bf522>] dump_stack+0x20/0x28
[   13.388511] [<ffffffe00040cff4>] panic+0xe2/0x218
[   13.402586] [<ffffffe00040e4e4>] do_exit+0x75a/0x778
[   13.417484] [<ffffffe00040ef2c>] do_group_exit+0x28/0x92
[   13.433437] [<ffffffe00040efae>] __wake_up_parent+0x0/0x22
[   13.449906] [<ffffffe000407fde>] ret_from_syscall+0x0/0xe
[   13.466171] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100 ]---

Kendidi commented 3 years ago

> The following is the result of using the pre-built top.bit, prom.bin and systest.bin downloaded today, plus the linux.bin that I built today with "make linux". Please advise whether I have missed any steps that may have caused the panic. Thanks a lot in advance!

This issue appears to be related to the changes I made to the config file ariane_defconfig. After I reverted the changes and rebuilt linux.bin, it boots successfully again. Thanks.

Kendidi commented 3 years ago

It appears riscv64-unknown-elf-gdb is not included in the toolchain build. Is it possible to enable it? Thanks.

jctullos commented 3 years ago

@Kendidi Yes, you would just enable it in the toolchain build script. If you go through it, you'll see where it leaves out the GDB build so that the toolchain builds faster.

jctullos commented 3 years ago

@paulmnt I'm going to close this so it doesn't keep accumulating messages. Sorry that so many different issues ended up in one GitHub issue!

Kendidi commented 3 years ago

@jctullos

Got it. Thank you!

Kendidi commented 3 years ago

> @paulmnt I'm going to close this so it doesn't keep accumulating messages. Sorry that so many different issues ended up in one GitHub issue!

My bad. Sorry! But appreciate all your help!!

paulmnt commented 3 years ago

Apologies for not managing this earlier. I've created two separate issues for further discussion: #65 and #66