About atomics, there is some support in VexRiscv to provide LR/SC in a local way; it only works for single-CPU systems.
Yeah, "dummy" implementations that work on single CPU systems should be perfectly fine.
As discussed at the Free Silicon Conference together with @Dolu1990, we are also working on it here: https://github.com/enjoy-digital/litex/issues/134.
We can continue the discussion here for the CPU aspect. @daveshah1: I saw you made some progress; just for info, @Dolu1990 is OK to help get things working. So if you see strange things or need help with things related to Spinal/VexRiscv, you can discuss your findings here.
My current status is that I have made quite a few hacks to the kernel, vexriscv and LiteX, but I'm still only just getting into userspace and not anywhere useful yet.
VexRiscv: https://github.com/daveshah1/VexRiscv/tree/Supervisor
Build config: https://github.com/daveshah1/VexRiscv-verilog/tree/linux
LiteX: https://github.com/daveshah1/litex/tree/vexriscv-linux
Kernel: https://github.com/daveshah1/litex-linux-riscv
@Dolu1990 I would be interested if you could look at 818f1f68686c75d7ee056d0da1843b98ade4b622 - loads were always reading 0xffffffff from virtual memory addresses when bit 10 of the offset (0x400) was set. This seems to fix it, but I'm not sure if a better fix is possible
As it stands, the current issue is a kernel panic ("Oops - environment call from S-mode") shortly after init starts. It seems that after a few syscalls it either isn't returning properly to userspace, or a spurious ECALL is accidentally triggered while in S-mode (it might be the ECALL getting "stuck" somewhere and lurking, so that what should be an IRQ triggers the ECALL instead).
Hi @daveshah1 @enjoy-digital :D
So, for sure we will hit bugs in VexRiscv, as only machine mode was properly tested. Things not tested enough in VexRiscv, which could have bugs:
I think the best would be to set up a minimal test environment to run Linux on. It would save us a lot of time and sanity, especially for a Linux port project :D So, to distinguish hardware bugs from software bugs, my proposal is that I set up a minimalistic environment where only the VexRiscv CPU is simulated and compared against an instruction-synchronised software model of the CPU (I already have one which does that, but CSRs are missing from it). This would point out exactly where the hardware diverges from what it should do, and bring serenity to the development ^.^
Does that sound good to you?
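For concreteness, a minimal sketch of the lockstep idea (names like golden_step and rtl_commit are illustrative placeholders, not the actual VexRiscv regression API): after each retired instruction, the state committed by the RTL is compared against the software model, so the first divergence is caught immediately:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct { uint32_t pc; uint32_t regs[32]; } CpuState;

/* Illustrative hooks: step the golden model by one instruction, and
 * fetch the state the RTL committed this cycle (e.g. sampled from
 * Verilator signals). */
extern void golden_step(CpuState *s);
extern int  rtl_commit(CpuState *out);

void check_lockstep(CpuState *golden)
{
    CpuState rtl;
    while (rtl_commit(&rtl)) {
        golden_step(golden);
        if (rtl.pc != golden->pc) {
            fprintf(stderr, "PC diverged: rtl=%08x model=%08x\n",
                    rtl.pc, golden->pc);
            exit(1);
        }
        for (int i = 0; i < 32; i++) {
            if (rtl.regs[i] != golden->regs[i]) {
                fprintf(stderr, "x%d diverged at pc=%08x\n", i, rtl.pc);
                exit(1);
            }
        }
    }
}
```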
That sounds very sensible! The peripheral requirements are minimal: just a timer (right now I have the LiteX timer connected to the timerInterruptS pin, and have hacked the kernel to talk to that directly rather than going through the proper SBI route for setting up a timer) and a UART of some kind.
My only concern with this is speed: right now it takes about 30s on hardware at 75MHz to get to the point of failure. So I definitely want to use Verilator and not iverilog...
I can easily set up a Verilator simulation. But 30s on hardware at 75MHz will still be a bit slow: we can expect about 1MHz execution speed, so that's still around 40 minutes...
I did just manage to make a bit of progress on hardware (perhaps this talk of simulators is scaring it into behaviour :smile:)
It does reach userspace successfully, so we can almost say Linux is working. If I set /bin/sh as init, then I can even use shell builtins - being able to run echo hello world counts as Linux, right? (but calls to other programs don't seem to work). init itself is segfaulting deep within libc, so there's still something fishy, but it could just be a dodgy rootfs.
@daveshah1 this is great. The libc segfault also happened in our Renode (https://github.com/renode/renode) emulation. Can you share the rootfs you're using?
This is the initramdisk from antmicro/litex-linux-readme with a small change to inittab to remove some references to files that don't exist
In terms of other outstanding issues, I also had to patch VexRiscv so that interrupts are routed to S-mode rather than M-mode. This broke the LiteX BIOS, which expects M-mode interrupts, so I had to patch it to not expect interrupts at all, but that means there is now no useful UART output from the BIOS. I think a proper solution would be to select the interrupt privilege dynamically somehow.
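A hedged sketch of what "dynamic" routing could look like if done in software rather than in the RTL: M-mode firmware delegates the supervisor interrupts through the standard mideleg CSR, so Linux receives them in S-mode while the BIOS, running before delegation is enabled, still sees M-mode interrupts (csr_write_mideleg is a thin illustrative wrapper):

```c
#include <stdint.h>

/* Supervisor software/timer/external interrupt bits, numbered as in
 * the RISC-V privileged spec (mip/mie/mideleg share this layout). */
#define IRQ_S_SOFT  (1u << 1)
#define IRQ_S_TIMER (1u << 5)
#define IRQ_S_EXT   (1u << 9)

static inline void csr_write_mideleg(uint32_t v)
{
    __asm__ volatile("csrw mideleg, %0" :: "r"(v));
}

/* Called once by M-mode firmware before jumping to the S-mode kernel;
 * exceptions would be delegated analogously through medeleg. */
void delegate_supervisor_interrupts(void)
{
    csr_write_mideleg(IRQ_S_SOFT | IRQ_S_TIMER | IRQ_S_EXT);
}
```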
We had to fix/work around the IRQ delegation. I think this code should be in our repo, but I'll check that again.
The segfault I see is:
```
[ 53.060000] getty[45]: unhandled signal 11 code 0x1 at 0x00000004 in libc-2.26.so[5016f000+148000]
[ 53.070000] CPU: 0 PID: 45 Comm: getty Not tainted 4.19.0-rc4-gb367bd23-dirty #105
[ 53.080000] sepc: 501e2730 ra : 501e2e1c sp : 9f9b2c60
[ 53.080000] gp : 00120800 tp : 500223a0 t0 : 5001e960
[ 53.090000] t1 : 00000000 t2 : ffffffff s0 : 00000000
[ 53.090000] s1 : 00000000 a0 : 00000000 a1 : 502ba624
[ 53.100000] a2 : 00000000 a3 : 00000000 a4 : 000003ef
[ 53.100000] a5 : 00000160 a6 : 00000000 a7 : 0000270f
[ 53.110000] s2 : 502ba5f4 s3 : 00000000 s4 : 00000150
[ 53.110000] s5 : 00000014 s6 : 502ba628 s7 : 502bb714
[ 53.120000] s8 : 00000020 s9 : 00000000 s10: 000003ef
[ 53.120000] s11: 00000000 t3 : 00000008 t4 : 00000000
[ 53.130000] t5 : 00000000 t6 : 502ba090
[ 53.130000] sstatus: 00000020 sbadaddr: 00000004 scause: 0000000d
```
The bad address (0x73730 in libc-2.26.so) seems to be in _IO_str_seekoff; the disassembly around it is:
```
73700: 00080c93 mv s9,a6
73704: 00048a13 mv s4,s1
73708: 000e0c13 mv s8,t3
7370c: 000d8993 mv s3,s11
73710: 010a0793 addi a5,s4,16
73714: 00000d93 li s11,0
73718: 00000e93 li t4,0
7371c: 00800e13 li t3,8
73720: 3ef00d13 li s10,1007
73724: 02f12223 sw a5,36(sp)
73728: 04092483 lw s1,64(s2)
7372c: 71648463 beq s1,s6,73e34 <_IO_str_seekoff@@GLIBC_2.26+0x41bc>
73730: 0044a783 lw a5,4(s1)
```
I checked the code, and it looks like everything has been pushed to GitHub.
As for the segfault: note that we had to reimplement the mapping code in Linux, plus there are some hacks in the Vex MMU itself. This could be the reason for the segfault, as user space starts using virtual memory very extensively.
For example, the whole kernel memory space is mapped directly, bypassing the MMU translation maps, see: https://github.com/antmicro/VexRiscv/blob/97d04a5243bbfee9d1dfe56857f3490da9fe1091/src/main/scala/vexriscv/plugin/MemoryTranslatorPlugin.scala#L116
The kernel range is defined in the MMU plugin instance: https://github.com/antmicro/VexRiscv/blob/97d04a5243bbfee9d1dfe56857f3490da9fe1091/src/main/scala/vexriscv/TestsWorkspace.scala#L98
I'm pretty sure there are many bugs hidden there :)
Ok, I will think about the best way to set up that test environment with the synchronised software golden model (to get maximum speed). About the golden model, I will complete it (the MMU part). I can do the CSRs too, but it would probably be best if somebody other than me cross-checked my interpretation of the privileged spec, because if both the hardware and the software golden model implement the same wrong interpretation, that's not so helpful ^^.
@enjoy-digital Maybe we can keep the current regression test environment of VexRiscv and just complete it with the required stuff. It's a bit dirty, but it should be fine. https://github.com/SpinalHDL/VexRiscv/blob/master/src/test/cpp/regression/main.cpp
The golden model is currently here: https://github.com/SpinalHDL/VexRiscv/blob/master/src/test/cpp/regression/main.cpp#L193
@Dolu1990: in fact i already have the verilator simulation that is working fine, just need improve it a little bit load more easily the vmlinux.bin/vmlinux.dtb and initramdisk to ram. But yes, we'll use what it more convenient for you. I'll look at the your regression env and your golden model.
@enjoy-digital Can you show me the verilator testbench sources :D ?
@kgugala Which CPU configuration are you using? Can you show me? (The test workspace you pointed to isn't using caches or the MMU)
The config I am using is at https://github.com/daveshah1/VexRiscv-verilog/blob/linux/src/main/scala/vexriscv/GenCoreDefault.scala (which has a few small tweaks compared to @kgugala's, to skip over FENCEs for example).
@enjoy-digital The checks between the golden model and the RTL are:
That should be enough to find divergences fast.
@daveshah1 Skipping the FENCE instruction is probably fine for the moment, but skipping FENCE.I isn't: there is no cache coherency between the instruction cache and the data cache.
You need to use the cache flush :) Is that used in some way?
(Memory coherency issues are something that is automatically caught by the golden model / RTL cross-checks)
As it stands it looks like all the memory has been set up as IO, which I suspect means the L1 caches won't be used at all - I think LiteX provides a single L2 cache.
Indeed, to get useful performance proper use of caches and cache flushes will be needed.
Yes, we disabled the caches as they were causing a lot of trouble. It didn't make sense to fight both the MMU and the caches at the same time.
@daveshah1 Ok ^^ One thing to know: the instruction cache does not support IO instruction fetch; instead, it caches those fetches. (Supporting IO instruction fetch costs area, and isn't really a useful thing, as far as I know?) So you still need to flush the instruction cache on FENCE.I. It could be done easily.
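Since the current config skips FENCE.I entirely, one hedged option (a sketch, not existing VexRiscv code; icache_flush_all is a placeholder for whatever flush mechanism the configured ICache plugin actually exposes) would be to make it trap instead, and do the flush in an M-mode handler:

```c
#include <stdint.h>

#define FENCE_I_INSN 0x0000100fu          /* canonical fence.i encoding */

extern void icache_flush_all(void);       /* platform/plugin specific */

static inline uint32_t csr_read_mepc(void)
{
    uint32_t v;
    __asm__ volatile("csrr %0, mepc" : "=r"(v));
    return v;
}

static inline void csr_write_mepc(uint32_t v)
{
    __asm__ volatile("csrw mepc, %0" :: "r"(v));
}

/* Called from the M-mode illegal-instruction path; returns 1 if the
 * trapped instruction was fence.i and has been emulated. */
int try_emulate_fence_i(uint32_t insn)
{
    if (insn != FENCE_I_INSN)
        return 0;
    icache_flush_all();                   /* make I$ see recent stores */
    csr_write_mepc(csr_read_mepc() + 4);  /* skip the emulated insn */
    return 1;
}
```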
@kgugala The cacheless plugins aren't aware of the MMU. I perfectly understand your point about avoiding the trouble of both at once. So my proposal is:
The roadmap would be:
TBH the real long-term solution will be to reimplement the MMU so it is fully compliant with the spec. Then we can get rid of the custom mapping code in Linux and restore the original mainline memory mapping code used for RV64.
I'm aware this will require a quite significant amount of work in Vex itself.
I don't think it would require that much work. An MMU is a relatively easy piece of hardware. I have to think about the cost, in terms of FPGA area, of a fully compliant MMU.
But what is the issue with a software-refilled MMU? If it uses machine mode to do the refill, it becomes transparent to the Linux kernel, right? So no Linux kernel modification is required, just a piece of machine-mode code in addition to the raw Linux port :) ?
Yes, I think an M-mode trap handler is the proper solution. We can probably use it to deal with any missing atomic instructions too.
(troll on) We should not forget the ultimate goal: RISC-V Linux on an iCE40 1K, I'm sure #28 would agree ^.^ (troll off)
It just may be difficult to push the custom mapping code to Linux mainline
The trap handler need not sit in Linux at all; it can be part of the bootloader.
@kgugala By mapping, do you mean the different flags of each MMU TLB entry in VexRiscv (https://github.com/SpinalHDL/VexRiscv/blob/master/src/main/scala/vexriscv/plugin/MemoryTranslatorPlugin.scala#L51)? If the given features aren't enough, I'm happy to fix that first.
@daveshah1 yes, it can. But that makes things even more complicated as two pieces of software will have to be maintained.
@Dolu1990 the flags were sufficient. One of the missing parts is variable map size. AFAIK right now you can only map 4k pages. This made mapping the whole kernel space impossible - the MMU's map table is too small to fit that many 4k entries. This is the reason we added the constant kernel-space mapping hack. Also, in user space, there are many mappings for different contexts. Those mappings are switched very often, so rewriting them every time, with 2 custom instructions for every 4k page, is very slow.
We haven't properly tested whether the reloading is done correctly, and whether the mappings are refreshed correctly in the MMU itself. This, IMO, is the reason for the segfault we're seeing in user space.
@kgugala the initial idea for handling pages bigger than 4KB was to just translate them on demand into 4KB ones in the TLB. For example: an access at virtual address 0x12345678, through a 16 MB page which maps 0x12xxxxxx to 0xABxxxxxx, triggers software emulation which adds to the TLB cache a 4KB entry mapping 0x12345xxx to 0xAB345xxx.
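A sketch of that on-demand split in C (the values mirror the example above; the struct is illustrative, not the actual VexRiscv TLB format):

```c
#include <stdint.h>

typedef struct { uint32_t vpage; uint32_t ppage; } TlbEntry4k;

/* Narrow a 16 MB megapage hit down to a single 4 KB TLB entry that
 * covers just the faulting address. */
TlbEntry4k split_megapage(uint32_t vaddr, uint32_t mega_vbase, uint32_t mega_pbase)
{
    uint32_t offset = vaddr - mega_vbase;          /* offset within the 16 MB page */
    TlbEntry4k e = {
        .vpage = vaddr & ~0xfffu,                  /* 0x12345678 -> 0x12345000 */
        .ppage = (mega_pbase + offset) & ~0xfffu,  /* 0xAB000000 + offset -> 0xAB345000 */
    };
    return e;
}

/* e.g. split_megapage(0x12345678, 0x12000000, 0xAB000000)
 * yields the 4 KB mapping 0x12345000 -> 0xAB345000. */
```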
But now that I think about it, maybe support for 16MB pages could be added with very little extra hardware over the existing solution.
The software model should also be able to indirectly pick up MMU translation errors :)
@Dolu1990: the simulation source is here: https://github.com/enjoy-digital/litex/blob/master/litex/utils/litex_sim.py and https://github.com/enjoy-digital/litex/tree/master/litex/build/sim
With a vmlinux.bin that has the .dtb appended, we can run Linux on mor1kx with:

```
litex_sim --cpu-type=or1k --ram-init=vmlinux.bin
```
For now, for VexRiscv, I was hacking the RAM initialization function to aggregate the vmlinux.bin, vmlinux.dtb and initramdisk.gz, but I'm thinking about using a .json file to describe how the RAM needs to be initialized:

```json
{
    "vmlinux.bin": "0x00000000",
    "vmlinux.dtb": "0x01000000",
    "initramdisk.gz": "0x01002000"
}
```

and then just do:

```
litex_sim --cpu-type=vexriscv --ram-init=ram_init_linux.json
```
The software right now maps the pages on demand.
@Dolu1990 The problem is that kernel space has to stay mapped the whole time. The whole kernel runs in S-mode in virtual memory. This space cannot be unmapped, because an interrupt/exception (including a TLB miss) may happen at any time. We cannot end up in a situation where a TLB miss causes a jump to a handler which is itself not mapped at that moment, causing another TLB miss. That would end in a terrible miss->handler->miss loop.
I have the userspace segfault issue seemingly fixed!
The problem was that the mapping code in the kernel was always mapping pages as RWX, but the kernel relies on pages being mapped read-only and triggering a fault on writes (e.g. for copy-on-write optimisations). Fixing that, and hacking the DBusCached plugin so that all write faults trigger a store page fault exception (the store fault exception was going to M-mode and causing problems; I need to look into the correct behaviour here), seems to result in a reliable userspace.
@enjoy-digital Ahh, ok, so it is a SoC-level simulation. I think the best would really be to stick to a raw CPU simulation in Verilator, to keep full control over the CPU, keep its raw nature, and keep simulation performance as high as possible to reduce sim time.
@kgugala This is the purpose of machine-mode emulation. Basically, in machine mode, the MMU translation is off, and the CPU can do all sorts of things without supervisor mode even being able to notice.
Here is the sequence for a user-space TLB miss (a sketch of such a refiller follows below):
- User space misses the TLB
- This triggers a machine-mode exception
- The machine-mode MMU software refiller checks the TLB stored in main memory
- If a matching in-memory TLB entry exists, it refills the hardware MMU and returns to user mode without the supervisor even knowing
- If there was no in-memory TLB entry to map the required access, it emulates a supervisor exception and hands execution back to the supervisor
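A hedged sketch of that refiller for Sv32 (tlb_insert and raise_to_supervisor stand in for VexRiscv-specific refill and trap mechanisms; splitting megapage leaves into 4KB entries, as discussed above, is omitted for brevity):

```c
#include <stdint.h>

#define PTE_V    0x1u   /* valid */
#define PTE_RWX  0xeu   /* any of R/W/X set => leaf entry */

extern void tlb_insert(uint32_t vaddr, uint32_t pte);            /* hypothetical refill op */
extern void raise_to_supervisor(uint32_t cause, uint32_t tval);  /* emulate an S-mode trap */

/* M-mode handler for a TLB miss at vaddr: walk the Sv32 table rooted
 * at satp, refill on a hit, or hand a page fault to the supervisor. */
void handle_tlb_miss(uint32_t vaddr, uint32_t cause)
{
    uint32_t satp;
    __asm__ volatile("csrr %0, satp" : "=r"(satp));

    uint32_t *l1 = (uint32_t *)((satp & 0x3fffffu) << 12);  /* root table */
    uint32_t pte = l1[(vaddr >> 22) & 0x3ffu];              /* level-1 entry */

    if ((pte & PTE_V) && !(pte & PTE_RWX)) {                /* pointer, not leaf */
        uint32_t *l0 = (uint32_t *)((pte >> 10) << 12);
        pte = l0[(vaddr >> 12) & 0x3ffu];                   /* level-0 entry */
    }

    if (pte & PTE_V)
        tlb_insert(vaddr, pte);            /* return to U-mode transparently */
    else
        raise_to_supervisor(cause, vaddr); /* supervisor sees a page fault */
}
```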
@daveshah1 this is awesome
@daveshah1 Great :D
What do you think about no-MMU support for Linux on RISC-V? Would it be possible? That would require hacking the kernel, instead of VexRiscv, of course.
Awesome @daveshah1!
liteeth is working too! Although the combination of lack of caching and expensive context switches means this takes the best part of a minute...
@daveshah1 what platform do you run it on? Do you run it with the ramdisk you shared before? I tried to run it and it seems to get stuck at:

```
[    0.000000] RAMDISK: gzip image found at block 0
```

I'm booting Linux commit d27b7d5cb658ccb9ade4bea6a12feb08ebdcc541
Reuploading the ramdisk just in case, but I don't think there have been any changes.
The kernel requires the LiteX timer to be connected to the VexRiscv timerInterruptS pin, and the cycle/cycleh CSRs to work. IME, 'stuck during boot' has generally been a timer-related problem.
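For reference, the usual RV32 pattern for reading the 64-bit counter through cycle/cycleh, with the re-read loop that guards against a rollover between the two halves (a standard idiom, not code from the kernel tree above):

```c
#include <stdint.h>

static inline uint64_t read_cycles(void)
{
    uint32_t hi, lo, hi2;
    do {
        __asm__ volatile("csrr %0, cycleh" : "=r"(hi));
        __asm__ volatile("csrr %0, cycle"  : "=r"(lo));
        __asm__ volatile("csrr %0, cycleh" : "=r"(hi2));
    } while (hi != hi2);   /* retry if cycle wrapped between reads */
    return ((uint64_t)hi << 32) | lo;
}
```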
My platform:
This must be the timer interrupt then. I'll add this to my test platform
Oh, I see you run it with the latest LiteX. I tried it on the system we used for the initial work (from December 2018). I have to rebase our changes.
I bumped all the parts and have it running on Arty :)
Awesome! I just pushed some very basic kernel-mode emulation of atomic instructions, which has improved software compatibility a bit (the current implementation I've done isn't actually atomic yet, as it ignores acquire/release for now...)
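To make the approach concrete, a hedged sketch of trap-and-emulate for one AMO (amoadd.w); like the implementation described above it ignores acquire/release, which is tolerable on a single core as long as interrupts stay masked inside the handler (the decode fields follow the RV32A encoding; the regs array is the trap frame's view of x0..x31):

```c
#include <stdint.h>

/* Emulate a trapped amoadd.w in the illegal-instruction handler.
 * insn is the faulting instruction word; regs holds x0..x31. */
void emulate_amoadd_w(uint32_t insn, uint32_t *regs)
{
    uint32_t rd  = (insn >> 7)  & 0x1fu;   /* destination register */
    uint32_t rs1 = (insn >> 15) & 0x1fu;   /* address register     */
    uint32_t rs2 = (insn >> 20) & 0x1fu;   /* addend register      */

    volatile uint32_t *addr = (volatile uint32_t *)regs[rs1];
    uint32_t old = *addr;        /* plain read-modify-write: not truly  */
    *addr = old + regs[rs2];     /* atomic, fine on one hart with       */
                                 /* interrupts masked                   */
    if (rd != 0)
        regs[rd] = old;          /* AMOs return the old memory value */
}
```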
@Dolu1990 If I were to use RiscvGolden as you have suggested, would I run it with

```
VexRiscv/src/test/cpp/regression$ make DEBUG_PLUGIN_EXTERNAL=yes
```

then connect OpenOCD with

```
openocd$ openocd -c "set VEXRISCV_YAML cpu0.yaml" -f tcl/target/vexriscv_sim.cfg
```

and then load vmlinux, the dtb and the initrd over gdb? I just want to make sure I'm using it as intended.
My intention in creating this issue is to collect/share information and gauge interest in running Linux on VexRiscv. From what I know, VexRiscv is still missing functionality, and it won't work out of the box.
A big problem is the MMU. Ideally, "someone" will hopefully write patches adding no-MMU support to Linux/RISC-V, but currently an MMU is required. It appears VexRiscv has a partial MMU implementation using a software-filled TLB. There needs to be machine-mode code to walk the page tables and fill the TLBs, and I didn't find a reference implementation of that.
Another issue is atomics. Linux currently requires them. There seems to be partial support in VexRiscv (a subset or so). Another possibility is patching the kernel to not use atomics when built without SMP support. There's also the question of how much atomics support userspace typically requires.
Without doubt there are more issues that I don't know about.
Antmicro apparently made a Linux port: https://github.com/antmicro/litex-rv32-linux-system https://github.com/antmicro/litex-linux-riscv I didn't know about this before and haven't managed to build the whole thing yet. Unfortunately, their Linux kernel repository does not include the git history. Here's a diff against the apparent base: https://0x0.st/z-li.diff
Please post any other information you know.