litex-hub / linux-on-litex-rocket

Run 64-bit Linux on LiteX + RocketChip
BSD 2-Clause "Simplified" License

Issue with booting #27

Closed matsbror closed 1 year ago

matsbror commented 1 year ago

I have been able to successfully build the gateware and boot images following the instructions for the nexys4ddr board. I copied the generated boot.bin, initramfs.cpio, and digilent_nexys4ddr.bit to a USB stick (I also tried an SD card, with the same result) but get the following error:

--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
             Timeout
Booting from SDCard in SD-Mode...
Booting from boot.json...
Booting from boot.bin...
SDCard boot failed.
Booting from network...
Local IP: 192.168.1.50
Remote IP: 192.168.1.100
Booting from boot.json...
Booting from boot.bin...
Copying boot.bin to 0x80000000...
Network boot failed.
No boot medium found

--============= Console ================--

litex>

How can I debug this? I am suspecting a mismatch between the boot image and the core somehow. I have attached the entire boot sequence from power on.

I have set the jumpers for booting from USB/SD correctly. It worked with another litex system with a 32-bit processor.

boot.log

gsomlo commented 1 year ago

On Mon, Dec 05, 2022 at 08:35:22AM -0800, Mats Brorsson wrote:

How can I debug this? I am suspecting a mismatch between the boot image and the core somehow. I have attached the entire boot sequence from power on.

With BBL-based LiteX+Rocket, you're expected to build the initramfs cpio archive into the kernel, rather than load it in as a separate file. The initramfs goes inside vmlinuz, which goes inside bbl, resulting in a single boot.bin blob that gets loaded by the LiteX bios (either using tftp, or from the sdcard).

I have set the jumpers for booting from USB/SD correctly. It worked with another litex system with a 32-bit processor.

The jumpers will only help you get the bitstream loaded from sdcard. Once that's done (i.e., the fpga becomes a "computer"), the jumpers should no longer matter, it's down to the LiteX built-in bios from that point forward.

matsbror commented 1 year ago

Thanks, so the loading of the bitstream works then. But apparently the BIOS cannot read from the SD/USB. This is the process I would like to debug. Is there a way to do that? I realise now that even though I can load the FPGA bitstream from the USB stick, the BIOS probably cannot read from USB (but I got the same problem when I put boot.bin on the SD card).

I will try to send the boot.bin over the serial port instead.

matsbror commented 1 year ago

I just found this that you wrote on your web page "Booting from a µSD card (in SPI mode) is also currently supported.". I did not build the system with --with-spi-sdcard so I will try that.

gsomlo commented 1 year ago

On Mon, Dec 05, 2022 at 12:10:30PM -0800, Mats Brorsson wrote:

I just found this that you wrote on your web page "Booting from a µSD card (in SPI mode) is also currently supported.". I did not build the system with --with-spi-sdcard so I will try that.

I missed the part where you said "usb stick"... And you're right, once the bitstream is loaded, LiteX bios doesn't know how to handle a usb stick (not sure there's support to boot from a usb drive optionally available).

I would recommend --with-sdcard as it's a bit faster in Linux (but spi-sdcard should also work).

Don't forget to build a single unified boot.bin bundling bbl, linux, and the built-in initrd in a single file.

I'm switching to boot.json and individually loading opensbi, initrd, and kernel blobs in newer/future iterations, but the currently published rocket/linux/litex instructions using bbl require a single blob.

Good luck!

matsbror commented 1 year ago

So booting from SDcard still doesn't work. It seems not to recognise the boot image. When I try to boot from serial, it starts but fails with this message:

sL5DdSMmkekro
[LITEX-TERM] Received firmware download request from the device.
[LITEX-TERM] Uploading boot.bin to 0x40000000 (17920552 bytes)...
[LITEX-TERM] Upload calibration... (inter-frame: 10.00us, length: 64)
[LITEX-TERM] Got unexpected response from device 'b'E''

I use this command:

litex_term /dev/ttyUSB1 --serial-boot --kernel boot.bin

In Litex BIOS command line, I can detect the SD-card, but when I try to init it it fails:

litex> sdcard_init
Initialize SDCard... Failed.

This was with a SoC generated with --with-sdcard. I am now trying with --with-spi-sdcard.

matsbror commented 1 year ago

Ok, I got progress with --with-spi-sdcard

--============== Boot ==================--
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
             Timeout
Booting from SDCard in SPI-Mode...
Booting from boot.json...
boot.json file not found.
Booting from boot.bin...
Copying boot.bin to 0x80000000 (17920552 bytes)...
[########################################]
Executing booted program at 0x80000000

--============= Liftoff! ===============--

But it is still stuck at this stage, so I will rebuild busybox, the kernel and BBL to see if that helps.

gsomlo commented 1 year ago

On Mon, Dec 05, 2022 at 11:42:43PM -0800, Mats Brorsson wrote:

sL5DdSMmkekro
[LITEX-TERM] Received firmware download request from the device.
[LITEX-TERM] Uploading boot.bin to 0x40000000 (17920552 bytes)...
[LITEX-TERM] Upload calibration... (inter-frame: 10.00us, length: 64)
[LITEX-TERM] Got unexpected response from device 'b'E''

I use this command:

litex_term /dev/ttyUSB1 --serial-boot --kernel boot.bin

I don't have any experience loading things over serial. But loading boot.bin to 0x40000000 is wrong, on Rocket it should be 0x80000000.
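The load address can be overridden on the host side. The sketch below only assembles the command line from the invocation quoted above; the `--kernel-adr` option is an assumption about the installed litex_term version (its default would explain the 0x40000000 in the log), so confirm it with `litex_term --help` before relying on it. The helper name `serial_boot_command` is made up for illustration.

```python
import shlex

# Rocket expects boot.bin at the start of main RAM (per the comment above).
ROCKET_LOAD_ADDR = 0x80000000


def serial_boot_command(port, image, load_addr=ROCKET_LOAD_ADDR):
    """Build a litex_term argv that uploads `image` to `load_addr`.

    Assumption: this litex_term build accepts --kernel-adr.
    """
    return [
        "litex_term", port,
        "--serial-boot",
        "--kernel", image,
        "--kernel-adr", hex(load_addr),
    ]


# Print the command so it can be pasted into a shell.
print(shlex.join(serial_boot_command("/dev/ttyUSB1", "boot.bin")))
```

Passing the resulting list to `subprocess.run` would start the upload directly instead of printing the command.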

gsomlo commented 1 year ago

On Tue, Dec 06, 2022 at 12:20:35AM -0800, Mats Brorsson wrote:

Ok, I got progress with --with-spi-sdcard
[...]
Booting from boot.json...
boot.json file not found.
Booting from boot.bin...
Copying boot.bin to 0x80000000 (17920552 bytes)...
[########################################]
Executing booted program at 0x80000000

--============= Liftoff! ===============--

But it is still stuck at this stage, so I will rebuild busybox, the kernel and BBL to see if that helps.

That looks more promising. My guess is that there's a mismatch between the "computer" and whatever specific parameters were used in building the boot.bin image (wrong details in .dts/.dtb, missing config options in the kernel, etc.)

There are some pre-built binaries in https://github.com/litex-hub/linux-on-litex-rocket/issues/1 Maybe try one of them and see if it works for you -- that might help narrow down which part of your own process needs more tuning...

matsbror commented 1 year ago

The pre-built images did not work when booting from an SD card (because the SoC was not built for SPI, I presume). But thanks to your comment that I must load the boot image at 0x80000000, I could specify that to litex_term, and I am now loading your boot image serially, which is horribly slow. I don't know if it will work.

gsomlo commented 1 year ago

On Tue, Dec 06, 2022 at 06:57:49AM -0800, Mats Brorsson wrote:

The pre-built images did not work when booting from an SD card (because the SoC was not built for SPI, I presume). But thanks to your comment that I must load the boot image at 0x80000000, I could specify that to litex_term, and I am now loading your boot image serially, which is horribly slow. I don't know if it will work.

It's been a while, but IIRC the nexys4ddr bitstream was built using --with-ethernet --with-sdcard (note: not spi-sdcard!).

matsbror commented 1 year ago

It's been a while, but IIRC the nexys4ddr bitstream was built using --with-ethernet --with-sdcard (note: not spi-sdcard!).

Yes, that was what kept the boot from starting at all: the board did not recognize the SD card with --with-sdcard. Switching to spi-sdcard made it start booting, but it's still stuck at Liftoff.

matsbror commented 1 year ago

I have come as far as to figure out that the boot loader gets stuck right after calling the uart_sync() in boot.c.

I have put a printout right before and right after, and this is the printout:

--============= Liftoff! ===============--
Before uart sync
Af

It should have printed After uart_sync instead of just Af.

As far as I can see the UART is set up properly, and it can load a boot binary over the serial interface just fine (with the same behaviour as when loading the file from the SD card).

Changing to define UART_POLLING got the boot process a little further but it still hangs.

It should be noted that I can boot with the prebuilt bitstream and I am using the prebuilt boot.bin. So there's something wrong in how the SoC is built.

Any pointers, please?

Also, is there a way to change the BIOS without re-building the entire SoC?

gsomlo commented 1 year ago

On Tue, Dec 13, 2022 at 11:51:10PM -0800, Mats Brorsson wrote:

Any pointers, please?

Sounds like you might have the wrong MMIO register address for the UART somewhere in your DTS (check bootargs as well as the actual liteuart node itself).

Also, is there a way to change the BIOS without re-building the entire SoC?

I think there might be a way, but I've never figured it out myself. It probably involves loading the bios over the serial link with litex_term, or something like that. @enjoy-digital might be able to provide us with a better starting point...

matsbror commented 1 year ago

But the BIOS works well. I can interact with it.

matsbror commented 1 year ago

In the generated documentation the UART has address: 0x12004800 while in the .dts file (in the conf directory) it is 0x12006800. I wonder how the bios worked as it goes through the uart as well. I will try to change the dts.

gsomlo commented 1 year ago

On Wed, Dec 14, 2022 at 07:16:42AM -0800, Mats Brorsson wrote:

I wonder how the bios worked as it goes through the uart as well.

The bios uses its own internal hard-coded MMIO addresses to interact with the various peripherals. Linux comes with its own drivers, and uses whatever addresses are provided to it via the DTB.

On typical hardware, the DTB itself is hard-coded inside the bios and will always provide correct register addresses to Linux. That is not (yet) the case with LiteX. There's a litex/tools/litex_json2dts_linux.py helper script that should generate the DTS automatically, but it hasn't yet been adapted to work with Rocket CPUs.

The current status quo is that one has to double-check whether the MMIO addresses generated during the bitstream build (--csr-csv foo.csv) match those in the DTS file, and update them if necessary.
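That manual comparison can be scripted. The following is a minimal sketch, assuming the CSV contains rows of the form `csr_base,uart,0x...,,` and that the DTS names the UART in the `earlycon=liteuart,<addr>` boot argument and in a `serial@<addr>` node with a `reg` property. The function names and the inline sample data (built from the two addresses reported earlier in this thread) are illustrative only.

```python
import re


def csr_bases(csv_text):
    """Map peripheral name -> base address from 'csr_base,<name>,<addr>,,' rows."""
    bases = {}
    for line in csv_text.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) >= 3 and fields[0] == "csr_base":
            bases[fields[1]] = int(fields[2], 16)
    return bases


def dts_uart_addresses(dts_text):
    """Collect every place where the DTS text states a LiteUART address."""
    found = {}
    m = re.search(r"earlycon=liteuart,(0x[0-9a-fA-F]+)", dts_text)
    if m:
        found["bootargs earlycon"] = int(m.group(1), 16)
    m = re.search(r"serial@([0-9a-fA-F]+)\s*\{", dts_text)
    if m:
        found["node unit address"] = int(m.group(1), 16)
        # First reg property after the opening brace of the serial node.
        r = re.search(r"reg\s*=\s*<\s*(0x[0-9a-fA-F]+)", dts_text[m.end():])
        if r:
            found["node reg"] = int(r.group(1), 16)
    return found


def uart_mismatches(csv_text, dts_text):
    """Return the DTS locations whose address differs from csr.csv."""
    expected = csr_bases(csv_text)["uart"]
    return {where: addr
            for where, addr in dts_uart_addresses(dts_text).items()
            if addr != expected}


# Sample data only: csr.csv says 0x12004800, the DTS still says 0x12006800.
CSR_CSV = "csr_base,timer0,0x12004000,,\ncsr_base,uart,0x12004800,,\n"
DTS = '''
bootargs = "console=liteuart earlycon=liteuart,0x12006800 rootwait root=/dev/ram0";
liteuart0: serial@12006800 {
    compatible = "litex,liteuart";
    reg = <0x12006800 0x100>;
    status = "okay";
};
'''

for where, addr in uart_mismatches(CSR_CSV, DTS).items():
    print(f"{where}: DTS has {addr:#x}, csr.csv has {csr_bases(CSR_CSV)['uart']:#x}")
```

The regular expressions only cover the UART; other peripherals (Ethernet, SD card, PLIC) would need the same comparison against their own DTS nodes.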

Patching json2dts to teach it about Rocket is an item on my to-do list... :)

matsbror commented 1 year ago

Thanks! That brought me one step further. The output now ends with:

bbl loader
              vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
                  vvvvvvvvvvvvvvvvvvvvvvvvvvvv
rrrrrrrrrrrrr       vvvvvvvvvvvvvvvvvvvvvvvvvv
rrrrrrrrrrrrrrrr      vvvvvvvvvvvvvvvvvvvvvvvv
rrrrrrrrrrrrrrrrrr    vvvvvvvvvvvvvvvvvvvvvvvv
rrrrrrrrrrrrrrrrrr    vvvvvvvvvvvvvvvvvvvvvvvv
rrrrrrrrrrrrrrrrrr    vvvvvvvvvvvvvvvvvvvvvvvv
rrrrrrrrrrrrrrrr      vvvvvvvvvvvvvvvvvvvvvv
rrrrrrrrrrrrr       vvvvvvvvvvvvvvvvvvvvvv
rr                vvvvvvvvvvvvvvvvvvvvvv
rr            vvvvvvvvvvvvvvvvvvvvvvvv      rr
rrrr      vvvvvvvvvvvvvvvvvvvvvvvvvv      rrrr
rrrrrr      vvvvvvvvvvvvvvvvvvvvvv      rrrrrr
rrrrrrrr      vvvvvvvvvvvvvvvvvv      rrrrrrrr
rrrrrrrrrr      vvvvvvvvvvvvvv      rrrrrrrrrr
rrrrrrrrrrrr      vvvvvvvvvv      rrrrrrrrrrrr
rrrrrrrrrrrrrr      vvvvvv      rrrrrrrrrrrrrr
rrrrrrrrrrrrrrrr      vv      rrrrrrrrrrrrrrrr
rrrrrrrrrrrrrrrrrr          rrrrrrrrrrrrrrrrrr
rrrrrrrrrrrrrrrrrrrr      rrrrrrrrrrrrrrrrrrrr
rrrrrrrrrrrrrrrrrrrrrr  rrrrrrrrrrrrrrrrrrrrrr

       INSTRUCTION SETS WANT TO BE FREE

Now it remains to see why it's stuck there.

gsomlo commented 1 year ago

On Wed, Dec 14, 2022 at 11:52:54AM -0800, Mats Brorsson wrote:

Now it remains to see why it's stuck there.

Means BBL is running, but Linux is still unhappy. Might want to try enabling CONFIG_RISCV_SBI_V01 in your kernel .config, I've had reports that it might make a difference.

I haven't used BBL in a while (in fact, I need to update the README file with instructions on how to use opensbi instead). It may be that CONFIG_RISCV_SBI_V01 used to be on by default in riscv defconfig, and was disabled in the interim.

matsbror commented 1 year ago

I enabled that flag, which did not really change anything. However, loading the bitstream from the USB instead of the SD-card made a difference and now it starts to boot Linux but hangs after:

[   15.994106] handlers:
[   15.996340] [<(____ptrval____)>] liteuart_interrupt
[   16.001204] Disabling IRQ #1

I have attached the build log of the bitstream and the full boot log. If you have time to look at it, I would appreciate it very much.
boot.log build.log

gsomlo commented 1 year ago

For some reason, your kernel crashes while attempting to service a UART interrupt. I've recently added IRQ support for LiteUART to https://github.com/litex-hub/linux/tree/litex-rebase, and it's been working better for me than in polling mode (which used to be the only supported mode of operation).

If it's the latest litex-hub/litex-rebase kernel sources you're using, you might try dropping the top-most commit ("serial: liteuart: add irq support"), or commenting out the IRQ configuration from the uart's DTS node: https://github.com/litex-hub/linux-on-litex-rocket/blob/master/conf/nexys4ddr_fpu.dts#L110-L111 to force the uart back into polling-mode.

matsbror commented 1 year ago

I will be trying that.

A question: when I set CONFIG_RISCV_SBI_V01=y I get two prompts that I do not really understand:

Early console using RISC-V SBI (SERIAL_EARLYCON_RISCV_SBI) [N/y/?] (NEW)

and

RISC-V SBI console support (HVC_RISCV_SBI) [N/y/?] (NEW)

What is the proper answer to these?

matsbror commented 1 year ago

I answered y to both of the questions before and with the next to last commit for the linux image, I can now boot to prompt. Thanks for all the help.

gsomlo commented 1 year ago

On Thu, Dec 15, 2022 at 12:31:13PM -0800, Mats Brorsson wrote:

Early console using RISC-V SBI (SERIAL_EARLYCON_RISCV_SBI) [N/y/?] (NEW)

and

RISC-V SBI console support (HVC_RISCV_SBI) [N/y/?] (NEW)

Yes to both won't hurt -- it allows support for using ecalls into machine mode where BBL itself will send/receive characters from the UART. Instead of interacting with LiteUART itself (via MMIO register reads and writes, and IRQs) the linux kernel can trap into machine mode and let the hypervisor (BBL) deal with console i/o.

You'd have to modify your DTS (bootargs) accordingly, and probably disable LiteUART driver support from Linux altogether as you wouldn't actually need it anymore.

It's a deprecated option, so therefore not future-proof, but could be useful to get something working in a hurry.
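To see which of these options actually ended up in a kernel build, the generated .config can be inspected. This is a small sketch that relies only on the standard .config syntax (`CONFIG_X=y` and `# CONFIG_X is not set`); the helper names and the sample text are illustrative, not part of any kernel tooling.

```python
# The three options discussed above, with the CONFIG_ prefix used in .config.
WANTED = (
    "CONFIG_RISCV_SBI_V01",
    "CONFIG_SERIAL_EARLYCON_RISCV_SBI",
    "CONFIG_HVC_RISCV_SBI",
)


def kconfig_values(config_text):
    """Parse .config text into {option: value}; disabled options map to 'n'."""
    values = {}
    suffix = " is not set"
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            values[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            values[name] = value
    return values


def sbi_console_report(config_text):
    """Report the state of each SBI-related option ('absent' if never mentioned)."""
    values = kconfig_values(config_text)
    return {name: values.get(name, "absent") for name in WANTED}


# Sample .config fragment for demonstration only.
SAMPLE = """\
CONFIG_RISCV_SBI=y
CONFIG_RISCV_SBI_V01=y
# CONFIG_SERIAL_EARLYCON_RISCV_SBI is not set
"""

for name, state in sbi_console_report(SAMPLE).items():
    print(f"{name}: {state}")
```

An option reported as "absent" usually means a dependency is not enabled, so it never appeared in the configuration prompts.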

xanxuso commented 1 month ago

In the generated documentation the UART has address: 0x12004800 while in the .dts file (in the conf directory) it is 0x12006800. I wonder how the bios worked as it goes through the uart as well. I will try to change the dts.

Hello, @matsbror. I ran into the same issue as you; could you give me some help? Thank you! printf worked before uart sync but failed after it. I have checked the UART address in the generated csr.csv (0x12005800):

csr_base,ctrl,0x12000000,,
csr_base,ddrphy,0x12000800,,
csr_base,identifier_mem,0x12001000,,
csr_base,leds,0x12001800,,
csr_base,sdcard_block2mem,0x12002000,,
csr_base,sdcard_core,0x12002800,,
csr_base,sdcard_irq,0x12003000,,
csr_base,sdcard_mem2block,0x12003800,,
csr_base,sdcard_phy,0x12004000,,
csr_base,sdram,0x12004800,,
csr_base,timer0,0x12005000,,
csr_base,uart,0x12005800,,

and in the .dts file (0x12005800):

bootargs = "console=liteuart earlycon=liteuart,0x12005800 rootwait root=/dev/ram0";
.......
liteuart0: serial@12005800 {
    compatible = "litex,liteuart";
    reg = <0x12005800 0x100>;

    status = "okay";
};

So I am really confused about that. @gsomlo Did you patch 'json2dts' successfully? Can you give me some help, thank you!

gsomlo commented 1 month ago

On Wed, Jul 17, 2024 at 01:13:04AM -0700, xanxuso wrote:

So I am really confused about that. @gsomlo Did you patch 'json2dts' successfully? Can you give me some help, thank you!

json2dts did get patched to a certain extent, but pasting in the relevant CPU properties from the variant-appropriate pythondata-cpu-rocket sample dts file is still a remaining challenge, afaik.

the csr.csv data is the ultimate authority on the value of SoC MMIO registers, so I'd recommend updating the .dts file to match those numbers.

finally, after the latest update (commit b9b92a5), bbl support has been deprecated in favor of opensbi. nexys-video, ecpix5, and acorn-baseboard are supported, but it should be relatively easy to adapt e.g. nexys-video.dts for nexys-ddr if that's the board you have.

xanxuso commented 1 month ago

Hi, @gsomlo. I have got Linux working on the Rocket SoC. Last week I checked everything related to csr.csv and the .dts, and still found no mismatched or strange entries.

In the other repository, linux-on-litex-vexriscv, the whole process went smoothly. That repo uses a new PLATFORM named 'litex/vexriscv' instead of 'generic', so I imitated this scheme and created a new platform 'rocket'. I only modified the FW_* settings in objects.mk and DEFAULT_UART_ADDR/DEFAULT_PLIC_ADDR/DEFAULT_CLINT_ADDR in platform.c. After rebuilding, OpenSBI and Linux worked! Here is the log: rocket.log

However, I still have no idea why platform/generic didn't work. Maybe you can check that and help me figure it out? Thank you!

Most importantly, I want to run Linux on the OpenC906. I have tried the same scheme as for Rocket, but it gets stuck at Liftoff! and OpenSBI fails to start again. Can you help? It would also be fine to just point me to the important places to modify. Thank you!

gsomlo commented 1 month ago

On Fri, Jul 26, 2024 at 02:51:39AM -0700, xanxuso wrote:

However, I still have no idea why the platform/generic didn't work.

From the log, it appears you're loading the dtb as a separate, explicit blob. That's probably ok (one valid way to do it). However, with platform/generic, if you follow the new/updated linux-on-litex-rocket instructions, I've been embedding the dtb into the opensbi firmware, and making sure opensbi won't overwrite the loaded kernel with it (using FW_JUMP_FDT_ADDR).

Honestly, it could be anything, there are many "moving parts" and it's hard to imagine what went wrong without the ability to try and do it myself on your specific board model...

Most importantly, I want to run Linux on the OpenC906. I have tried the same scheme as for Rocket, but it gets stuck at Liftoff! and OpenSBI fails to start again. Can you help? It would also be fine to just point me to the important places to modify.

The important thing to keep in mind is that Rocket, unlike most (all?) of the other RV64* cpu models supported with LiteX, is built as its own SoC: its several cpu cores are connected together with an internal interconnect (I think they call it TileLink), which also routes memory, MMIO, and DMA over separate externally-facing axi ports, and, in addition, has an internal L1 cache that is kept coherent in the face of all the memory and DMA traffic to/from the outside.

Most other CPU models have a simpler architecture, and simply "dropping one in" as a replacement to Rocket won't be enough to get the same level of functionality out of it, so the answer to this is, unfortunately, "It's Complicated"... :)

Replacing Rocket with another RV64GC litex core, while maintaining the ability to boot Fedora on the resulting LiteX SoC, is an active goal I have on my to-do list. I would strongly prefer that said replacement core be written in [System]Verilog, and not in some Scala-derived meta-language such as Chisel or Spinal, which causes long-term maintainability and bus-factor risks...

xanxuso commented 1 month ago

Hi, @gsomlo. Thank you for your help. I have updated the branch of the 'opensbi' repository. Following the instructions with platform/generic, I got Linux to work on Rocket. Maybe there were some bugs in the previous branch. Here is the log: rocket.log

As the next step, I replaced Rocket with OpenC906. In the simulation environment, it got stuck at Boot HART MEDELEG, with no Linux output. Eventually I added CONFIG_ARCH_THEAD and CONFIG_ERRATA_THEAD to the linux_defconfig. After rebuilding, Linux worked. Here is the log: c906_sim.log

However, when I loaded the bitstream onto my board, it failed at f_mount(&fs, "", 1) in the sdcardboot step (c906.log). The board and the sdcard are the same ones used with Rocket, where they were verified to work. I added SDCARD_DEBUG and some printf output in ff.c/mount_volume. The call 'fmt = find_volume(fs, LD2PT(vol));' returns fmt=3, which means "No FAT volume is found". I have no idea why; can you give me some suggestions? Thank you!

gsomlo commented 1 month ago

On Tue, Jul 30, 2024 at 03:11:24AM -0700, xanxuso wrote:

can you give me some suggestions?

Can you boot from the sdcard (as opposed to, e.g., netboot)? If yes, this may be worth pursuing further. If not, chances are that DMA isn't working properly (that whole dedicated DMA port and pass-through routing I was talking about that Rocketchip does)...

xanxuso commented 1 month ago

Hello, @gsomlo. I can boot from serial or net. Here are the logs: c906_netboot.log c906_serialboot.log. However, booting from the sdcard still fails. As mentioned last time, Rocket boots from the sdcard successfully.

If not, chances are that DMA isn't working properly (that whole dedicated DMA port and pass-through routing I was talking about that Rocketchip does)

Can you explain it in detail or tell me what should be modified? Thank you very much!

gsomlo commented 1 month ago

Can you explain it in detail or tell me what should be modified?

Here's how Rocket (and, generally, CPU "assemblies" that internally route DMA) are wired into the overall LiteX SoC:

        --------------        ------------
        |        mem |<------>| LiteDRAM |
        |   Rocket   |        ------------
        | mmio   dma |    
        --------------
           /\     ^
  LiteX WB ||     | Dedicated DMA
     Bus   \/     v     Bus
        ------------------
        | periph. (e.g., |
        | sdcard, sata,  |
        | etc.)          |
        ------------------

In contrast, something like e.g., VexRiscV (and other cores that don't directly get involved in routing DMA) are wired up like this:

--------------
|            |  ------------
|  VexRiscV  |  | LiteDRAM |
|   i, d     |  ------------
--------------    ^
    ^  ^          |
    |  |          |
    v  v          v
========================= Common LiteX Wishbone Bus
   ^      ^
   |      |
   v      v
------------------
| periph. (e.g., |
| sdcard, sata,  |
| etc.)          |
------------------

The LiteX bus routes the CPU's instruction fetch and data read/write requests to the appropriate target (peripheral vs. main RAM), and also routes peripheral DMA master requests directly to LiteDRAM (which acts as the DMA slave). When DMA happens, if the CPU core has any internal cache, it will fall out of coherence with LiteDRAM, and will need to be flushed explicitly to avoid weird conflicts.
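The stale-cache effect described above can be illustrated with a toy model. The sketch below is not LiteX code and does not model any real core; `ToySystem` and its methods are invented for illustration. It only shows why a CPU that caches a buffer, and is not told about a peripheral's DMA write, keeps reading old data until the cache is flushed or invalidated.

```python
class ToySystem:
    """Toy model: main RAM plus a CPU-side cache that DMA writes bypass."""

    def __init__(self, size):
        self.ram = [0] * size
        self.cache = {}  # address -> cached value

    def cpu_read(self, addr):
        # On a miss, fill the cache from RAM; on a hit, RAM is not consulted.
        if addr not in self.cache:
            self.cache[addr] = self.ram[addr]
        return self.cache[addr]

    def dma_write(self, addr, data):
        # The peripheral writes straight to RAM; the cache is not updated.
        for offset, value in enumerate(data):
            self.ram[addr + offset] = value

    def flush_cache(self):
        # Drop all cached lines so the next read goes back to RAM.
        self.cache.clear()


system = ToySystem(size=16)
system.cpu_read(0)              # CPU touches the buffer before the transfer
system.dma_write(0, [0x55])     # e.g. the sdcard core delivers a sector byte
stale = system.cpu_read(0)      # still the old cached value
system.flush_cache()
fresh = system.cpu_read(0)      # now the DMA data is visible
print(f"before flush: {stale:#x}, after flush: {fresh:#x}")
```

In this model a filesystem driver reading a sector buffer before the flush would see zeros instead of the boot sector, which is one way a "no FAT volume" result could arise even with a good card.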

The place where this "wiring up" happens is here and here, and the place where rocket declares itself capable of internally routing mmio vs. memory vs. dma traffic separately over dedicated ports is here.

My advice would be to find an existing CPU model that is closest (in terms of one or the other of the above "wiring diagrams") to what you're trying to use, and ensure DMA-capable peripherals are also wired up appropriately, similarly to the existing model you're using as a starting point.

xanxuso commented 4 weeks ago

Thank you for your explanation, @gsomlo! Sorry to trouble you with another question. When I build the opensbi firmware for Rocket with FW_TEXT_START=0x8000_0000, it gets stuck at Liftoff. With FW_TEXT_START=0x0 (the default value), it works, and the SBI banner prints "Firmware Base : 0x80000000", which means scratch->fw_start = 0x80000000. In fw_base.S, _scratch_init stores _fw_start to scratch->fw_start:

lla a4, _fw_start
sub a5, t3, a4
REG_S   a4, SBI_SCRATCH_FW_START_OFFSET(tp)
REG_S   a5, SBI_SCRATCH_FW_SIZE_OFFSET(tp)

However, in fw_base.ldS, _fw_start should equal FW_TEXT_START=0x0:

. = FW_TEXT_START;
/* Don't add any section between FW_TEXT_START and _fw_start */
PROVIDE(_fw_start = .);

In fw.md, the definition of FW_TEXT_START is "the compile time address of the OpenSBI firmware". What is the difference between that and the executing address? That's not all: FW_JUMP_ADDR should be 0x80200000. If _fw_start=0x0, it cannot be established.

fw_next_arg1:
#ifdef FW_JUMP_FDT_ADDR
    li  a0, FW_JUMP_FDT_ADDR
#elif defined(FW_JUMP_FDT_OFFSET)
    lla a0, _fw_start
    li  a1, FW_JUMP_FDT_OFFSET
    add a0, a0, a1
#else
    add a0, a1, zero
#endif

So, is there something wrong with my understanding? Maybe _fw_start doesn't equal FW_TEXT_START? I would appreciate it if you could answer the above questions, thank you.

gsomlo commented 3 weeks ago

On Thu, Aug 08, 2024 at 03:04:22AM -0700, xanxuso wrote:

Thank you for your explanation, @gsomlo! Sorry to trouble you with another question. When I build the opensbi firmware for Rocket with FW_TEXT_START=0x8000_0000, it gets stuck at Liftoff. With FW_TEXT_START=0x0 (the default value), it works, and the SBI banner prints "Firmware Base : 0x80000000", which means scratch->fw_start = 0x80000000. In fw_base.S, _scratch_init stores _fw_start to scratch->fw_start:

lla a4, _fw_start
sub a5, t3, a4
REG_S   a4, SBI_SCRATCH_FW_START_OFFSET(tp)
REG_S   a5, SBI_SCRATCH_FW_SIZE_OFFSET(tp)

However, in fw_base.ldS, _fw_start should equal FW_TEXT_START=0x0:

. = FW_TEXT_START;
/* Don't add any section between FW_TEXT_START and _fw_start */
PROVIDE(_fw_start = .);

In fw.md, the definition of FW_TEXT_START is "the compile time address of the OpenSBI firmware". What is the difference between that and the executing address? That's not all: FW_JUMP_ADDR should be 0x80200000. If _fw_start=0x0, it cannot be established.

fw_next_arg1:
#ifdef FW_JUMP_FDT_ADDR
    li  a0, FW_JUMP_FDT_ADDR
#elif defined(FW_JUMP_FDT_OFFSET)
    lla a0, _fw_start
    li  a1, FW_JUMP_FDT_OFFSET
    add a0, a0, a1
#else
    add a0, a1, zero
#endif

So, is there something wrong with my understanding? Maybe _fw_start doesn't equal FW_TEXT_START? I would appreciate it if you could answer the above questions, thank you.

I'm not an expert on opensbi internals. If you look at how it's built in the README, you'll see the only change (override) of the generic platform's defaults is FW_JUMP_FDT_ADDR=0x82400000, which tells opensbi to place its own re-processed version of the DTB at that address, and thus avoids overwriting one of the other blobs (initrd or kernel image), which are loaded according to boot.json.

For the simple busybox scenario, boot.json looks like this:

{ "initrd_bb": "0x82000000", "Image": "0x80200000", "fw_jump.bin": "0x80000000" }

That means initrd can go from 0x82000000 all the way to 0x82400000 (0x400000 or 4194304 bytes, which is OK given the one I use is 826559 bytes).

When booting fedora, boot.json looks like this:

{ "initramfs.img": "0x83000000", "Image": "0x80200000", "fw_jump.bin": "0x80000000" }

That means the kernel (much larger in Fedora than the custom one I build for the busybox scenario) can go from 0x80200000 through 0x82400000, i.e. have a size of up to 0x2200000 or 35651584 bytes.

In either case, opensbi (with default-everything except for FW_JUMP_FDT_ADDR) is expected to be loaded into RAM at 0x80000000, and the litex bios should start executing it at that same (0x80000000) address. Based on the original DTB that we build into opensbi during compilation (and which it will process and clone to FW_JUMP_FDT_ADDR before pointing the kernel at it), opensbi will know where the kernel and initrd are in ram, and ensure they start properly.
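The size limits quoted above follow from subtracting neighbouring load addresses. Here is a short sketch of that arithmetic, using the two boot.json layouts and the FW_JUMP_FDT_ADDR value from this comment; the function names and the "dtb copy" label are made up for illustration.

```python
import json

FW_JUMP_FDT_ADDR = 0x82400000  # where opensbi places its processed DTB copy


def load_layout(boot_json_text, fdt_addr=FW_JUMP_FDT_ADDR):
    """Parse boot.json and add the DTB copy as one more occupied address."""
    layout = {name: int(addr, 16)
              for name, addr in json.loads(boot_json_text).items()}
    layout["dtb copy (FW_JUMP_FDT_ADDR)"] = fdt_addr
    return layout


def room_between(layout):
    """Bytes available to each blob before the next higher load address."""
    ordered = sorted(layout.items(), key=lambda item: item[1])
    return {name: next_start - start
            for (name, start), (_, next_start) in zip(ordered, ordered[1:])}


BUSYBOX = ('{ "initrd_bb": "0x82000000", "Image": "0x80200000", '
           '"fw_jump.bin": "0x80000000" }')
FEDORA = ('{ "initramfs.img": "0x83000000", "Image": "0x80200000", '
          '"fw_jump.bin": "0x80000000" }')

for label, text in (("busybox", BUSYBOX), ("fedora", FEDORA)):
    print(label)
    for name, room in room_between(load_layout(text)).items():
        print(f"  {name}: up to {room:#x} ({room} bytes)")
```

Comparing each value against the actual file size (for example with `os.path.getsize`) would flag a blob that has grown into its neighbour; the blob loaded at the highest address has no upper bound in this model because the RAM size is not part of boot.json.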

If you want to know more than the above about opensbi, I suggest emailing their list or opening an issue on their github -- the above is most of what I know, and really all I needed to know about opensbi... :)