LekKit / RVVM

The RISC-V Virtual Machine
GNU General Public License v3.0

Haiku OS guest support #61

Open LekKit opened 1 year ago

LekKit commented 1 year ago

Milestones, progress

LekKit commented 1 year ago

@X547 As far as I can see, NVMe IRQ loss doesn't happen when running your Haiku build with ATA & haiku_loader.riscv. It doesn't time out or anything, ATA & NVMe coexist happily, both drive partitions are enumerated, there are no complaints about NVMe polling, etc. Perhaps there are some non-upstream changes that fix it. There is a small issue with haiku_loader.riscv however. If ATA isn't the first device on the PCI bus, it crashes, i.e. NVMe cannot be attached before ATA right now.

Another interesting thing, EFI framebuffer seems to work properly now with nightly images & new U-Boot. I don't know what fixed it, I tried older RVVM commits but it still works... Maybe it was just a fluke or some local issue, huh.

Would be happy if you can verify these.

u-boot.bin.zip


X547 commented 1 year ago

If ATA isn't the first device on the PCI bus, it crashes, i.e. NVMe cannot be attached before ATA right now.

The ATA MMIO address is currently hardcoded in both the boot loader and the kernel. The kernel ATA driver needs refactoring because it currently assumes that register addresses are 16-bit.

LekKit commented 1 year ago

If ATA isn't the first device on the PCI bus, it crashes, i.e. NVMe cannot be attached before ATA right now.

The ATA MMIO address is currently hardcoded in both the boot loader and the kernel. The kernel ATA driver needs refactoring because it currently assumes that register addresses are 16-bit.

I hope we can just ignore all of this and get NVMe running instead. Where can I find haiku_loader.riscv sources? I could try writing a simple NVMe driver, why not (Fine deal imo, since you're working on I2C HID).

X547 commented 1 year ago

Where can I find haiku_loader.riscv sources? I could try writing a simple NVMe driver, why not (Fine deal imo, since you're working on I2C HID).

It is here: https://github.com/haiku/haiku/blob/master/src/system/boot/platform/riscv/devices.cpp#L33. Some NvmeBlockDevice needs to be added there. I have unpublished boot loader PCI bus code.

X547 commented 1 year ago

My Haiku RVVM branch: https://github.com/X547/haiku/tree/rvvm2.

WIP NVMe boot loader driver: https://github.com/X547/haiku/blob/e717045595ebbd71a30731bc57c96a5d1a68ef52/src/system/boot/platform/riscv/NvmeBlockDevice.cpp.

X547 commented 1 year ago

ATA MMIO address hardcode:

LekKit commented 1 year ago

I'm not sure I properly understand how to build Haiku

../../buildtools/jam/jam0 -j16 -q @minimum-mmc
Asked for riscv target boot platform 
Unknown path to handle adding to image 
don't know how to make @minimal-mmc
...patience...
...found 1 target(s)...
...can't find 1 target(s)...

X547 commented 1 year ago

I'm not sure I properly understand how to build Haiku

You need to configure the build first. It will build GCC for the riscv64 target. This assumes that the current directory contains haiku and buildtools.

mkdir -p generated.riscv64
cd generated.riscv64
../configure -j4 --build-cross-tools riscv64 --cross-tools-source ../../buildtools --distro-compatibility official

X547 commented 1 year ago

../../buildtools/jam/jam0 -j16 -q @minimum-mmc
don't know how to make @minimal-mmc

Misspelled? The correct target is @minimum-mmc.

LekKit commented 1 year ago

You need to configure the build first

I did that already using this guide https://www.haiku-os.org/guides/building/compiling-riscv64

Misspelled? The correct target is @minimum-mmc.

I tried many, none worked (With the same error)

LekKit commented 1 year ago

Improved upstream ATA in 43aeba3, at least it's no longer a security hellhole (3 CWEs fixed, lol). Merged your API changes so you no longer need to hack on it each time.

Is it worth adding some kind of -ata option for those drives in upstream?

X547 commented 1 year ago

I tried many, none worked (With the same error)

https://www.haiku-os.org/guides/building/pre-reqs

<jam-install-command>

To install jam you can use one of two commands: The first requires administrative privilege, as jam will be installed to ‘/usr/local/bin/’

    sudo ./jam0 install
    ./jam0 -sBINDIR=$HOME/bin install

X547 commented 1 year ago

Is it worth adding some kind of -ata option for those drives in upstream?

Ideally it would be nice to have an option to specify the drive type for each image independently.

LekKit commented 1 year ago

Is it worth adding some kind of -ata option for those drives in upstream?

Ideally it would be nice to have an option to specify the drive type for each image independently.

Yes, it's just a convention that -i/-image means "Just give me any kind of storage that is preferred". ATA is kind of deprecated because I see little use for it in context of a RISC-V system (Outside of Haiku bootloader, and even this is temporary), and because it isn't maintained well. It's not like I'm against this device, but no one is gonna implement missing features / non-critical fixes for it any more. I only ran a bit of fuzzing/coverage because I don't want to put my users under security risk for using it, and because someone had to do it.

I have no plans for more storage devices currently. That's why I don't know what I should do with the CLI, really.

X547 commented 1 year ago

Did you solve the Haiku build problem? What Haiku source version are you using? What happens if you run jam @minimum-raw kernel?

LekKit commented 1 year ago

Did you solve the Haiku build problem? What Haiku source version are you using? What happens if you run jam @minimum-raw kernel?

Using your Haiku fork, rvvm2 branch. Figured out the jam issue, thanks. There are some compilation errors though

../src/system/boot/platform/efi/arch/riscv64/arch_traps.cpp: In function 'void WriteSstatus(uint64_t)':
../src/system/boot/platform/efi/arch/riscv64/arch_traps.cpp:50:30: error: no matching function for call to 'SstatusReg::SstatusReg(uint64_t&)'
   50 |         SstatusReg status(val);
      |                              ^

X547 commented 1 year ago

Figured out the jam issue, thanks. There are some compilation errors though

Fixed, source updated.

X547 commented 1 year ago

Functional configuration:

LekKit commented 1 year ago

Functional configuration:

Hurray, I now can at least try it since we have working i2c hid and stuff... Feels great (Tho perf could be better, I'm currently losing to QEMU here perhaps. Haiku uses floats a lot, right?)

Will proceed to NVMe bootloader driver

LekKit commented 1 year ago

Hmm, sometimes I2C HID deadlocks apparently. Pretty rare to spot but I've seen these 2 times already.

WARN: Possible deadlock at src/devices/hid-mouse.c@97
WARN: The lock was previously held at src/devices/hid-mouse.c@213
WARN: Version: RVVM v0.5-8e8f200-git
WARN: Attempting to recover execution...
 * * * * * * *

WARN: Possible deadlock at src/devices/i2c-hid.c@155
WARN: The lock was previously held at src/devices/i2c-hid.c@318
WARN: Version: RVVM v0.5-8e8f200-git
WARN: Attempting to recover execution...
 * * * * * * *

LekKit commented 1 year ago

Verify other utility devices (RTC, syscon) work. Are goldfish or DS1742 RTCs supported in Haiku?

Syscon doesn't seem to work. Powering off the system from the guest leaves me with some win2000-vibe message "It's now safe to turn off the computer" and the machine never actually powers down.

Should be trivial to implement, syscon is just a single mmio register with specific values for poweroff/reset. This is also used in QEMU and on SiFive boards AFAIK.

X547 commented 1 year ago

Haiku currently supports shutdown and RTC via HTIF commands. The RTC HTIF interface is my extension and it works only in my TinyEMU fork.

LekKit commented 1 year ago

Haiku currently supports shutdown and RTC via HTIF commands. The RTC HTIF interface is my extension and it works only in my TinyEMU fork.

I can implement that as well probably?

X547 commented 1 year ago

I can implement that as well probably?

I think that it is better to implement more standard interfaces.

HTIF commands currently used by Haiku:

Executing HTIF command:

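// host-target interface: two 64-bit mailboxes (tohost, fromhost),
// each exposed as a pair of 32-bit registers
struct HtifRegs
{
    uint32 toHostLo;
    uint32 toHostHi;
    uint32 fromHostLo;
    uint32 fromHostHi;
};

// Send one command word to the host, then read back the fromhost mailbox
uint64
HtifCmd(uint32 device, uint8 cmd, uint32 arg)
{
    // No HTIF device was found, nothing to talk to
    if (gHtifRegs == 0)
        return 0;

    // Command word: device in bits 63:56, cmd in bits 55:48, arg in the low bits
    uint64 htifTohost = ((uint64)device << 56)
        + ((uint64)cmd << 48) + arg;
    // The word is written as two 32-bit halves, low half first
    gHtifRegs->toHostLo = htifTohost % ((uint64)1 << 32);
    gHtifRegs->toHostHi = htifTohost / ((uint64)1 << 32);
    return (uint64)gHtifRegs->fromHostLo
        + ((uint64)gHtifRegs->fromHostHi << 32);
}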

FDT compatible value: ucb,htif0.
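
For illustration, the command word built by HtifCmd() above can be reproduced in a standalone form. This is only a sketch of the bit layout visible in that snippet; PackHtif, SplitHtif and HtifWord are hypothetical names, and the device/cmd values in any usage are arbitrary rather than real HTIF command numbers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the two 32-bit halves that get written to
// toHostLo / toHostHi in the snippet above.
struct HtifWord {
    uint32_t lo;
    uint32_t hi;
};

// Build the 64-bit command word: device in bits 63:56, cmd in bits 55:48,
// arg in the low 32 bits. The fields do not overlap, so OR and + are equivalent.
constexpr uint64_t PackHtif(uint8_t device, uint8_t cmd, uint32_t arg)
{
    return (uint64_t(device) << 56) | (uint64_t(cmd) << 48) | uint64_t(arg);
}

// Split the word the same way the snippet does with % and / by 2^32.
constexpr HtifWord SplitHtif(uint64_t word)
{
    return HtifWord{uint32_t(word & 0xffffffffu), uint32_t(word >> 32)};
}
```

With device 1, cmd 1 and arg 0x41 the packed word is 0x0101000000000041, so the low half carries only the argument and the high half carries device and command.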

LekKit commented 1 year ago

I think that it is better to implement more standard interfaces.

Yeah, syscon and goldfish rtc were implemented just because they match the basic QEMU machine. I would prefer emulating hardware from real RV boards (current PLIC/CLINT/UART/I2C-OC/NVMe fall into this category well) or just some generic common hardware (simple-fb perhaps counts? My RPi uses this driver as well). As I see it, HTIF is a basic interface for debugging FPGA boards, right? That's great if it's part of an official spec, I've just never seen it for some reason.

LekKit commented 1 year ago

Haiku dd reports incorrect transfer speed (0.0 or -0.0!), crashes the kernel. No meaningful backtrace, perhaps I should build Haiku with -fno-omit-frame-pointer. As for "does it happen outside RVVM" I dunno, should be checked soon. Values -0.0 suspiciously look like some FPU-related trouble. Don't mind the terrible read speeds, my host laptop HDD is really that slow

vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x303231353237313e, ip 0x3eabcd2226, write 0, user 1, exec 0, thread 0x1a8
PANIC: thread_hit_serious_debug_event
Welcome to Kernel Debugging Land...
Thread 424 "dd" running on CPU 0
Stack:
FP: 0x0
kdebug> bt
Stack:
FP: 0xffffffc006664870
FP: 0xffffffc005c58298, PC: 0xffffffc0000ca8a6 <kernel_riscv64> invoke_debugger_command.localalias + 170
FP: 0x0, PC: 0x100000040 0x100000040
kdebug> 


X547 commented 1 year ago

PANIC: thread_hit_serious_debug_event

This is a userland process crash. It can be continued with the es command. This kernel panic usually doesn't happen. It was temporarily added here (https://github.com/X547/haiku/blob/dc7f6642ec87bfaa4d9731a91e57fdbf0c0e7cc0/src/system/kernel/debug/user_debugger.cpp#L816) for debugging purposes.

dd seems to be miscompiled. The crash happens on all RISC-V platforms (all supported emulators and real hardware).

LekKit commented 1 year ago

dd seems to be miscompiled. The crash happens on all RISC-V platforms (all supported emulators and real hardware).

Hmm, alright then. The 0.0/-0.0 values too?

X547 commented 1 year ago

The 0.0/-0.0 values too?

It is probably related to the cause of the crash. The dd source code needs checking (it is part of GNU coreutils).

LekKit commented 1 year ago

Interesting note: Running without RVJIT doesn't affect guest Haiku responsiveness / CPU consumption much. I'm pretty surprised, because it feels super smooth even though RVVM interpreter should emulate CPU with speeds about 300-500 MHz on my host.

Am I right that Haiku uses floats extensively? Currently any FPU code forces RVVM to cross-jump between JIT/interpreter, which is not ideal. Perhaps with further FPU JIT support this could be significantly faster.

X547 commented 1 year ago

Am I right that Haiku uses floats extensively?

Yes, the GUI API uses the float type for coordinates. Also, the kernel currently does aggressive TLB updates.

LekKit commented 1 year ago

Yes, the GUI API uses the float type for coordinates. Also, the kernel currently does aggressive TLB updates.

Aggressive TLB updates could also degrade JIT perf because the JIT has to exit to the interpreter to refill the TLB; ASID support should improve that. I'm just collecting clues for further perf improvements because there are never enough directions to look at. Thanks)

X547 commented 1 year ago

Aggressive TLB updates could also degrade JIT perf because the JIT has to exit to the interpreter to refill the TLB; ASID support should improve that.

It was originally done to handle potential hardware bugs and strange memory corruption. Maybe with RVVM the TLB updates can be less aggressive (does RVVM support page range updates? ASID?).

LekKit commented 1 year ago

does RVVM support page range updates?

Yes. ASID - not yet, but planned for v0.6 or sth around that.
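
To illustrate why ASID support matters here: when each cached translation is tagged with an address-space ID, switching address spaces does not have to throw away every entry, only an explicit per-ASID flush does. The toy software TLB below is a hypothetical sketch (AsidTlb, TlbEntry and the direct-mapped layout are invented for illustration) and says nothing about how RVVM actually organizes its TLB.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical TLB entry: a translation tagged with its address-space ID.
struct TlbEntry {
    uint64_t vpn = 0;   // virtual page number
    uint64_t ppn = 0;   // physical page number
    uint16_t asid = 0;  // address space that owns this translation
    bool valid = false;
};

// Toy direct-mapped software TLB, indexed by a mix of VPN and ASID.
class AsidTlb {
public:
    void Insert(uint16_t asid, uint64_t vpn, uint64_t ppn)
    {
        TlbEntry& e = fEntries[Index(asid, vpn)];
        e.vpn = vpn;
        e.ppn = ppn;
        e.asid = asid;
        e.valid = true;
    }

    // Hit only if both the page number and the ASID tag match.
    bool Lookup(uint16_t asid, uint64_t vpn, uint64_t& ppn) const
    {
        const TlbEntry& e = fEntries[Index(asid, vpn)];
        if (!e.valid || e.vpn != vpn || e.asid != asid)
            return false;
        ppn = e.ppn;
        return true;
    }

    // Drop translations of one address space; others stay cached.
    void FlushAsid(uint16_t asid)
    {
        for (TlbEntry& e : fEntries) {
            if (e.asid == asid)
                e.valid = false;
        }
    }

private:
    static constexpr std::size_t kSize = 256;

    static std::size_t Index(uint16_t asid, uint64_t vpn)
    {
        return std::size_t((vpn ^ asid) % kSize);
    }

    std::array<TlbEntry, kSize> fEntries{};
};
```

Without the ASID tag, the same structure would need a full flush on every address-space switch, which is the refill cost discussed above.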

X547 commented 1 year ago

Do you have some tools to profile RVVM bottlenecks when running Haiku (JIT->interpreter switches, TLB updates etc.)?

LekKit commented 1 year ago

Do you have some tools to profile RVVM bottlenecks when running Haiku (JIT->interpreter switches, TLB updates etc.)?

I usually use the Hotspot profiler, which is Linux-only. It uses the perf kernel subsystem to profile with very little overhead, then I just look at the call graph / hot places, etc. (i.e. percentage in JITed code, percentage in interpreter/MMU/devices). No idea if there is anything similar for Haiku

X547 commented 1 year ago

No idea if there is anything similar for Haiku

I mean running the profiler on a Linux host with a Haiku guest.

LekKit commented 1 year ago

No idea if there is anything similar for Haiku

I mean running the profiler on a Linux host with a Haiku guest.

I assumed your primary workstation is running Haiku since you appear to be using it very actively. If you can use Linux for this then no problem. Be aware that Haiku optimizations should in no way be tied to specific RVVM inner details, just use common sense of "less is better", especially with sfence.vma/fence.i. I'm not sure that some RVVM implementation details are even meaningful to optimize for. There are also cases of disproportionate penalties, i.e. fence.i was recently catastrophic for perf and it is still not perfect, MMIO is very slow compared to real hardware, etc.

Things that should be improved on RVVM side instead:

X547 commented 1 year ago

Be aware that Haiku optimizations should in no way be tied to specific RVVM inner details, just use common sense of "less is better", especially with sfence.vma/fence.i.

I do not plan to do any sophisticated optimizations now. The main Haiku RISC-V tasks for now are fixing bugs (like random EFI boot loader crashes, which also appear on real hardware), improving correctness, and adding more hardware support. It is not as mature as x86 for now.

LekKit commented 1 year ago

How do I enable building NVMe driver for haiku_loader.riscv?

X547 commented 1 year ago

How do I enable building NVMe driver for haiku_loader.riscv?

Add it to the Jamfile in the same directory as the *.cpp file.

X547 commented 1 year ago

https://github.com/X547/haiku/commit/fc63ba2491826454999e15f3a1286dd00580dbf5

Added a Goldfish RTC driver and an experimental kernel API for dynamically installable RTC drivers (also for interrupt controllers).

The original Haiku design is x86-centered, and some core IBM PC device drivers are statically linked into the kernel and have no dynamic detection mechanism (arch_ functions).


LekKit commented 1 year ago

Possibly helpful: Improved CMake support in a2ddd5c to match any warnings & compilation options with Make, also tracks git commit with version

X547 commented 1 year ago

Should shutdown and reboot be 2 separate devices?

LekKit commented 1 year ago

Should shutdown and reboot be 2 separate devices?

There is a single device which is weirdly serialized in FDT. It's a single MMIO register with 2 possible values to be written for reset/poweroff, and the possible values are using separate FDT nodes.

See https://github.com/LekKit/RVVM/blob/0d0b23128c9fee78f2b3cf1273f1303adc8cdf7a/src/devices/syscon.c#L69 https://github.com/LekKit/RVVM/blob/0d0b23128c9fee78f2b3cf1273f1303adc8cdf7a/src/devices/syscon.c#L26

Overall syscon is supported almost everywhere (OpenSBI, U-Boot, Linux, BSDs) since it's pretty simple and somewhat standard for QEMU and some physical boards.
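
A minimal sketch of the "single MMIO register, two magic values" idea, seen from both the device and the guest side. The values 0x5555 (poweroff) and 0x7777 (reset) are the ones QEMU's virt machine advertises in its syscon-poweroff and syscon-reboot FDT nodes; they are treated as assumptions here, and a real guest should read the register address, offset and value from the FDT rather than hardcode them. All names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed magic values (as advertised by QEMU virt in the FDT).
constexpr uint32_t kSysconPoweroff = 0x5555;
constexpr uint32_t kSysconReset = 0x7777;

enum class PowerAction { None, Poweroff, Reset };

// Device side: decide what a 32-bit write to the syscon register means.
constexpr PowerAction DecodeSysconWrite(uint32_t value)
{
    return value == kSysconPoweroff ? PowerAction::Poweroff
        : value == kSysconReset ? PowerAction::Reset
        : PowerAction::None;
}

// Guest side: one store to the register triggers the action. In a real
// driver, reg would be the mapped MMIO address taken from the FDT node.
inline void SysconWrite(volatile uint32_t* reg, uint32_t value)
{
    *reg = value;
}
```

Because both actions go through the same register, a guest driver only needs the register address plus one value per FDT node, which is why the two nodes do not imply two devices.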

LekKit commented 1 year ago

Got NVMe driver to work with haiku_loader.riscv, although it's crashing the kernel later on. Seems like the NvmeBlockDevice destructor is never called, but I need to properly reset NVMe before handing it off to the kernel (why the kernel doesn't do that, I dunno).
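
For reference, resetting an NVMe controller before hand-off generally means clearing CC.EN and waiting for CSTS.RDY to drop to 0. The sketch below shows that sequence under assumptions: NvmeCtrlRegs and NvmeDisable are hypothetical names (not the NvmeRegs struct from the branch), only the first few registers are modelled, and the bounded poll count stands in for the timeout a real driver would derive from CAP.TO.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// First part of the NVMe controller register block (offsets per the NVMe spec).
struct NvmeCtrlRegs {
    uint64_t cap;       // 0x00 Controller Capabilities
    uint32_t vs;        // 0x08 Version
    uint32_t intms;     // 0x0c Interrupt Mask Set
    uint32_t intmc;     // 0x10 Interrupt Mask Clear
    uint32_t cc;        // 0x14 Controller Configuration
    uint32_t reserved;  // 0x18
    uint32_t csts;      // 0x1c Controller Status
};
static_assert(offsetof(NvmeCtrlRegs, cc) == 0x14, "CC must be at 0x14");
static_assert(offsetof(NvmeCtrlRegs, csts) == 0x1c, "CSTS must be at 0x1c");

constexpr uint32_t kCcEnable = 1u << 0;   // CC.EN
constexpr uint32_t kCstsReady = 1u << 0;  // CSTS.RDY

// Clear CC.EN and poll until the controller reports not-ready.
// Returns false if CSTS.RDY is still set after maxPolls reads.
inline bool NvmeDisable(volatile NvmeCtrlRegs* regs, unsigned maxPolls)
{
    regs->cc = regs->cc & ~kCcEnable;
    for (unsigned i = 0; i < maxPolls; i++) {
        if ((regs->csts & kCstsReady) == 0)
            return true;
    }
    return false;
}
```

Calling something like this right before jumping to the kernel would avoid relying on the destructor being run, assuming the kernel driver then re-initializes the controller from scratch.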

X547 commented 1 year ago

Got NVMe driver to work with haiku_loader.riscv, although it's crashing the kernel later on.

Can you publish your code somewhere (GitHub branch etc.)?

LekKit commented 1 year ago

Got NVMe driver to work with haiku_loader.riscv, although it's crashing the kernel later on.

Can you publish your code somewhere (GitHub branch etc.)?

See https://github.com/LekKit/haiku/commit/047a40a703224e679454601defbf5386c77f2a68
Beware that this isn't very clean and most likely violates some Haiku code style guidelines. I could also have missed something, which may be why the kernel crashes.

LekKit commented 1 year ago

I should also test this in QEMU because I've no guarantees that RVVM NVMe emulation is perfect. Will report afterwards. Temporary upd: Blows up in QEMU weirdly. Perhaps the device address is wrong.

NvmeBlockDevice::Init()
  fRegs->cap1: 0xffffffff
  fRegs->cap2: 0xffffffff
  fRegs->version: 0xffffffff
  fRegs->adminSubmQueue: 0xffffffffffffffff
  fRegs->adminComplQueue: 0xffffffffffffffff
  fRegs->adminQueueAttrs: 65535, 65535

*death*

X547 commented 1 year ago

Temporary upd: Blows up in QEMU weirdly. Perhaps the device address is wrong.

fRegs = (volatile NvmeRegs*)(0x40000000);

QEMU also does not allocate PCI MMIO ranges, so the boot loader PCI initialization code should be used (https://github.com/LekKit/haiku/blob/rvvm2/src/system/boot/platform/riscv/pci.cpp).
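
As a rough sketch of what replacing the hardcoded 0x40000000 involves: the register base comes from the device's memory BAR in PCI config space, and an unassigned BAR explains the all-ones reads in the log above. The helper below only decodes a BAR from two raw config dwords; MemBar and DecodeMemBar are hypothetical names, reading config space and assigning addresses are left to the boot loader PCI code, and treating a zero base as "unassigned" is a simplifying assumption.

```cpp
#include <cassert>
#include <cstdint>

struct MemBar {
    uint64_t base;  // MMIO base address with the flag bits masked off
    bool is64;      // BAR type field says 64-bit (uses two consecutive BARs)
    bool assigned;  // heuristic: firmware or boot loader programmed a non-zero base
};

// low/high are the raw dwords read from BAR n and BAR n+1 in config space.
inline MemBar DecodeMemBar(uint32_t low, uint32_t high)
{
    assert((low & 0x1) == 0 && "I/O BARs are not handled in this sketch");
    MemBar bar;
    bar.is64 = ((low >> 1) & 0x3) == 0x2;  // bits 2:1 == 10b means 64-bit
    bar.base = low & ~uint32_t(0xf);       // bits 3:0 are flags, not address
    if (bar.is64)
        bar.base |= uint64_t(high) << 32;
    bar.assigned = bar.base != 0;
    return bar;
}
```

A loader could then refuse to touch the controller when the BAR is not assigned, instead of dereferencing a fixed address that happens to work on only one machine layout.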