agra-uni-bremen / riscv-vp

RISC-V Virtual Prototype
MIT License
139 stars 49 forks source link

Broken printf() for 'float' and 'double' types #14

Closed CA92697 closed 2 years ago

CA92697 commented 3 years ago

It appears that printf() is broken for 'float' and 'double' types. Printing the value 1.2 gives me 2.000000 or 0.000000, and 3.1415 gives me -2.000000 and -610.304199, for 'float' and 'double' types, respectively. Interestingly 'long double' appears to work fine.

I'm attaching a simple example derived from the provided printf/main.c eample: basic-float.tar.gz

Here, 'make sim-printf' will print the following: printf(%d, 12) 12 printf(%f, 1.2f) 2.000000 printf(%f, 1.2) 0.000000 printf(%Lf, 1.2) 1.200000 printf(%f, f) 2.000000 printf(%f, d) 0.000000 printf(%Lf, ld) 1.200000 printf(%f, f) -2.000000 printf(%f, d) -610.304199 printf(%Lf, ld) 3.141500

whereas it should print this: printf(%d, 12) 12 printf(%f, 1.2f) 1.200000 printf(%f, 1.2) 1.200000 printf(%Lf, 1.2) 1.200000 printf(%f, f) 1.200000 printf(%f, d) 1.200000 printf(%Lf, ld) 1.200000 printf(%f, f) 3.141500 printf(%f, d) 3.141500 printf(%Lf, ld) 3.141500

The latter is seen when compiled natively by 'make reference'.

The authors are aware of this issue, but I post it here openly for discussion and/or resolution. Thanks!

Cirromulus commented 3 years ago

It seems that the libc library uses syscalls for floats only. It should work with the additional parameter --intercept-syscalls. I'll keep this open for now, until someone gets an explanation fot that behavior.

CA92697 commented 3 years ago

I'm getting the very same output with and without the --intercept-syscalls option. I'm not sure if syscalls have any influence on this. It was my understanding that printf works entirely in user-space until the complete string is composed which then is output to stdout via a final syscall. I will investigate this further as soon as time permits. Thanks!

CA92697 commented 3 years ago

OK, we made 2 observations that together triggered an idea that seems to be the actual problem. Observation 1 is the post above about the related issue #15 . Observation 2 is that in my other application (which I can't share here) 'float' types behave totally fine, their arithmetic gives correct results, function calls with them work fine, etc. The only exception is that printing them doesn't work (as demonstrated above).

Now, we all know that printf is actually a special function in the sense that it handles variable length argument lists. I bet that's what's broken here!

I'm attaching an example that demonstrates the problem clearly: variable_arguments.tar.gz

When executed on my RV32, it results in the following output: Char:1 Average of 2, 3, 4, 5 = 3 Char:1 Average of 5, 10, 15 = 10 Shrt:2 Average of 2, 3, 4, 5 = 3 Shrt:2 Average of 5, 10, 15 = 10 Int: 4 Average of 2, 3, 4, 5 = 3 Int: 4 Average of 5, 10, 15 = 10 Long:4 Average of 2, 3, 4, 5 = 3 Long:4 Average of 5, 10, 15 = 10 LL: 8 Average of 2, 3, 4, 5 = -2147439964 LL: 8 Average of 5, 10, 15 = 58245 F: 4 Average of 2, 3, 4, 5 = 0 F: 4 Average of 5, 10, 15 = 0 D: 8 Average of 2, 3, 4, 5 = 0 D: 8 Average of 5, 10, 15 = 0 LD: 16 Average of 2, 3, 4, 5 = 3 LD: 16 Average of 5, 10, 15 = 10

The expectation here is that we always get 3 and 10 as the results of the truncated average value. Clearly, this works for most types, but not for 'long long', 'float', and 'double'!

I haven't debugged this any further, but it's clear to me that the functionality of stdarg.h is broken (which in turn breaks printf and friends). I highly suspect that this has to do with the stack alignment where GCC, the libgcc, and/or RISC-V VP have a disagreement.

I'll leave this for the RISC-V VP experts to look into... :-) Thanks!!

DamienLeBars commented 2 years ago

Hello, i had quite the same problem. I think the stdarg is maybe based on the specification that the sp register is 16bytes aligned and not 4 bytes aligned like what is done here: https://github.com/agra-uni-bremen/riscv-vp/blob/9d6b9571d57efedce452005dab7cb4f48f42d69a/vp/src/platform/basic/main.cpp#L163

Cirromulus commented 2 years ago

Thats a good comment, we will investigate that.

ucirvinelecs commented 2 years ago

Yes, excellent observation and suggestion! I can confirm that both my test examples work fine when we change that line, as follows:

core.init(instr_mem_if, data_mem_if, &clint, entry_point, opt.mem_end_addr & 0xfffffff0);

Properly aligning the stack to the needs of the hardware makes a lot of sense. Whether or not 16 is the right number, I honestly don't know. This should be easy to look-up for the specific RISC-V architecture.

I also don't know whether or not the stack pointer should should be initialized to the last word inside of the allocated stack space (which is done in both cases above), or it can actually be initialized to the address just above the end (mem_end_addr = mem_start_addr + mem_size). The latter would work just fine if the stack pointer is decremented before storing anything on the stack, which might or might not be the calling convention here.

On a 2nd look, the XXX_end_addr values in platform/basic/main.cpp appear inconsistent in that aspect. Some values point to the last byte inside a range, some point to just beyond the range. The consistency could be improved here (and probably can prevent similar alignment problems as well).

Either way, many thanks to DamienLeBars for finding this fix!

Cirromulus commented 2 years ago

Ok, I looked into it. The RISCV non-isa specification is unclear about the alignedness of the stack, but defaults to 32 bit in most RV32 variants (https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-elf.adoc).

It "works" when aligning the stack to 64 bit. This is probably more due to our syscall handler, as most of our host-processors will be 64 bit aligned, and we do not do any relocations if not aligned properly. It was never designed to be used in "production", as this emulates an OS where there is none. No (real) processor would handle syscalls in hardware like this. It is just an easy start to test things.

That being said, I changed it to 64 bit in the "basic" VP, as it is the main starting point for newcomers and 64 bit alignedness is a subset of 32 bit. (see 7362196e618ca2554571d5e7df3b669b5a0e45bf).