enjoy-digital / litex

Build your hardware, easily!

Unexpected behaviour of printf #1879

Open JamesTimothyMeech opened 7 months ago

JamesTimothyMeech commented 7 months ago

I generated a LiteX SoC and deployed it to a Digilent Arty using this command:

python3 -m litex_boards.targets.digilent_arty --bios-format float  --cpu-type femtorv --cpu-variant gracilis --variant a7-100 --toolchain vivado --with-spi-sdcard --sdcard-adapter digilent --timer-uptime --build --load

I then used LiteOS to compile a simple C program that prints a float: https://github.com/BrunoLevy/learn-fpga/tree/master/LiteX/software/LiteOS

The program compiles and runs, but it prints an incorrect value:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
   float f = 1.5;
   printf("%f\n",f);
   return 0;
}
liteOS> run float.elf
8589869000.000000

Does anyone have any advice about the best way to debug this?

alanvgreen commented 7 months ago

Hi James,

Welcome to the wonderful world of custom CPUs and FPGAs. As you work through bugs like this, you'll learn a LOT about how programs work.

In this particular case, I'd try several things:

Also, your program doesn't have any BSS storage. Try defining

I've seen cases where crt0.s has a bug if one of BSS or Data is empty.

alanvgreen commented 7 months ago

I should add - using https://www.h-schmidt.net/FloatConverter/IEEE754.html you can see that the hex representation of 8589869000.000000 is 0x4fffff80, while 1.5 is 0x3fc00000.
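The same conversion can be checked without the web tool. Here is a minimal C sketch, assuming a machine where float is IEEE 754 binary32. The helper names float_to_bits and bits_to_float are made up for illustration and are not part of LiteX or LiteOS:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw IEEE 754 bit pattern of a float.
   memcpy is used instead of a pointer cast to avoid aliasing problems. */
static uint32_t float_to_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical helper: reinterpret a 32-bit pattern as a float. */
static float bits_to_float(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

float_to_bits(1.5f) gives 0x3fc00000, and bits_to_float(0x4fffff80) gives 8589869056.0, which agrees with the printed 8589869000.000000 in its leading digits.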

JamesTimothyMeech commented 7 months ago

Thanks for the tips! I'll document my debugging attempts here. I tried a different CPU and recompiled LiteOS and my program:

python3 -m litex_boards.targets.digilent_arty --bios-format float --cpu-type vexriscv --variant a7-100 --toolchain vivado --with-spi-sdcard --sdcard-adapter digilent --timer-uptime --build --load

I got the same result:

liteOS> run float.elf
8589869000.000000

I'll try the other suggestions now!

JamesTimothyMeech commented 7 months ago

Adding the global variables you mentioned doesn't seem to help, but maybe my crt0.S is missing something important!

Adding int bss_var;

#include <stdio.h>
#include <stdlib.h>

int bss_var;

int main(int argc, char** argv) {
   float f = 1.5;
   printf("%f\n",f);
   return 0;
}

Adding int data_var = 5;

#include <stdio.h>
#include <stdlib.h>

int data_var = 5;

int main(int argc, char** argv) {
   float f = 1.5;
   printf("%f\n",f);
   return 0;
}

Adding both

#include <stdio.h>
#include <stdlib.h>

int bss_var;
int data_var = 5;

int main(int argc, char** argv) {
   float f = 1.5;
   printf("%f\n",f);
   return 0;
}
liteOS> run float_bss.elf
8589869000.000000
liteOS> run float_data.elf
8589869000.000000
liteOS> run float_bss_data.elf
8589869000.000000

JamesTimothyMeech commented 7 months ago

> I should add - using https://www.h-schmidt.net/FloatConverter/IEEE754.html you can see that the hex representation of 8589869000.000000 is 0x4fffff80, while 1.5 is 0x3fc00000.

When I run this program:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
   float f = 1.5;
   // View the same 32 bits as either a float or an unsigned integer.
   union {
      float f;
      uint32_t u;
   } f2u = { .f = f };

   //printf ("float : %f\n", f);
   printf ("hex : 0x%0lx\n", (unsigned long)f2u.u);
   return 0;
}

I get this result which is correct:

liteOS> run float_hex.elf
hex : 0x3fc00000

I'll dig into the crt0.S tomorrow to see if anything important is missing:

// crt0.S for executables
// interrupts and stack are already configured by OS
// _start does the following tasks:
//  1) save registers (ra, t0..t6, a0..a7)
//  2) initialize BSS
//  3) call main
//  4) restore registers
//  5) return to caller (LiteOS shell)

    .global _start
_start:
    // save context
    addi sp, sp, -16*4
    sw ra,  0*4(sp)
    sw t0,  1*4(sp)
    sw t1,  2*4(sp)
    sw t2,  3*4(sp)
    sw a0,  4*4(sp)
    sw a1,  5*4(sp)
    sw a2,  6*4(sp)
    sw a3,  7*4(sp)
    sw a4,  8*4(sp)
    sw a5,  9*4(sp)
    sw a6, 10*4(sp)
    sw a7, 11*4(sp)
    sw t3, 12*4(sp)
    sw t4, 13*4(sp)
    sw t5, 14*4(sp)
    sw t6, 15*4(sp)

    // initialize .bss
    la t0, _fbss
    la t1, _ebss
1:  beq t0, t1, 3f
    sw zero, 0(t0)
    addi t0, t0, 4
    j 1b
3:

    call main

    // restore context
    lw ra,  0*4(sp)
    lw t0,  1*4(sp)
    lw t1,  2*4(sp)
    lw t2,  3*4(sp)
    lw a0,  4*4(sp)
    lw a1,  5*4(sp)
    lw a2,  6*4(sp)
    lw a3,  7*4(sp)
    lw a4,  8*4(sp)
    lw a5,  9*4(sp)
    lw a6, 10*4(sp)
    lw a7, 11*4(sp)
    lw t3, 12*4(sp)
    lw t4, 13*4(sp)
    lw t5, 14*4(sp)
    lw t6, 15*4(sp)
    addi sp, sp, 16*4

    ret
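
For readers less used to RISC-V assembly, the "initialize .bss" loop in the listing above can be sketched in C. This is only an illustrative rendering: the function name zero_words is made up, and in the real crt0.S the two bounds come from the _fbss and _ebss symbols rather than from function arguments.

```c
#include <stdint.h>

/* C rendering of the .bss loop: store zero to every 32-bit word in
   [begin, end). Like the assembly (beq t0, t1), it tests for equality,
   so it assumes end is reachable from begin in whole-word steps. */
static void zero_words(uint32_t *begin, uint32_t *end) {
    while (begin != end) {
        *begin = 0;
        begin++;
    }
}
```

When begin and end are equal (an empty .bss), the loop body never runs, which matches the assembly branching straight to label 3.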

AndrewD commented 7 months ago

The issue is most likely in printf() itself, particularly if it has conditionally compiled features. The other, less likely possibility is the float-to-double conversion that happens when calling printf.
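
To illustrate the float-to-double conversion: in C, a float passed through the "..." of a variadic function is promoted to double by the caller, so the callee has to fetch a double. A minimal sketch, assuming IEEE 754 binary64 doubles; first_vararg_bits is a made-up helper, not a libc function:

```c
#include <stdarg.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fetch the first variadic argument as a double and
   return its raw 64-bit pattern. A float passed through "..." arrives as
   a double, so va_arg must ask for double, not float. */
static uint64_t first_vararg_bits(int count, ...) {
    va_list ap;
    double d;
    uint64_t bits;

    va_start(ap, count);
    d = va_arg(ap, double);
    va_end(ap);

    memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

first_vararg_bits(1, 1.5f) returns 0x3ff8000000000000, the double encoding of 1.5, rather than the 32-bit 0x3fc00000. A printf that reads the argument in a different representation from the one the caller passed would print a wrong value. Whether that is what happens here is only a guess.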

AndrewD commented 7 months ago

I just had a brief look at picolibc. As an experiment, maybe try --bios-format double.

JamesTimothyMeech commented 7 months ago

Ah thanks, I should have known to try that! Running:

python3 -m litex_boards.targets.digilent_arty --bios-format double --cpu-type femtorv --cpu-variant gracilis --variant a7-100 --toolchain vivado --with-spi-sdcard --sdcard-adapter digilent --timer-uptime --build --load

and recompiling LiteOS and my program produced the expected result:

liteOS> run float.elf
1.500000

AndrewD commented 7 months ago

For the float option you probably need a gcc flag too: --float-double or something like that.