riscv-collab / riscv-gnu-toolchain

GNU toolchain for RISC-V, including GCC

linker fails to link opencv (relocation truncated to fit: R_RISCV_PCREL_HI20 against `.LC8') #624

Closed kraj closed 7 months ago

kraj commented 4 years ago

When compiling OpenCV with the latest GCC 9.x, the build ends with linking errors:

| modules/datasets/CMakeFiles/opencv_datasets.dir/src/tinyxml2/tinyxml2.cpp.o: in function `tinyxml2::XMLUtil::StringEqual(char const*, char const*, int)':
| /usr/src/debug/opencv/4.1.0-r0/contrib/modules/datasets/src/tinyxml2/./tinyxml2.h:526:(.text._ZN8tinyxml27XMLUtil6ToBoolEPKcPb+0x82): relocation truncated to fit: R_RISCV_PCREL_HI20 against `.LC8'
| collect2: error: ld returned 1 exit status

Surprisingly, this works OK when compiling with Clang.

The tarball below contains the needed objects. The failure can be reproduced with:

% /mnt/b/yoe/build/tmp/work/riscv64-yoe-linux/opencv/4.1.0-r0/recipe-sysroot-native/usr/bin/riscv64-yoe-linux/riscv64-yoe-linux-ld -L./ *.o -shared -o a.so
tinyxml2.cpp.o: in function `tinyxml2::XMLUtil::StringEqual(char const*, char const*, int)':
/usr/src/debug/opencv/4.1.0-r0/contrib/modules/datasets/src/tinyxml2/./tinyxml2.h:526:(.text._ZN8tinyxml27XMLUtil6ToBoolEPKcPb+0x82): relocation truncated to fit: R_RISCV_PCREL_HI20 against `.LC8'

The test files are here

Nelson1225 commented 4 years ago

Hi @kraj ,

I am not sure why Clang passes, but I think I have found the pattern that causes the truncation error. You can simply link tinyxml2.cpp.o by itself and get the same error:

$ riscv64-unknown-linux-gnu-ld tinyxml2.cpp.o -shared --no-relax
riscv64-unknown-linux-gnu-ld: DWARF error: could not find variable specification at offset 4dc
riscv64-unknown-linux-gnu-ld: DWARF error: could not find variable specification at offset 53a
tinyxml2.cpp.o: in function `tinyxml2::XMLUtil::StringEqual(char const*, char const*, int)':
/usr/src/debug/opencv/4.1.0-r0/contrib/modules/datasets/src/tinyxml2/./tinyxml2.h:526:(.text._ZN8tinyxml27XMLUtil6ToBoolEPKcPb+0x86): relocation truncated to fit: R_RISCV_PCREL_HI20 against `.LC8'

And you can use readelf to see the relocation:

$ riscv64-unknown-linux-gnu-readelf -Wr tinyxml2.cpp.o
Relocation section '.rela.text._ZN8tinyxml27XMLUtil6ToBoolEPKcPb' at offset 0xab4f8 contains 47 entries:
    Offset             Info             Type                 Symbol's Value   Symbol's Name + Addend
...
0000000000000086  000019af00000017 R_RISCV_PCREL_HI20   0000000000000008 .LC8 + 7fffffff
0000000000000086  0000000000000033 R_RISCV_RELAX                         7fffffff
000000000000008a  000019b100000018 R_RISCV_PCREL_LO12_I 0000000000000086 .L0 + 0
000000000000008a  0000000000000033 R_RISCV_RELAX                         0
...

The addend 0x7fffffff is too large; the PC-relative relocation cannot cover it. From the GNU linker's perspective, I think this large addend doesn't make sense, but I believe there must be a reason why the compiler/assembler generates this pattern. So, was tinyxml2.cpp.o generated from C source code, or from hand-written assembly? It would be better to get more information from the source, and then we can see what is happening.

Thanks Nelson
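
To illustrate why the 0x7fffffff addend described above cannot be encoded, here is a minimal sketch of how a PC-relative offset is split between the 20-bit field of auipc and the 12-bit field of the instruction that follows it. It is an illustrative model, not binutils code; the function names and the sample distance of 0x1000 between the auipc and .LC8 are assumptions.

```cpp
#include <cstdint>

// Illustrative model (not binutils code) of the split behind
// R_RISCV_PCREL_HI20 / R_RISCV_PCREL_LO12_I on RV64.

// Low 12 bits of the offset, sign-extended, as consumed by the second instruction.
constexpr int64_t pcrel_lo12(int64_t offset) {
    return ((offset & 0xfff) ^ 0x800) - 0x800;
}

// Upper part placed in auipc; the +0x800 compensates for the sign
// extension of the low part.
constexpr int64_t pcrel_hi20(int64_t offset) {
    return (offset + 0x800) >> 12;
}

// The pair is encodable only if the upper part fits a signed 20-bit field.
constexpr bool pcrel_fits(int64_t offset) {
    return pcrel_hi20(offset) >= -(int64_t{1} << 19) &&
           pcrel_hi20(offset) < (int64_t{1} << 19);
}

// Assumed distance from the auipc to .LC8 (hypothetical value).
constexpr int64_t kDistance = 0x1000;

static_assert(pcrel_fits(kDistance), "addend 0 fits");
static_assert(!pcrel_fits(kDistance + 0x7fffffffLL), "addend 0x7fffffff does not fit");
```

Under this model the pair reaches offsets from -0x80000800 to 0x7ffff7ff, so with an addend of 0x7fffffff the symbol itself would have to sit at least 0x800 bytes before the auipc for the relocation to fit.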

kraj commented 4 years ago

@Nelson1225 I don't see any assembly in the source. Here is lc8.c; you can compile it like this:

riscv64-yoe-linux-g++ lc8.c -c -fPIC -O2

If I don't use -O2, the relocation's addend is 0 rather than 7fffffff, and if I don't use -fPIC, the relocation isn't generated at all. It does not happen at any other optimization level; I tried -Os, -O3, and -Ofast as well.

jim-wilson commented 4 years ago

The xx.tar.xz file in the first comment doesn't work, as there are dependencies on missing shared libraries. But I can reproduce with the l8.c file in the third comment, renamed to l8.ii as it is preprocessed C++ code. I see in the asm file

        lla a1,.LC8+2147483647

In the C++ input file, the function ToBool does

        else if ( StringEqual( str, "false" ) ) {

and the function StringEqual has

inline static bool StringEqual( const char* p, const char* q, int nChar=0x7fffffff ) {
        int n = 0;
...
        while( *p && *q && *p == *q && n<nChar ) {

so the loop optimizer is trying to optimize away the n variable by replacing it with *q < "false"+0x7fffffff which results in an address calculation .LC8+0x7fffffff that we can't do with an lla.

This is definitely a compiler bug. You can work around it by passing a string length to the StringEqual function, which is trivial to compute for the string "false". I just added ", strlen ("false")+1" to the call, though I don't know whether the +1 is required but I don't think it can hurt.
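
A minimal sketch of that workaround follows. StringEqual here is a simplified free-function stand-in: its signature and loop condition follow the excerpt quoted above, while the loop body and return expression are assumptions made for illustration. IsFalseLiteral and CheckWorkaround are hypothetical names.

```cpp
#include <cassert>
#include <cstring>

// Simplified stand-in for tinyxml2's XMLUtil::StringEqual.
// Signature and loop condition follow the quoted excerpt; the loop body
// and the return expression are assumptions for this sketch.
inline bool StringEqual(const char* p, const char* q, int nChar = 0x7fffffff) {
    int n = 0;
    while (*p && *q && *p == *q && n < nChar) {
        ++p;
        ++q;
        ++n;
    }
    return n == nChar || (*p == 0 && *q == 0);
}

// Hypothetical helper showing the suggested workaround: pass an explicit
// bound instead of relying on the 0x7fffffff default.
inline bool IsFalseLiteral(const char* str) {
    return StringEqual(str, "false", static_cast<int>(std::strlen("false")) + 1);
}

// Small usage check of the helper.
inline void CheckWorkaround() {
    assert(IsFalseLiteral("false"));
    assert(!IsFalseLiteral("falsehood"));
    assert(!IsFalseLiteral("fals"));
}
```

In this sketch the +1 does matter: with a bound of exactly strlen("false"), StringEqual("falsehood", "false", 5) returns true, because the comparison stops once n reaches nChar.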

jim-wilson commented 4 years ago

riscv_symbolic_constant_p has

      return sext_hwi (INTVAL (offset), 32) == INTVAL (offset);

but it isn't clear how that could have ever worked.

rohan:2203$ cat tmp.c
char *sub (void) { return "false"+0x7fffffff; }
int main (void) { return sub () != 0; }
rohan:2204$ riscv64-unknown-linux-gnu-gcc tmp.c -mcmodel=medlow
/tmp/ccGCHqyt.o: in function `sub':
tmp.c:(.text+0x6): relocation truncated to fit: R_RISCV_HI20 against `.LC0'
collect2: error: ld returned 1 exit status
rohan:2205$ riscv64-unknown-linux-gnu-gcc tmp.c -mcmodel=medany
/tmp/ccKWvoGD.o: in function `.L0 ':
tmp.c:(.text+0x6): relocation truncated to fit: R_RISCV_PCREL_HI20 against `.LC0'
collect2: error: ld returned 1 exit status
rohan:2206$ riscv64-unknown-linux-gnu-gcc tmp.c -fPIC
/tmp/ccLL6L59.o: in function `.L0 ':
tmp.c:(.text+0x6): relocation truncated to fit: R_RISCV_PCREL_HI20 against `.LC0'
collect2: error: ld returned 1 exit status
rohan:2207$ 

The TLS case does seem to work though, as we split the constant out and add it in separately. I will have to look at this some more.
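
A minimal sketch of what the quoted riscv_symbolic_constant_p test accepts may help here. The helper sext32 is a hypothetical stand-in for sext_hwi (value, 32), written for illustration rather than copied from GCC, and offset_accepted is likewise an assumed name.

```cpp
#include <cstdint>

// Hypothetical stand-in for sext_hwi (value, 32): keep the low 32 bits
// of value and sign-extend them to 64 bits.
constexpr int64_t sext32(int64_t value) {
    return ((value & 0xffffffffLL) ^ 0x80000000LL) - 0x80000000LL;
}

// Mirrors the quoted test: the offset is accepted when it survives a
// round trip through 32-bit sign extension.
constexpr bool offset_accepted(int64_t offset) {
    return sext32(offset) == offset;
}

static_assert(offset_accepted(0x7fffffffLL), "largest accepted offset");
static_assert(!offset_accepted(0x80000000LL), "first rejected offset");
static_assert(offset_accepted(-0x80000000LL), "most negative accepted offset");
```

So the offset 0x7fffffff passes the test on its own, because the test looks only at the offset and not at the symbol's eventual address. The sum symbol + offset can still be out of reach at link time, which is consistent with the failures in the session above.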

jim-wilson commented 4 years ago

My testcase does work for rv32 as addresses wrap around. It doesn't work for rv64 because lui/auipc sign extend, and address+offset crosses the boundary between 0x7fffffff and 0x80000000. But since we can't know the address at compile time, we can't know what offsets might be safe to use with the address. If we make the simplifying assumption that no object crosses this boundary, then we can limit offsets by the size of the object. Though I think that disallowing size+1 might be a problem, so maybe we need to assume that no object is allowed to end at 0x7fffffff. This is turning out to be more complicated to fix than I expected.
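
The difference between rv32 and rv64 can be sketched with a small model of a lui + addi pair. This is an illustration only; the function names and the sample string address 0x10000 are assumptions, and auipc behaves analogously relative to the pc.

```cpp
#include <cstdint>

// Sign-extend the low 32 bits of v, as the result of lui is on RV64.
constexpr int64_t sext32(int64_t v) {
    return ((v & 0xffffffffLL) ^ 0x80000000LL) - 0x80000000LL;
}

// Low 12 bits of the target, sign-extended, as added by addi.
constexpr int64_t lo12(int64_t target) {
    return ((target & 0xfff) ^ 0x800) - 0x800;
}

// Value an RV64 lui + addi pair produces for the low 32 bits of target:
// lui writes a 32-bit value that is sign-extended to 64 bits.
constexpr int64_t lui_addi_rv64(int64_t target) {
    return sext32(target - lo12(target)) + lo12(target);
}

// RV32 registers are 32 bits wide, so the same result wraps modulo 2^32.
constexpr uint32_t lui_addi_rv32(int64_t target) {
    return static_cast<uint32_t>(sext32(target - lo12(target)) + lo12(target));
}

// Assumed address of the string literal (hypothetical value).
constexpr int64_t kString = 0x10000;
constexpr int64_t kTarget = kString + 0x7fffffffLL;  // 0x8000ffff

static_assert(lui_addi_rv64(kString) == kString, "small address is exact on RV64");
static_assert(lui_addi_rv64(kTarget) != kTarget, "crossing 0x80000000 breaks on RV64");
static_assert(lui_addi_rv32(kTarget) == 0x8000ffffu, "RV32 wraps to the right address");
```

In this model the RV64 result for 0x8000ffff comes out as 0xffffffff8000ffff, which is why the linker reports a truncated relocation instead of emitting the pair.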

TommyMurphyTM1234 commented 2 years ago

Duplicate of https://github.com/riscv-collab/riscv-gnu-toolchain/issues/1102?

Kingxukai commented 1 year ago

Hey, I've encountered this problem recently. Have you ever resolved it? Actually, it works for rv32 but not for rv64. I've added the option -mcmodel=medany, but it still doesn't work. And I don't think there is any problem in my linker script.

gongtianle123 commented 8 months ago

I've encountered this problem too. Have you ever resolved it?

Kingxukai commented 8 months ago

Hi there, I resolved it last year: I finally added "-mcmodel=medany" and it worked. Simply put, "-mcmodel" is a compiler option that controls how the compiler generates the instructions that access global variables. With "medany", a variable's address is formed PC-relative (PC ± 2 GiB), so the code can be linked at any address as long as everything it references lies within that range.

You might be confused about why your variables have exceeded 2 GiB. That's because .text is at 0x8000_0000, which is 2^31 = 2 GiB. You can test this by changing the address in the linker script to 0x4000_0000. It then builds successfully, but the RV64 target won't actually run it.
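
The arithmetic behind the two code models can be sketched as follows. This is a simplified model, not toolchain code: the function names are assumptions, and the 2 KiB skew introduced by the signed 12-bit low part is ignored.

```cpp
#include <cstdint>

constexpr int64_t kTwoGiB = int64_t{1} << 31;  // 0x8000_0000

// medlow: symbols must lie between absolute addresses -2 GiB and +2 GiB.
constexpr bool reachable_medlow(int64_t addr) {
    return addr >= -kTwoGiB && addr < kTwoGiB;
}

// medany: symbols must lie within 2 GiB of the referencing pc.
constexpr bool reachable_medany(int64_t pc, int64_t addr) {
    return addr - pc >= -kTwoGiB && addr - pc < kTwoGiB;
}

static_assert(!reachable_medlow(0x80000000LL), "0x8000_0000 is exactly 2 GiB: out of medlow range");
static_assert(reachable_medlow(0x40000000LL), "0x4000_0000 is within medlow range");
static_assert(reachable_medany(0x80000000LL, 0x80001000LL), "nearby symbol is fine with medany");
```

The second assertion corresponds to the linker-script experiment above: moving .text to 0x4000_0000 brings it back into the range that absolute addressing can reach.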

TommyMurphyTM1234 commented 7 months ago

Unfortunately, the link to the lc8.c test case is no longer working, so the specific issue originally raised cannot be reproduced.

But based on the most recent reply I think it's appropriate to close this issue now.