jcmvbkbc / gcc-xtensa

gcc for xtensa
GNU General Public License v2.0

Xtensa ELF info/hints? #3

Closed pfalcon closed 9 years ago

pfalcon commented 9 years ago

Another support request regarding Xtensa stuff:

  1. Is there a formal Xtensa ELF ABI reference that describes what R_XTENSA_SLOT0_OP and friends are? I have seen such documents for e.g. PowerPC, but googling for "R_XTENSA_SLOT0_OP pdf" gives nothing, and "R_XTENSA_SLOT0_OP" alone gives only noise.
  2. Does the Xtensa arch support linker-relocated, non-PIC shared libraries? E.g. good old x86 supports that, while x86_64 explicitly doesn't. A quick try for Xtensa gives: "dangerous relocation: invalid relocation for dynamic symbol: memset", "dangerous relocation: dynamic relocation in read-only section", etc. I still wonder if there's a definitive, formal answer.

Thanks.

Context: well, if you make things like https://github.com/jcmvbkbc/esp-elf-rom yourself, you shouldn't be surprised someone else asks such questions ;-). And did a "@jcmvbkbc" in another project's ticket, so just leaving it here: https://github.com/pfalcon/ScratchABit

jcmvbkbc commented 9 years ago

Is there a formal Xtensa ELF ABI reference that describes what R_XTENSA_SLOT0_OP and friends are?

No, AFAIK. The best description I know of is in the binutils source: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/bfd-in2.h;h=ade49ffc6188210ad2d6484c154853eb6c75613e;hb=HEAD#l5359 and https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/elf32-xtensa.c;h=25236707dae46e7190c646de1601fb1f6ff088fc;hb=HEAD#l165 Some notes on TLS-specific relocations are here: http://wiki.linux-xtensa.org/index.php/ABI_Interface I guess I'll spend some time this year developing xtensa support bits for elfutils, looks like it'd be a good time to document these pieces of ABI.

Does Xtensa arch support linker-relocated, non-PIC shared libraries?

No, AFAIK. Can you give an example of such a library? I'm curious what the linking command looks like for it. OTOH there's overlay support in the xtensa tools, but I don't know anything about it.

if you make things like https://github.com/jcmvbkbc/esp-elf-rom yourself, you shouldn't be surprised someone else asks such questions ;-)

I'm not surprised at all, but that reference doesn't explain much. esp-elf-rom is made to ease debugging with gdb. But from what you're saying it looks like you're developing a dynamic loader, right?

And did a "@jcmvbkbc" in another project's ticket, so just leaving it here:

-ENOPARSE. Can't find anything related by your link.

pfalcon commented 9 years ago

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/elf32-xtensa.c;h=25236707dae46e7190c646de1601fb1f6ff088fc;hb=HEAD#l165

Thanks. So, is the purpose of R_XTENSA_ASM_EXPAND (https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/elf32-xtensa.c;h=25236707dae46e7190c646de1601fb1f6ff088fc;hb=HEAD#l1965), for example, only to mark a place for the linker to check, without really changing the instruction's args? Also, does R_XTENSA_NONE mean "there was a relocation needed, but now it's done somehow" or "void entry, don't assume there was a relocation needed at all"? (See below for the reasoning.)

No, AFAIK. Can you give an example of such a library? I'm curious what the linking command looks like for it. OTOH there's overlay support in the xtensa tools, but I don't know anything about it.

This gives an example: http://stackoverflow.com/a/6570000/496009 . Again, only a few archs, like i386, support relocatable (vs PIC) shlibs.

I'm not surprised at all, but that reference doesn't explain much. esp-elf-rom is made to ease debugging with gdb. But from what you're saying it looks like you're developing a dynamic loader, right?

Well, I'm looking for a way to automatically tell which instruction operands are addresses and which are not. One way to do that is by using relocs. At the same time, I need the code to be linked already (all xrefs resolved, and all addresses present in the code). That's done by applying relocations, which are no longer needed after that and get discarded. So I was looking for a way to get both ;-). ld -r doesn't work, as it explicitly produces an object, not an executable file, and then the 2nd idea was to cheat by producing a shlib instead of an executable. That doesn't appear to work, so it looks like I'll need to write a kind of linker ;-).

-ENOPARSE. Can't find anything related by your link.

It was this: https://github.com/tommie/lx106-hal/issues/1#issuecomment-96367093

jcmvbkbc commented 9 years ago

is the purpose of R_XTENSA_ASM_EXPAND, for example, only to mark a place for the linker to check, without really changing the instruction's args?

Yes, it marks the places for link-time relaxation.

Also, does R_XTENSA_NONE mean "there was a relocation needed, but now it's done somehow" or "void entry, don't assume there was a relocation needed at all"? (See below for the reasoning.)

I think R_XTENSA_NONE should never appear in objects/executables. If it does it's most likely a bug.

Well, so I'm looking for a way to automatically tell which instruction operands are addresses and which are not. One way to do that is by using relocs.

Not sure I understand. The instruction defines how its operand is used, e.g. in "l32r a0, x" the operand x is always an address. You probably care whether the value loaded from x is an address, right?

If so then I don't see why having PIC shared object is bad: addresses will anyway be represented as literals with relocations against them, and when you disassemble an instruction you'd be able to see that it refers to such literal.

If GOT and PLT need to be avoided for some other reason, it may still be easier to relax the ld restrictions on relocation placement and allow R_XTENSA_SLOT*_OP type relocations to be left in the linked shared object. One of the reasons this isn't allowed now is that these relocation types don't describe the relocation completely: the instruction the relocation points at must be analyzed in order to understand how its immediate subfield must be changed. That'd be very expensive for a dynamic linker, but doesn't matter for static analysis.
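To illustrate why a SLOT0_OP relocation is only meaningful together with the instruction it points at, here is a minimal Python sketch that decodes the target address from two instruction formats. The field layouts are the standard CALLn and L32R encodings as I understand them from the ISA manual; the function names are made up for this example, only little-endian 24-bit instruction words are handled, and L32R is assumed to run without the extended (LITBASE) mode.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def slot0_target(pc: int, insn: int):
    """Return the address encoded in a 24-bit instruction word, or None.

    Hypothetical helper: only two opcodes are handled here, a real tool
    needs the full opcode table to locate the immediate subfield.
    """
    op0 = insn & 0xF
    if op0 == 0x5:  # CALLn: signed 18-bit word offset in bits 6..23
        offset = sign_extend(insn >> 6, 18)
        return ((pc & ~3) + (offset << 2) + 4) & 0xFFFFFFFF
    if op0 == 0x1:  # L32R: 16-bit word offset in bits 8..23, always negative
        offset = ((insn >> 8) & 0xFFFF) - 0x10000
        return (((pc + 3) & ~3) + (offset << 2)) & 0xFFFFFFFF
    return None  # no address operand recognised by this sketch


# call0 word taken from an objdump line quoted in this thread.
print(hex(slot0_target(0x101F, 0x0074C5)))
```

The same relocation type covers both cases, so the decoder has to look at op0 first to know which bits hold the immediate and how it is scaled.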

pfalcon commented 9 years ago

The instruction defines how its operand is used, e.g. in l32r a0, x x is always an address.

Well, yeah, the beauty of RISC. But that's not true in the general case: e.g. if something is linked at address 0, N in "movi aX, N" can be either a literal numeric value or an address. For archs where "move immediate" is full-range, or for RISCs which emulate it with something l32r-like, the issue is also apparent.

I think R_XTENSA_NONE should never appear in objects/executables. If it does it's most likely a bug.

In an object file produced by "ld -r"ing together all objects from exploded esp8266 sdk libs:

$ readelf --all blob.o | grep R_XTENSA_NONE | wc
   8167   32668  413452

And generally, if those mark places that were already fixed up (e.g. a SLOT0_OP which was undefined in a single object, but which was fixed up with relative addressing), it's better (for my use case) to have at least NONE than nothing at all.
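The grep-and-count above can be generalised to a tally of every relocation type in the dump. A minimal Python sketch, assuming the readelf output has already been captured as text; the function name and the sample lines are invented for illustration and do not reproduce real readelf output.

```python
import re
from collections import Counter

# Matches relocation type names such as R_XTENSA_NONE or R_XTENSA_SLOT0_OP.
RELOC_RE = re.compile(r"\bR_XTENSA_[A-Z0-9_]+\b")


def count_reloc_types(readelf_text: str) -> Counter:
    """Count how often each R_XTENSA_* name occurs in the given text."""
    return Counter(RELOC_RE.findall(readelf_text))


# Made-up sample lines; only the relocation names matter to the regex.
SAMPLE = """\
00000ee0  R_XTENSA_NONE
0000101f  R_XTENSA_NONE
0000101f  R_XTENSA_SLOT0_OP  system_rtc_mem_read
"""

for name, count in count_reloc_types(SAMPLE).most_common():
    print(f"{count:6d} {name}")
```

Feeding it the text of `readelf -r blob.o` instead of SAMPLE would show how the R_XTENSA_NONE entries compare in number with the other types.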

If so then I don't see why having PIC shared object is bad

It's not bad. The question was whether non-PIC objects can be put into a shared lib: I just took an esp8266 build which produces an ELF (from which the actual ROM image is to be extracted) and added the --shared option, leading to the bunch of errors quoted above. So I just wondered if something could be done about that, but I assume not.

From Linux point of view, requiring shlib to be always PIC makes good sense, given that it simplifies dynamic linker and gives 100% sharable image w/o need for pages dirtied by relocations.

Well, thanks for the discussion, it was helpful. As I mentioned, I started writing a kind of load-linker for scratchabit, even if it will be just a proof of concept.

jcmvbkbc commented 9 years ago

I think R_XTENSA_NONE should never appear in objects/executables. If it does it's most likely a bug.

In an object file produced by "ld -r"ing together all objects from exploded esp8266 sdk libs

Interesting. I looked at the produced object file and saw that

     ee0:       f0c112          addi    a1, a1, -16
                        ee0: R_XTENSA_NONE      *ABS*
    101f:       0074c5          call0   176c <system_rtc_mem_read>
                        101f: R_XTENSA_NONE     *ABS*+0xa8
                        101f: R_XTENSA_SLOT0_OP system_rtc_mem_read

I still think that these are bugs.

BTW, have you tried these linker options?

`-q'
`--emit-relocs'
     Leave relocation sections and contents in fully linked executables.
     Post link analysis and optimization tools may need this
     information in order to perform correct modifications of
     executables.  This results in larger executables.

pfalcon commented 9 years ago

--emit-relocs

Great, exactly what I need! I tried to look thru ld --help, but apparently quit that too early switching to google instead. Thanks for the hint!
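With the relocations kept in the linked file, the "is this operand an address?" question from earlier in the thread becomes a table lookup. A minimal Python sketch, assuming the relocation records have already been parsed into a dict keyed by instruction address; the `Reloc` type and `operand_symbol` function are hypothetical names, and the sample entries are the ones from the objdump listing quoted above.

```python
from typing import Dict, List, NamedTuple, Optional


class Reloc(NamedTuple):
    rtype: str   # relocation type name, e.g. "R_XTENSA_SLOT0_OP"
    target: str  # symbol or expression the relocation refers to


def operand_symbol(relocs: Dict[int, List[Reloc]], insn_addr: int) -> Optional[str]:
    """Return the symbol an instruction's operand refers to, if any.

    Only R_XTENSA_SLOT*_OP records count as "this operand is an address";
    other record types at the same address are ignored.
    """
    for reloc in relocs.get(insn_addr, []):
        if reloc.rtype.startswith("R_XTENSA_SLOT") and reloc.rtype.endswith("_OP"):
            return reloc.target
    return None


RELOCS = {
    0x0EE0: [Reloc("R_XTENSA_NONE", "*ABS*")],
    0x101F: [Reloc("R_XTENSA_NONE", "*ABS*+0xa8"),
             Reloc("R_XTENSA_SLOT0_OP", "system_rtc_mem_read")],
}

print(operand_symbol(RELOCS, 0x101F))  # the call0 operand is an address
print(operand_symbol(RELOCS, 0x0EE0))  # the addi immediate is a plain number
```

Treating R_XTENSA_NONE as "no information" is a choice made for this sketch; whether those entries carry meaning is exactly the open question discussed above.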

pfalcon commented 9 years ago

Another question, not directly related to the above, but to not create another ticket:

Reading Xtensa ISA RefMan, s.8.3.1:

The assembler substitutes a different instruction when an operand is out of range. For example, it turns MOVI into L32R when the immediate is outside the range -2048 to 2047.

Suppose I want to perform the reverse transform (turn L32R into MOVI) but want to keep it distinguishable from a real MOVI: what naming would you suggest? So far I use "movi*", but maybe some form would be more "Xtensa-ic", e.g. "movi.l"?

jcmvbkbc commented 9 years ago

Suppose I want to perform reverse transform - turn L32R into MOVI, but want to make it distinguishable from real MOVI - what naming would you suggest?

Make it distinguishable in what context? You mean disassembling l32r into movi? Don't know. To my taste literal disassembly with loaded value in comment is the best.

maybe some form would be more "Xtensa-ic"

No, AFAIK: we only make opcode substitution at assembly time, not at disassembly. And if you write in assembly you usually just use movi regardless of the immediate value.

pfalcon commented 9 years ago

Make it distinguishable in what context? You mean disassembling l32r into movi? Don't know. To my taste literal disassembly with loaded value in comment is the best.

Yes, in the context of producing human-readable disassembly (which is the context of ScratchABit mentioned above). You prefer that because you use Xtensa asm daily; for other people it's a nuisance to remember the difference between l32i & l32r ;-). Also, comments are just that, a sequence of chars, while arguments are objects and have a type (numeric value/address at least). So, in the current prototype of this feature for ida-xtensa I have argument vs comment the other way around:

4000011f   movi*           a2, 0x4000e328 ; via 0x40000098

4000011f   movi*           a2, _rom_store_table ; via 0x40000098 

So, if you don't have better suggestions than "movi*", let it stay that ;-).
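For reference, the rendering shown in the two sample lines can be sketched in a few lines of Python. This is not the ida-xtensa code; the function name, its parameters, and the column width are assumptions chosen to mimic the sample output.

```python
from typing import Dict, Optional


def render_l32r_as_movi(pc: int, reg: str, literal_addr: int, value: int,
                        symbols: Optional[Dict[int, str]] = None) -> str:
    """Format an l32r as a pseudo "movi*" with the loaded value as operand.

    The literal's own address goes into the comment, and the value is
    replaced by a symbol name when one is known for it.
    """
    symbols = symbols or {}
    operand = symbols.get(value, f"0x{value:08x}")
    return f"{pc:08x}   {'movi*':<15} {reg}, {operand} ; via 0x{literal_addr:08x}"


print(render_l32r_as_movi(0x4000011F, "a2", 0x40000098, 0x4000E328))
print(render_l32r_as_movi(0x4000011F, "a2", 0x40000098, 0x4000E328,
                          {0x4000E328: "_rom_store_table"}))
```

Keeping the value as a typed operand rather than a comment string is what lets the second call swap in the symbol name.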