espressif / binutils-esp32ulp

Binutils fork with support for the ESP32 ULP co-processor
GNU General Public License v2.0
46 stars 18 forks source link

Relocation information not generated correctly #9

Closed 8785benjamin closed 6 years ago

8785benjamin commented 6 years ago

Hi I have discovered that esp32ulp-elf-as does not generate correct relocation information in the .o file in some particular cases. Eg: .macro test label: wait 1 jump label .endm .text test In such a situation, as will indicate that the relocation to be applied to jump label has to be applied on the first instruction (ie. wait 1). This is what esp32ulp-elf-readelf confirms:

 Offset         Info      Type           Val.-sym    Noms-symb + Addenda
**00000000**  00000101 R_ESP32ULP_RIMM16 00000000   .text + 0

Offset is 0 instead of 4 (bytes) so esp32ulp-elf-ld will apply the relocation to wait and not to jump and will corrupt the code. The same kind of issue will always happen in macro if the relocation is not the first instruction of the macro. Same will apply if we gather several instructions on the same line (a macro is handled this way): loop: wait 1; jump loop;

After many hours of investigation I have found that the problem is located in tc-esp32ulp.c in the function md_assemble(). It always applies the relocation to the offset 0 of the current fragment of code (a fragment of code is a set of instructions ; in this implementation a fragment of code is all the instructions that appear on the same line, meaning that a new fragment is generated on every new line). Of course this does not work with macro's neither since when we handle the instructions inside of the macro, we stay on the same line of the source code. Faulty code:

if (insn->reloc && insn->exp->symbol)
        {
            size = 4;
            //DEBUG_TRACE("insn->reloc && insn->exp->symbol BFD_ARELOC_ESP32ULP_PUSH size =%i, insn->exp->value=%i, insn->pcrel=%i, insn->reloc=%i\n", size, (unsigned int)insn->exp->value, insn->pcrel, insn->reloc);
            //char *prev_toP = toP - 2;
            //fix_new(frag_now, (prev_toP - frag_now->fr_literal), size, insn->exp->symbol, insn->exp->value, insn->pcrel, insn->reloc);
            // were - shift from current word... 
            fix_new(frag_now, 0, size, insn->exp->symbol, insn->exp->value, insn->pcrel, insn->reloc);     // THE MISTAKE IS HERE, 2ND PARAMETER SHALL NOT BE 0
        }
        else
        {
            //DEBUG_TRACE("md_number_to_chars insn->value =%08x\n", (unsigned int)insn->value);
            //md_number_to_chars(toP, insn->value, 2);
            //toP += 2;
            // TODO:DYA - this is main changes for 32 bit words!!!!! changes stop work
            md_number_to_chars(toP, insn->value, 4);   // THIS IS THE SECOND BUG, WE SHALL ONLY DUMP THE FIRST 4 BYTES OUTPUT BY PARSER (first insn, not the ones which follow)
            toP += 4;
}

Moreover there is another bug (which does not cause problem in practice) : the loop will dump to the instruction buffer all what is coming out of the parser, whereas we shall only put the 4 bytes of the instruction. In theory it causes a buffer overflow but by chance will not corrupt anything.

I propose the following implementation of the md_assemble:

void
md_assemble(char *line)
{
    char *toP ;
    int  insn_size;
    size_t len;
    static size_t buffer_len = 0;
    static char *current_inputline;
    parse_state state;

    len = strlen(line);
    if (len + 2 > buffer_len)
    {
        buffer_len = len + 40;
        current_inputline = XRESIZEVEC(char, current_inputline, buffer_len);
    }
    memcpy(current_inputline, line, len);
    current_inputline[len] = ';';
    current_inputline[len + 1] = '\0';

    // run the parser on the instruction
    //   it will return a list of chained "insn",
    //   the first contains the opcode of the instruction
    //   and may be followed by other "insn" like an address
    state = parse(current_inputline);
    if (state == NO_INSN_GENERATED || !insn)
        return;
    // add 4 bytes to the fragment code buffer to put the new instruction
    // and get buffer pointer (toP) on where to write the instruction
    insn_size = 4;
    toP = frag_more(insn_size);         
#ifdef DEBUG
    printf("INS: %s\n", line);
#endif
    md_number_to_chars(toP, insn->value, insn_size);    // put the 4-byte instruction into the current fragment code buffer 
    // loop over the output of the parser to see if there is a relocation to be performed
    while (insn)
    {
        if (insn->reloc && insn->exp->symbol)
        {
            //DEBUG_TRACE("insn->reloc && insn->exp->symbol BFD_ARELOC_ESP32ULP_PUSH size =%i, insn->exp->value=%i, insn->pcrel=%i, insn->reloc=%i\n", size, (unsigned int)insn->exp->value, insn->pcrel, insn->reloc);
            // generate a relocation request for this instruction so that linker will put the right address
            //   toP is the pointer on this instruction in the buffer of the current code fragment 
            //   frag_now->fr_literal is the pointer on the begining of the buffer of the current code fragment
            fix_new(frag_now, toP - frag_now->fr_literal, insn_size, insn->exp->symbol, insn->exp->value, insn->pcrel, insn->reloc);
        }
#ifdef DEBUG
        //DEBUG_TRACE(" reloc : value = %08x, pc=%08x, reloc=%08x BFD_RELOC_ESP32ULP_PLTPC = %08x\n", (unsigned int)insn->value, (unsigned int)insn->pcrel, (unsigned int)insn->reloc, BFD_RELOC_ESP32ULP_PLTPC);
        if (insn->exp != ((void*)(0)))
        {
            //DEBUG_TRACE(" exp: %08x, sy_obj=%i\n", insn->exp->value, insn->exp->symbol->sy_obj.local);
        }
#endif
        insn = insn->next;
    }
#ifdef OBJ_ELF
    dwarf2_emit_insn(insn_size);
    //DEBUG_TRACE("dya_pass ============== >insn_size=%i\n", (unsigned int)insn_size);
#endif

    while (*line++ != '\0')
        if (*line == '\n')
            bump_line_counters();
}

I'm passing toP - frag_now->fr_literal as the offset to fix_new(). This is done in a similar way in several other architecture implementation. It is of course important to NOT increment toP during the while(), ie. we need to correct the second bug. We could also use directly frag_now_fix() to get the offset of the instruction in the code fragment, w/o forgetting to substract 4 bytes since it has been incremented by the call to frag_more() a few lines above...

Also, please correct the issue on line 2705 of binutils/stabs.c: if (*pp == ';' || pp == '\0') to be changed to if (pp == ';' || pp == '\0')

The compilers outputs a big warning (!). Issue has been correctly several versions ago in the original binutils. You may also consider removing all the other processor architecture and keep only esp32ulp, that would make the exe smaller.

feuerrot commented 6 years ago

@8785benjamin Could you add a pull request for your code changes?

8785benjamin commented 6 years ago

Yes, I have tried to done it. I have created a fork containing all the modifications in case you want to have a look https://github.com/8785benjamin/binutils-esp32ulp

dmitry1945 commented 6 years ago

Hi Benjamin, I have include your function into the binutils-esp32ulp source code. New release will be soon. Great work! Thank you very much.

8785benjamin commented 6 years ago

Thank you