sifive / freedom-metal

Bare Metal Compatibility Library for the Freedom Platform
Other
152 stars 47 forks source link

Fix the compact patterns for crt0.s and entry.s. #417

Open Nelson1225 opened 3 years ago

Nelson1225 commented 3 years ago

Hi Guys,

There are two problems here that cause testcases to runtime fail for freedom-e-sdk,

1: auipc a0, %pcrel_hi(global_pointer__) addi a0, a0, %pcrel_lo(1b) ld gp, 0(a0) add gp, gp, a0 la gp, global_pointer$ .option pop

/ Stack pointer is expected to be initialized before _start /

/ If we're not hart 0, skip the initialization work / la t0, __metal_boot_hart bne a0, t0, _skip_init

Obviously my code is wrong here, a0 usually used for function parameters, so if I use it to do anything special, then the value of a0 will be changed. Maybe t0/t1 is more suitable.

When rodata is placed nearly to text, then we can use la to access metal_segment_data_source_start directly; When the rodata is placed nearly to rwdata, then use the lla.gprel is more suitable. But once the rodata is placed far away from both text and rwdata, generally we should use la.got.gprel to access it. Unfortunately, la.got.gprel cannot work for metal_segment_data_source_start, if LMA and VMA are different in the linker script. Since we need to move the data, including GOT sections, from the LMA to VMA, then we can get the right values of got entries. Therefore, use la.got.gprel to access the metal_segment_data_source_start will get the uninitilaize value.

One of the solution is that - use compact stub to access the metal_segment_data_source_start, just like what we did above, to initialize gp. But we need four instructions and one 8 bytes stub, so the code size will be slightly larger (4 * 4 + 8 = 24 bytes, originally, the la.got.gprel may be relaxed to a 4 bytes ld, so it will be 20 bytes more).

e-puerto commented 3 years ago

@Nelson1225 I have some difficulties understanding why using a0 for compact patterns will cause runtime fail.

Nelson1225 commented 3 years ago

@Nelson1225 I have some difficulties understanding why using a0 for compact patterns will cause runtime fail.

It is probably fine in entry.s, but in crt0.s,

/ Compact pattern changes the value of a0. / 1: auipc a0, %pcrel_hi(global_pointer) addi a0, a0, %pcrel_lo(1b) ld gp, 0(a0) add gp, gp, a0 la gp, __global_pointer$

/ If we're not hart 0, skip the initialization work / la t0, __metal_boot_hart bne a0, t0, _skip_init

The a0 will be used later in bne, so if we use a0 as a template register for compact pattern, then a0 will be changed, which seems wrong.