radareorg / radare2

UNIX-like reverse engineering framework and command-line toolset
https://www.radare.org/
GNU Lesser General Public License v3.0
19.85k stars 2.95k forks source link

Refactor of elf.c and remove reference to section name in get_import_addr #16530

Closed ghost closed 4 years ago

ghost commented 4 years ago

Your checklist for this pull request

Detailed description ~2 tests fail.~ symbols with no sections header information 3: i believe the test is wrong, before the refacto the return addr of get_import_addr was always -1 because there was no section name.

ELF: imports partial: i believe the return value (get_import_addr_x86_manual) when to .plt.got entry can't be found should be -1 not a garbage value.

PS: if the return value of get_import_addr_x86_manual is the variable plt_sym_addr another test fail.

Test plan Run tests

Closing issues

12732

ret2libc commented 4 years ago

Ah, @08A please rebase on top of current master as well.

ghost commented 4 years ago

What do you think of ELF: imports partial test?

ghost commented 4 years ago

Should i rewrite symbols with no sections header information 3?

ret2libc commented 4 years ago

Should i rewrite symbols with no sections header information 3?

Yes, you should fix the test. Your results seem good.

What do you think of ELF: imports partial test?

I see these changes:

--- /tmp/old    2020-04-14 12:15:46.398818567 +0200
+++ /tmp/new    2020-04-14 12:15:13.556235229 +0200
@@ -17,7 +17,7 @@
 15  0x00051fc8 GLOBAL FUNC       unlink
 16  0x00051fd0 GLOBAL FUNC       strncpy
 17  0x00051fd8 GLOBAL FUNC       strncmp
-18  0x00443a32 WEAK   NOTYPE     _ITM_deregisterTMCloneTable
+18  0x00000000 WEAK   NOTYPE     _ITM_deregisterTMCloneTable
 19  0x00051fe0 GLOBAL FUNC       _exit
 20  0x00051fe8 GLOBAL FUNC       strcpy
 21  0x00051ff0 GLOBAL FUNC       mkdir
@@ -85,7 +85,7 @@
 83  0x000521e0 GLOBAL FUNC       ftello64
 84  0x000521e8 GLOBAL FUNC       acos
 85  0x000521f0 GLOBAL FUNC       read
-86  0x00443a32 GLOBAL FUNC       __libc_start_main
+86  0x00000000 GLOBAL FUNC       __libc_start_main
 87  0x000521f8 GLOBAL FUNC       srand
 88  0x00052200 GLOBAL FUNC       memcmp
 89  0x00052208 GLOBAL FUNC       fgets
@@ -106,7 +106,7 @@
 104 0x00052280 GLOBAL FUNC       sigemptyset
 105 0x00052288 GLOBAL FUNC       ftell
 106 0x00052290 GLOBAL FUNC       feof
-107 0x00443a32 WEAK   NOTYPE     __gmon_start__
+107 0x00000000 WEAK   NOTYPE     __gmon_start__
 108 0x00052298 GLOBAL FUNC       umask
 109 0x000522a0 GLOBAL FUNC       fopen64
 110 0x000522a8 GLOBAL FUNC       strtol
@@ -168,7 +168,7 @@
 166 0x00052468 GLOBAL FUNC       perror
 167 0x00052470 GLOBAL FUNC       strtok
 168 0x00052478 GLOBAL FUNC       sysconf
-169 0x00443a32 WEAK   NOTYPE     _Jv_RegisterClasses
+169 0x00000000 WEAK   NOTYPE     _Jv_RegisterClasses
 170 0x00052480 GLOBAL FUNC       towlower
 171 0x00052488 GLOBAL FUNC       rename
 172 0x00052490 GLOBAL FUNC       tgetstr
@@ -190,7 +190,7 @@
 188 0x00052510 GLOBAL FUNC       fwrite
 189 0x00052518 GLOBAL FUNC       lseek64
 190 0x00052520 GLOBAL FUNC       __fprintf_chk
-191 0x00443a32 WEAK   NOTYPE     _ITM_registerTMCloneTable
+191 0x00000000 WEAK   NOTYPE     _ITM_registerTMCloneTable
 192 0x00052528 GLOBAL FUNC       sqrt
 193 0x00052530 GLOBAL FUNC       sigaltstack
 194 0x00052538 GLOBAL FUNC       strerror

I think your changes are still good even in this case. The old values may interpreted as real addresses, but they are not. They are just useless values, so I think it's better to have 0s. IDA creates a fake section for those entries, but I think 0 for now is good. Does that sound good?

ghost commented 4 years ago

I think your changes are still good even in this case. The old values may interpreted as real addresses, but they are not. They are just useless values, so I think it's better to have 0s. IDA creates a fake section for those entries, but I think 0 for now is good. Does that sound good?

perfect i will fix that

lgtm-com[bot] commented 4 years ago

This pull request introduces 2 alerts when merging a44e1742d8de5eb0641f63d70a37f9e6baf7e820 into 08210f3d0eefaebde8134ddf3d33599c47694b99 - view on LGTM.com

new alerts:

radare commented 4 years ago

what about -1 instead 0 ?

radare commented 4 years ago

otherwise lgtm

ret2libc commented 4 years ago

I'm not sure whether you already did that, but what I meant was to remove all style fixes, not only the macro ones. The diff is full of whitespace changes which make the review very hard. Please try to keep the PR focused on one "thing" so it's easier to understand what's changing ;)

ghost commented 4 years ago

I'm not sure whether you already did that, but what I meant was to remove all style fixes, not only the macro ones. The diff is full of whitespace changes which make the review very hard. Please try to keep the PR focused on one "thing" so it's easier to understand what's changing ;)

I tried to fix some white space problem. Some change in indentation can't be revert, at the beginning this was not indented with tab.

ghost commented 4 years ago

what about -1 instead 0 ?

You talk about the default value of dyn_info entries?

ghost commented 4 years ago

For lgtm i was thinking about removing the issue and add an issue for the last section name references in get_import_addr

lgtm-com[bot] commented 4 years ago

This pull request introduces 2 alerts when merging 161a4ff6038c515a886b95182d2a1c4b4a8ec650 into 1aef2513ab0b40535c33e502094d5c11f34833c1 - view on LGTM.com

new alerts:

ret2libc commented 4 years ago

@08A could you cherry pick https://github.com/ret2libc/radare2/commit/53d56cc7f847344280bb6d07e9f3e8e4c1ed25b1 please? It will remove a lot of unneeded changes and will make the diff smaller. For the future, it's better if you separate fixing stuff from doing mass style-fixes.

lgtm-com[bot] commented 4 years ago

This pull request introduces 2 alerts when merging 211b17121fba48ab9268646addb0b805f234f683 into 0b7948a643d0a9ede9f30ed795b5b22f0ee01f51 - view on LGTM.com

new alerts:

ret2libc commented 4 years ago

@08A do you need anything from me on this? If you have any concern, please raise them ;)

ghost commented 4 years ago

@08A do you need anything from me on this? If you have any concern, please raise them ;)

I just found that i forget about the SPARC architecture, but we don't have any test for this architecture. If you could craft on executable for testing purpose, it will help me implement this feature faster.

You talk about *.o files do you have an example? (a test)

ret2libc commented 4 years ago

@08A do you need anything from me on this? If you have any concern, please raise them ;)

I just found that i forget about the SPARC architecture, but we don't have any test for this architecture. If you could craft on executable for testing purpose, it will help me implement this feature faster.

I'm not really familiar with SPARC and I don't have access to such arch, but I think we have some SPARC ELFs in tests. See

./elf/analysis/elf-sparc-execstack: ELF 32-bit MSB executable, SPARC32PLUS, V8+ Required, total store ordering, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.18, BuildID[sha1]=27daaa46422285919a67b3e25c3851313a2697db, stripped
./elf/elf-Linux-SparcV8-bash: ELF 32-bit MSB executable, SPARC32PLUS, V8+ Required, total store ordering, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.15, BuildID[sha1]=dd5aab02bdf7b8c649fb24bc0bd5544a79c7aec0, stripped
./elf/elf-solaris-sparc-ls: ELF 32-bit MSB executable, SPARC, version 1 (SYSV), dynamically linked, interpreter /usr/lib/ld.so.1, with debug_info, not stripped
./fuzzed/r2-SEGV-d48-c31-e3c.dms: ELF 64-bit LSB no file type, SPARC, (SYSV)

You talk about *.o files do you have an example? (a test)

Mmm... maybe I was wrong on this.

Anyway, please before adding/fixing anything else, try to remove all those whitespaces/unrelated changes. As mentioned above, you should be able to cherry-pick https://github.com/ret2libc/radare2/commit/53d56cc7f847344280bb6d07e9f3e8e4c1ed25b1 . Also, rebase on top of current master (or merge master into your branch).

ghost commented 4 years ago

@ret2libc Are you satisfied with the diff size? For the sparc implementation i will try to do that later this week.

ret2libc commented 4 years ago

@ret2libc Are you satisfied with the diff size? For the sparc implementation i will try to do that later this week.

Much better now, yes! Thought there are still some changes that don't belong here, probably some wrong merge or old rebase ( see .github files in the diff). Thanks!

lgtm-com[bot] commented 4 years ago

This pull request introduces 2 alerts when merging ca76611fb6ffd0ae3401d53937a402d0e2098056 into b4b3013197e641dc7be8f8f52da9143832a7fdf6 - view on LGTM.com

new alerts:

ghost commented 4 years ago

@ret2libc I added support for the sparc arch, but i have not enough knowledge (i never reversed sparc exec) to add a test, do you know someone that could verify the result and maybe create a new test?

radare commented 4 years ago

What's left to review/change in here? imho lgtm

ghost commented 4 years ago

What's left to review/change in here? imho lgtm

@radare a test for the sparc arch

ghost commented 4 years ago

Okay, thanks for your feedback.

For the position, i choose to not include this information inside the struct, because i didn't find any documentation which explain that the position inside the relocation table is the same that the position inside the got. So we need to recompute every time the position (the cost is not really expensive).

I understand that the majority of the code can be confusing so i will try to explain my choices.

get_import_addr_arm

In arm elf, the plt_addr is stored inside the got_entry, so i can read a random entry and get the based addr of the plt section. The position is compute with this formule: (relocation_rva - got_base_addr - nb_empty_entry * size_got_entry) / size_got_entry Indeed the entry number 1, 2, 3 are not relevant.

get_import_addr_mips

In mips the got addr is stored inside a special dynamic entry anmed DT_MIPS_PLTGOT (when there is some relocation) The position computation is like the arm one (expect that there is only 2 irrelevant entry).

get_import_addr_riscv

Riscv is like arm (get_got_entry).

get_import_sparc

We only support R_SPARC_JMP_SLOT. If the relocation is supported the entry inside the got return almost the exact position in the plt of the import_addr we only need some small adjustment (-size of the instruction to jump). It is cleaner than use some position.

get_import_ppc

Almost the same, pos is from the base addr of the plt (since the rva has a ref inside the plt).

get_import_x86

less modified, the code is self-explanatory.

Conclusion

As you can see i tried to be more logic. I understand that your are scared that the new code is broken. The main problem is not that we have a new code but that you don't trust the test suite. Maybe it is a small sign that the test suite is not exhaustive. I personally don't have all the competence to add tests but if you know someone that is willing to do, let me know.

I will try to reduce again the diff size.

ghost commented 4 years ago

There is to much dependencies between to many function, it is why the size a the diff is so big. I don't know how to make it smaller.

ghost commented 4 years ago

I believe i revert all the esthetic change, now all we have is algorithm modification. imho algorihm that should have been changed long time ago.

ghost commented 4 years ago

Of course I don't trust the test suite :) It's hard to cover everything. I think, for example, we don't have any MIPS bin with relocations.

I found one bin with relocation.

readelf --use-dynamic -r test/bins/elf/analysis/mipsbe-ubusd
ret2libc commented 4 years ago

Of course I don't trust the test suite :) It's hard to cover everything. I think, for example, we don't have any MIPS bin with relocations.

I found one bin with relocation.

readelf --use-dynamic -r test/bins/elf/analysis/mipsbe-ubusd

Nice catch! I did not find it because I was using rabin2 :) And rabin2, both on master and in your PR, does not find them :( Anyway, as that's not a regression I don't consider it blocking for this PR

lgtm-com[bot] commented 4 years ago

This pull request introduces 2 alerts when merging 06cd49bf05d2b87e1e73f03087dc4661480b15b5 into 1b56d63df213b988674778b5c7b0c5a8debdc2a1 - view on LGTM.com

new alerts:

ghost commented 4 years ago

* 2 for Comparison result is always the same

I don't see why lgtm isn't happy? Each comparison are valid.

ret2libc commented 4 years ago
* 2 for Comparison result is always the same

I don't see why lgtm isn't happy? Each comparison are valid.

Oh, tricky one :D

ut64 base = r_buf_read_ble32_at (bin->b, p_plt_addr, bin->endian);
if (base == UT64_MAX) {

r_buf_read_ble32_at returns a ut32 value, not a ut64 one. And in case there are errors, it returns UT32_MAX, not UT64_MAX. So LGTM is correctly complaining about this.

ut64 addr = BREADWORD (bin->b, p_sym_got_addr);
return (!addr || addr == UT64_MAX) ? UT64_MAX : addr;

When R_BIN_ELF64 is not defined, BREADWORD will use the r_buf_read_ble32_at, which, again, returns UT32_MAX in case of error and not UT64_MAX.

ghost commented 4 years ago

@ret2libc It is fixed.

ret2libc commented 4 years ago

I will review this later and merge if ok.