Qcloud1223 / COMP461905

Course project for Operating Systems at XJTU: A basic x86-64 dynamic linker.

[test1 problem] Where to fill the symbol's address #9

Closed SoullAngle closed 2 years ago

SoullAngle commented 2 years ago

I have read the instructions for test 1, and I understand that there are 4 steps to finish:

  1. find relocation table;
  2. process each relocation entry;
  3. find the address of referred symbol (in test 1 the symbol is "puts");
  4. add the addend to that address and fill in the result (in test 1 the addend is 0).
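
Steps 2 to 4 can be sketched in C roughly as follows. This is only an illustration, not the course's reference solution: `apply_jump_slot` is a hypothetical helper name, `base` is assumed to be the address where the library was mapped, and `sym_addr` is assumed to be the already-resolved address of the symbol (for example, of puts).

```c
#include <elf.h>
#include <stdint.h>

/* Hypothetical helper: apply one R_X86_64_JUMP_SLOT entry.
 * base     - address where the library was mapped (assumed known)
 * sym_addr - resolved run-time address of the referenced symbol
 * Returns 0 on success, -1 for a relocation type this sketch ignores. */
int apply_jump_slot(uint8_t *base, const Elf64_Rela *rela, uint64_t sym_addr)
{
    if (ELF64_R_TYPE(rela->r_info) != R_X86_64_JUMP_SLOT)
        return -1;

    /* r_offset is relative to the load base, so the slot lives in the
     * process's virtual memory, not at a position in the file on disk. */
    uint64_t *slot = (uint64_t *)(base + rela->r_offset);

    /* Step 4 as listed above: symbol address plus addend. */
    *slot = sym_addr + (uint64_t)rela->r_addend;
    return 0;
}
```

The write goes through a pointer into the mapped image, so nothing here touches lib1.so itself as long as the mapping is private.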

We can see that in the .rela.plt section, r_offset is 0x201018. After reading the instructions, I wrote a program that successfully finds the address of "puts" and can even print "puts". My problem: where should I fill the address in? We already know that r_offset is 0x201018; should I write the address into the original file (lib1.so) or into virtual memory? I used gdb to print my result, as shown in the figure below. By the way, I currently write the address into virtual memory.

[Screenshots: gdb output]

Qcloud1223 commented 2 years ago

should I write the address in original file(lib1.so) or in the virtual memory

I think you might be misunderstanding mmap.

If you want to write to a file, you can use the write(int fd, const void *buf, size_t count) syscall, which writes a buffer to a file descriptor.

mmap allows you to map part of a file into a process's virtual memory pages. Whether modifications you make will influence the original file is decided by flags of mmap:

The flags argument determines whether updates to the mapping are visible to other processes mapping the same region, and whether updates are carried through to the underlying file. This behavior is determined by including exactly one of the following values in flags:

We definitely do not want our modifications to VM pages carried through to the underlying files, because the same files may be used by other processes or loaded at a different base address. That's why you need to take care of the flags when you call mmap.

So when you mmap a file, there is no 'address in the original file'. A page in virtual memory can carry the content of a file, and if MAP_SHARED is set, every write to that page will also modify the file. You need to distinguish between 'unintended' file writes through mmap and 'intended' writes through write.
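
The difference can be checked with a small experiment. The sketch below is an illustration under assumptions: the function name `private_write_leaves_file_intact` and the temporary file path are made up for the demo, and error handling is minimal. It writes four bytes to a temporary file, maps the file with MAP_PRIVATE, modifies the mapping, and then reads the file back with pread to see whether the bytes on disk changed.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if a write through a MAP_PRIVATE mapping left the file
 * unchanged, 0 if the file changed, -1 on a setup error. */
int private_write_leaves_file_intact(void)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";   /* demo-only temporary file */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                            /* removed once fd is closed */

    const char original[4] = {'A', 'A', 'A', 'A'};
    if (write(fd, original, sizeof original) != (ssize_t)sizeof original) {
        close(fd);
        return -1;
    }

    char *page = mmap(NULL, sizeof original, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE, fd, 0);
    if (page == MAP_FAILED) {
        close(fd);
        return -1;
    }

    page[0] = 'B';   /* copy-on-write: only this process's page changes */

    char on_disk[4];
    ssize_t n = pread(fd, on_disk, sizeof on_disk, 0);
    int result = -1;
    if (n == (ssize_t)sizeof on_disk)
        result = (page[0] == 'B' &&
                  memcmp(on_disk, original, sizeof original) == 0);

    munmap(page, sizeof original);
    close(fd);
    return result;
}
```

Replacing MAP_PRIVATE with MAP_SHARED in the same experiment is the case where the write is carried through to the file, which is what a loader wants to avoid.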

I use gdb to printf my result as figure below

Also, an intuitive debugging tip: you should feel alarmed when you see an unaligned address, that is, an address that does not end in 0, 4, 8, or c. I notice that your address of puts is an absolute one, but it is not aligned to anything. The correct address to write is probably not that one.

SoullAngle commented 2 years ago

The correct address to write is probably not that one.

Well, maybe I misunderstood the instructions: I should fill the blank with the address of the "puts" function, not the address of the "puts" string, right?

Now I want to provide more information about my bug. Test 1 uses "foo". When test.c calls the function "foo", it jumps to the .plt section, and .plt then jumps to the address stored in the .got.plt section, which holds the real address of the function. Because I always receive SIGSEGV, I used gdb to trace the bug step by step. I found that when test.c calls "foo", execution moves to the .plt section and then gets SIGSEGV, but I never modified .plt. The blank I should fill is in .got.plt, right?

I want to know why I receive SIGSEGV in the .plt section, or to get some explanation of that section. I guess it stores places to jump to, but I can't understand what the section means from figure 2 below. The first figure shows where I hit the bug in gdb, and the second shows the contents of the .plt section. By the way, the base address is 0x7ffff7ff4000, which means 0x7ffff7ff4570 is indeed in the .plt section.

Qcloud1223 commented 2 years ago

I should fill the "puts" function's address in blank, but not "puts" string's address in blank

Of course. You may want to review my slides or the textbook on what to fill in the blank. The blank is the GOT entry for puts.

I want to know why I will receive SIGSEGV in section .plt

First, you could use objdump -d library_file to disassemble sections as code (if applicable). I will show my disassembled .plt section below:

[Screenshot: disassembled .plt section]

At 0x7ffff7ff4570, you jump to the second entry of .plt section, which is correct. If you look at the address at 0x1030 in the image above, you will find that it is going to jump to 0x4018, which is the GOT entry address of puts.

I have two hypotheses about your bug. First, the address you wrote to might be incorrect. As I saw in your image, the GOT entry of puts has an offset of 0x201018. Given that your base address is 0x7ffff7ff4000, you should write to 0x7ffff81f5018.
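
As a debugging aid, the target address can be computed and sanity-checked in code before anything is written. This is a minimal sketch with hypothetical helper names (`reloc_target`, `is_aligned`); it assumes the load base and r_offset are already known and that a GOT slot on x86-64 is an 8-byte pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Run-time address of a relocation target: load base plus r_offset. */
uint64_t reloc_target(uint64_t base, uint64_t r_offset)
{
    return base + r_offset;
}

/* Nonzero if addr is a multiple of align (align must be a power of two). */
int is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}
```

With the values from this thread, reloc_target(0x7ffff7ff4000, 0x201018) is 0x7ffff81f5018, and is_aligned(0x7ffff81f5018, 8) holds, as expected for a GOT slot.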

Second, the address you filled in might be incorrect. I'm sorry for introducing dlopen and dlsym in test 1, but they are meant to help you bypass the complexity of relocation. You can try printing the address returned by dlsym to see whether it is valid (dlsym returns NULL if no symbol is found).

SoullAngle commented 2 years ago

Much appreciated! I used objdump -d to read the disassembled sections and found my real problem: the way I mmap the segments in test 0 was wrong. The readelf -l result is shown in figure 1 below. The alignment is 0x200000, so the second LOAD segment should be mapped at _baseaddr + align instead of directly after the page of the first segment. Loading the segments correctly is very important, because a mistake there affects the remaining tests. After I corrected the mmap in test 0, test 1 passed as well, magically! Others can use this as a reference.
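
For reference, a common way to place a LOAD segment is to take the base address plus the segment's p_vaddr rounded down to a page boundary, and to round p_offset down the same way for the mmap file offset. The sketch below only illustrates that arithmetic; the helper names are hypothetical, a 4 KiB page size is assumed, and the program header values in the usage note are invented rather than taken from lib1.so.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>

/* Page-aligned run-time address at which a PT_LOAD segment should start.
 * page_size must be a power of two (4096 is assumed in this thread). */
uint64_t segment_map_addr(uint64_t base, const Elf64_Phdr *ph,
                          uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return base + (ph->p_vaddr & ~(page_size - 1));
}

/* File offset to pass to mmap for that segment, rounded down the same way. */
uint64_t segment_file_offset(const Elf64_Phdr *ph, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return ph->p_offset & ~(page_size - 1);
}
```

With an invented segment whose p_vaddr is 0x200e00 and p_offset is 0xe00, and the base address 0x7ffff7ff4000 from this thread, segment_map_addr gives 0x7ffff81f4000, that is, _baseaddr + 0x200000, and segment_file_offset gives 0.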

SoullAngle commented 2 years ago

By the way, I found several ways to pass test 0, but many of them are not appropriate: although they pass test 0, they break test 1 and the other remaining tests...