Coekjan opened 3 months ago
AHH, what can I say...
I moved my lookup-cache + regenerate-table code to the end of pass 0 (before pass 1), and it seems to work perfectly. So maybe at least now I can bypass the original passes 1-3.
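For context, the hook sits roughly here in the pass driver of `FillBlock64` (a paraphrased, simplified sketch, not box64's verbatim source; `cache_lookup_block` and `apply_cached_block` are hypothetical names for my cache API):

```c
// Simplified pass driver inside FillBlock64 (paraphrased sketch).
// Pass 0 decodes and sizes the x86 block; on a cache hit right after it,
// the native passes 1-3 can be skipped entirely.
native_pass0(&helper, addr, alternate, is32bits);
if (cache_lookup_block(addr, &cached)) {               // hypothetical cache call
    apply_cached_block(&helper, cached);               // reuse code, fix per-run data
} else {
    native_pass1(&helper, addr, alternate, is32bits);
    native_pass2(&helper, addr, alternate, is32bits);  // compute final sizes/offsets
    native_pass3(&helper, addr, alternate, is32bits);  // emit native code + table64
}
```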
This seems like a typical "relocation" problem?
Well, if you fixed your issue, that's good :)
I am not sure I have really solved the problem, as I do not fully understand the requirements for running `native_pass3` (I mean, which fields in the `helper`, or something else, should be properly set up).

Roughly placing the "lookup-cache + regenerate-table64" step between pass 0 and pass 1 does work currently, but I have only tested it with the `ls` program. (More engineering effort is needed to support general programs.)

Now I have to move the "lookup-cache + regenerate-table64" step to the end of pass 1, as filling the `table64` depends on some information provided by pass 1. This makes the `ls` program run well, but some complex programs (e.g. bash, python) still crash in various ways. I have no idea what is happening.
Hmm, does box64 generate position-dependent native code (e.g. an address in the immediate of an instruction) in a dynablock? If so, it would require more effort to relocate the code.
I'm not sure how you can regenerate the table64 offsets at pass 1, while the arm64 (or rv64/la64) offsets are known only in pass 3.
And yes, there will be some offsets in the native code that will need relocation. There is also the effect of ELF relocation that might need relocation in the native code, some jumptables that use offsets from a jumptable that is allocated at runtime and so have some per-run addresses, etc...
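To make the pass-3 dependency concrete: the table64 sits after the emitted native code, and entries are reached PC-relatively from each load, so the recorded delta depends on the final code layout, which only exists in pass 3. A simplified sketch of that offset computation (names are assumptions, not the exact box64 fields):

```c
#include <stdint.h>

// The table of 64-bit constants is appended after the generated code; each
// instruction that uses an entry records a PC-relative delta to it. Both the
// table start and the emit cursor are only final during pass 3.
static int table64_delta(uintptr_t tablestart, int slot, uintptr_t emit_cursor)
{
    uintptr_t entry = tablestart + (uintptr_t)slot * sizeof(uint64_t);
    return (int)(entry - emit_cursor);  // consumed by an LDR-literal style access
}
```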
> I'm not sure how you can regenerate the table64 offsets at pass 1, while the arm64 (or rv64/la64) offsets are known only in pass 3.

I did not regenerate the `table64` in pass 1; I regenerate it in a newly defined pass 4 (macros copied from pass 3, but writing the `table64` only).

> And yes, there will be some offsets in the native code that will need relocation. There is also the effect of ELF relocation that might need relocation in the native code, some jumptables that use offsets from a jumptable that is allocated at runtime and so have some per-run addresses, etc...
Thanks for your reply. I would appreciate it if you could address my questions:

- Where are the jumptables generated?
- Do we have a full list of which information in the dynablock has per-run values/addresses, and where it is generated?

EDIT: I have already learned that the `table64` contains some per-run data and should be relocated (or regenerated).
> Where are the jumptables generated?

In custommem.c, look at the JmpTable64 functions.

> Do we have a full list of which information in the dynablock has per-run values/addresses, and where it is generated?

Nope, I didn't plan to do disk-saved dynablocks before a few more versions, so there is no infrastructure for that yet.
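To illustrate the shape of what those functions manage (a hypothetical multi-level lookup; the level count, shift width, and names are assumptions for illustration, not the actual custommem.c definitions):

```c
#include <stdint.h>

#define JT_SHIFT 16
#define JT_MASK  ((1u << JT_SHIFT) - 1u)

// Four levels of 16 bits cover a 64-bit guest address space. Each leaf slot
// holds the native entry point for one x86_64 address, so the whole structure
// is populated with per-run pointers and cannot be reused across runs as-is.
static uintptr_t jt_lookup(uintptr_t ****root, uintptr_t x64addr)
{
    uintptr_t ***l1 = root[(x64addr >> 48) & JT_MASK];
    uintptr_t **l2  = l1[(x64addr >> 32) & JT_MASK];
    uintptr_t *l3   = l2[(x64addr >> 16) & JT_MASK];
    return l3[x64addr & JT_MASK];
}
```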
> Nope, I didn't plan to do disk-saved dynablocks before a few more versions, so there is no infrastructure for that yet.

Do we have the full list in mind of the per-run values/addresses in a dynablock?
> Do we have the full list in mind of the per-run values/addresses in a dynablock?

Not really. Again, the issue is that many per-run values come from the "relocation process": they can come from the elfloader for a Linux process, but also from Wine for an exe program...
The other per-run values come from the table64 (which can probably be disabled if needed) and the jumptable (so dynablock inter-linking, basically).
These two days, I have been suffering from a memory-corruption issue. I am not sure whether it comes from box64 or from my side.
In my design, my external cache system provides a dynamic library (a `.so` file) for box64, and box64 calls the functions in the provided library. My functions are written in Rust and dynamically allocate memory from their own separate heap.
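For clarity, the boundary between box64 and my library looks something like this (a hypothetical C-ABI header; the thread does not show the real interface, so every name here is invented for illustration):

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical surface of the external cache .so, as seen from box64.
// The Rust side implements these as #[no_mangle] extern "C" functions and
// allocates any returned buffers from its own heap.
int  extcache_lookup(uint64_t x64_addr, const void **block, size_t *size);
int  extcache_store(uint64_t x64_addr, const void *block, size_t size);
void extcache_release(const void *block);  // must free with the library's allocator
```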
When debugging, I saw that, on the Rust side, the addresses of heap objects were just behind box64's loaded memory:
```
(( omitted ))
34800000-35d79000 r--p 00000000 103:02 17315407   box64
35d79000-35e5f000 r--p 01578000 103:02 17315407   box64
35e5f000-35e63000 rw-p 0165e000 103:02 17315407   box64
35e63000-37d32000 rw-p 00000000 00:00 0           <------- rust on-heap objects here
100000000-100003000 r--p 00000000 103:02 39584943 the-main-elf
(( omitted ))
```
And I observed that box64 reserves some memory regions for its own usage, so I am now not sure whether the Rust heap objects are placed in the correct space...
> These two days, I have been suffering from a memory-corruption issue. I am not sure whether it comes from box64 or from my side.

UPDATE: I changed my Rust-side malloc backend to mimalloc, and now the allocated objects have higher addresses:
```
34800000-35d79000 r--p 00000000 103:09 17315407   box64
35d79000-35e5f000 r--p 01578000 103:09 17315407   box64
35e5f000-35e63000 rw-p 0165e000 103:09 17315407   box64
35e63000-37d11000 rw-p 00000000 00:00 0           <------- box_{malloc,realloc,free} here
100000000-100003000 r--p 00000000 103:09 39584943 main-elf
100003000-100006000 rw-p 00000000 00:00 0
57a94000000-57ad4000000 rw-p 00000000 00:00 0     <------- rust mimalloc here
```
So I can now assert that the Rust heap and the box64 heap do not overlap.
However, the memory-corruption issue still exists. As soon as my Rust dylib allocates some objects on its own heap, box64 misbehaves, resulting in python3.12 failures. Does anyone have any idea about this issue?
> However, the memory-corruption issue still exists. As soon as my Rust dylib allocates some objects on its own heap, box64 misbehaves, resulting in python3.12 failures. Does anyone have any idea about this issue?
UPDATE: I finally found that the issue comes from the libc function `realpath`: if my Rust code calls `fs::canonicalize` (which ultimately boils down to `realpath`), the memory corruption happens.
So something must be going wrong when `realpath` is called in my dynamic library...
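One classic failure mode consistent with this symptom (an assumption on my part, not a confirmed diagnosis): with a NULL second argument, glibc's `realpath` allocates the result with the C library's malloc, and in a process where two allocators coexist, or where malloc/free are interposed, releasing that buffer through the wrong allocator corrupts a heap. A minimal C illustration of the contract:

```c
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    // realpath(path, NULL) malloc()s the result; it must be released by the
    // free() that matches that malloc(). In a box64 process, where host and
    // guest allocation paths coexist, a mismatched pair corrupts a heap.
    char *resolved = realpath("/usr/bin", NULL);
    if (!resolved) {
        perror("realpath");
        return 1;
    }
    printf("%s\n", resolved);
    free(resolved);
    return 0;
}
```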
Background
I am implementing an external cache system (my research project, WIP), so that a dynablock generated for program A can be reused by other programs, or the next time program A starts. This idea could help bypass the complex `native_pass`es. Currently, a dynablock can be stored in the cache system correctly, and box64 can look it up in the external cache correctly. I found, however, that the fetched dynablock cannot be used directly, as it contains a lot of position-dependent information. For example (if I am wrong, please correct me):
- the `void *` word of the block: easy to regenerate
- the `next` pointer in the block, and the `jmpnext` address: easy to regenerate
- the `table64` in the block: hard to regenerate???

https://github.com/ptitSeb/box64/blob/27a8d19f31327c5c620c5037b4534b0cb200a2a9/src/dynarec/dynarec_native.c#L542-L551
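My reading of the linked code, boiled down (a simplified sketch; the struct below is a stand-in I defined for illustration, not box64's real `dynarec_native_t`):

```c
#include <stdint.h>

// Stand-in for the table64 bookkeeping state (field names follow the issue's
// description; the real structure has many more fields).
typedef struct {
    uint64_t *table64;     // 64-bit constants used by the block
    int       table64size; // entries filled so far
} t64_state_t;

// Constants are deduplicated into a per-block table that is later copied
// right after the emitted code. Because the entries are absolute per-run
// values (library addresses, etc.), a cached block's table64 is stale on reuse.
static int table64_slot(t64_state_t *dyn, uint64_t val)
{
    for (int i = 0; i < dyn->table64size; ++i)
        if (dyn->table64[i] == val)
            return i;                        // reuse an existing entry
    dyn->table64[dyn->table64size] = val;    // append (capacity checks omitted)
    return dyn->table64size++;
}
```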
So, I think the cache can be effectively reused if we have a (cheap) way to regenerate the `table64`. But how?
I tried to introduce a `native_pass4` to regenerate the `table64`. To define this new pass, I basically created a new pass header file (macros copied from pass 3, but writing the `table64` only) and adapted some conditional-compilation directives in the codebase, roughly along the lines sketched below.
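The new header looks roughly like this (a simplified sketch; the macro bodies are abbreviated stand-ins rather than the full header):

```c
// dynarec_arm64_pass4.h (sketch): selected when STEP == 4, next to the
// existing pass0-pass3 headers. Emission becomes a no-op that still advances
// the write cursor, so the PC-relative deltas computed for the table64 match
// pass 3 while only the table contents are (re)written.
#define INIT
#define FINI
#define MESSAGE(A, ...)
#define EMIT(A)  do { dyn->block += 4; dyn->native_size += 4; } while (0)
```

and the pass-selection directives are extended accordingly, e.g.:

```c
#if STEP == 3
#include "dynarec_arm64_pass3.h"
#elif STEP == 4
#include "dynarec_arm64_pass4.h"  /* new: table64-only pass */
#endif
```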
Before calling `native_pass4`, I set up the `helper` (just like before calling `native_pass3` in the original code path). However, this did not seem to work well: the `table64` did not seem to be generated correctly. I think something may be wrong with the `helper`, because I did not perform all the actions on the `helper` that the original code path performs before `native_pass3`.

So, my question is: is it possible to regenerate the `table64`? :thinking: I would appreciate it if you could help me understand how `FillBlock64` generates the `table64`, or give some hints on regenerating the `table64` for an existing dynablock.