Open MIvanchev opened 1 month ago
The process is complicated (more complicated than you think), it's time-consuming and there's a big chance of failure. If you don't have a really good reason to do it, then you probably shouldn't do it.
If that didn't discourage you, do you want to recompile a DOS game or a Windows game?
It's a DOS game and I'm aware of the complexity :) SR seems to automate a lot of the tasks involved so I'm just generally interested in the mechanics and the usage.
I do the recompilation in several steps.
Step 1:
Compile SR with OUTPUT_TYPE set to OUT_ORIG (in SR_defs.h).
Apply SR to the DOS executable (no SCI files are needed): SR source.exe destination.asm
Compile the generated assembly with nasm (targeting DOS) and link it.
If everything works, that should generate a working DOS executable.
No recompilation is done, but it tests several things.
Step 2:
Compile SR with OUTPUT_TYPE set to OUT_DOS (in SR_defs.h).
These SCI files are (can be) used in this step (empty file is the same as non-existing file):
Apply SR to the DOS executable (with SCI files): SR source.exe destination.asm
Compile the generated assembly with nasm (targeting DOS) and link it.
If everything works, that should generate a working DOS executable.
What are the SCI files used for?
fixup_interpret_as_code.sci:
If an address (in the program) is in this SCI file, then it's interpreted as code - it will be disassembled, etc.
You can generate a list of candidate addresses using: SR --list_invalid_code_fixups=fixup_interpret_as_code.sci source.exe destination.asm
The list is unsorted and can contain duplicates.
Sort it, remove duplicates and then remove addresses which are not code. That will give you the final SCI file (for this step).
fixup_do_not_interpret_as_code.sci:
If an address (in the program) is in this SCI file, then it's interpreted as data - it won't be disassembled, etc.
You can generate a list of candidate addresses using: SR --list_data_to_code_fixups=fixup_do_not_interpret_as_code.sci source.exe destination.asm
The list is unsorted and can contain duplicates.
Sort it, remove duplicates and then remove addresses which are code. That will give you the final SCI file (for this step).
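The sort-and-deduplicate part of cleaning up either candidate list can be scripted. Here is a minimal sketch in Python, assuming (this is not stated above) that an SCI file holds one entry per line. Entries are compared as plain text, so the order is lexicographic rather than numeric, and the function names are made up for illustration. Deciding which addresses really are code still has to be done by hand.

```python
def sort_unique_lines(text):
    """Return the non-empty lines of text, sorted, with duplicates removed."""
    # Assumption: one SCI entry per line; surrounding whitespace is not significant.
    entries = {line.strip() for line in text.splitlines() if line.strip()}
    if not entries:
        return ""
    return "\n".join(sorted(entries)) + "\n"


def clean_sci_file(path):
    """Rewrite the file at path in place, sorted and without duplicates."""
    with open(path) as f:
        cleaned = sort_unique_lines(f.read())
    with open(path, "w") as f:
        f.write(cleaned)
```

For example, clean_sci_file("fixup_interpret_as_code.sci") would tidy the list produced by the first command before you start removing the wrong addresses manually.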
code16_areas.sci: Some 32-bit DOS executables contain blocks of 16-bit code (e.g. a real mode interrupt handler). You can define these blocks in this SCI file, so the recompiler doesn't try to disassemble them (as 32-bit code).
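noret_procedures.sci: This SCI file contains addresses of functions that don't return (to the calling location). An example:
func1:
    call func2      ; func2 never returns to the next line
    some data
func2:
    pop edi         ; edi = return address, i.e. a pointer to the data after the call
    retn            ; returns to the caller of func1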
In this example the address of func2 should be in this SCI file, so the recompiler doesn't interpret the data after call func2 as code.
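To see why the recompiler needs this hint, here is a toy model in Python. It is not SR's actual algorithm, just a sketch under simplifying assumptions: the program is a list of (label, text) pairs, every label starts code, and decoding stops after a call to a procedure listed as non-returning. The function name classify is hypothetical.

```python
def classify(program, noret):
    """Tag each line as 'code' or 'data' in a toy linear sweep.

    program: list of (label_or_None, text) pairs in address order.
    noret:   set of procedure names that never return to their caller.
    """
    result = []
    in_code = True
    for label, text in program:
        if label is not None:
            # Simplification: every label is treated as a code entry point.
            in_code = True
        result.append((text, "code" if in_code else "data"))
        # After a call to a no-return procedure, what follows is not code.
        if in_code and text.startswith("call ") and text.split()[1] in noret:
            in_code = False
    return result


program = [
    ("func1", "call func2"),
    (None, "some data"),
    ("func2", "pop edi"),
    (None, "retn"),
]

with_hint = classify(program, {"func2"})
without_hint = classify(program, set())
```

With the hint, "some data" is tagged as data; without it, the sweep runs straight into the data and tags it as code, which is the mistake the SCI file prevents.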
displaced_labels.sci: This is best described with an example:
    mov bx, word [eax + addr1]
addr1:
    pop ebx
    retn
addr2:
    some data
The first instruction was originally mov bx, word [eax + addr2 - 2]. It's reading from some data (the value of eax is 2 or higher), but the address in the instruction points to code and not to data.
This works in the DOS executable because the instructions pop ebx and retn are both 1 byte long.
But when these instructions are recompiled, their length is different and the first instruction would be reading the wrong data.
In this example the address addr1 should be in this SCI file, displaced by 2, so the recompiler interprets the address as addr2 - 2.
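The arithmetic behind this can be checked with a few lines of Python. This is only a worked illustration: the start addresses and the recompiled code size of 9 bytes are made-up numbers, and the helper names are hypothetical.

```python
def layout(start, code_size):
    """Return (addr1, addr2) when the code between the two labels takes code_size bytes."""
    addr1 = start
    addr2 = addr1 + code_size
    return addr1, addr2


def offset_into_data(base, eax, addr2):
    """Offset of the accessed location relative to the data at addr2 (0 = first byte)."""
    return base + eax - addr2


eax = 2

# Original executable: pop ebx and retn are 1 byte each, so addr1 == addr2 - 2.
orig_addr1, orig_addr2 = layout(0x1000, 2)
original = offset_into_data(orig_addr1, eax, orig_addr2)

# Recompiled code: assume (hypothetically) the same two instructions now take 9 bytes.
new_addr1, new_addr2 = layout(0x2000, 9)
via_addr1 = offset_into_data(new_addr1, eax, new_addr2)      # label left as addr1
via_addr2 = offset_into_data(new_addr2 - 2, eax, new_addr2)  # label displaced to addr2 - 2

print(original, via_addr1, via_addr2)  # 0 -7 0
```

Keeping the reference as addr1 lands 7 bytes before the data, inside the recompiled code, while expressing it as addr2 - 2 still hits the first byte of the data.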
Other problems: Some executables contain crazy code like jumping into the middle of another instruction. This is not supported by the recompiler - I handle it by patching the DOS executable.
There are more steps following. I'll describe them when you get there with your recompilation.
Thank you, this is quite detailed, it's enough to get me started for sure (I already have some experience with static recompilation). My main question is whether the end result (16->32 EXE) handles stuff like keyboard interrupts and VGA buffers using a library like SDL.
The recompiler only supports 32-bit DOS executables, not 16-bit executables. The recompiler doesn't handle inputs (mouse/keyboard) or outputs (video/audio). You will have to handle that yourself - that's part of the later steps.
Hey, can someone briefly tell me what is required to get SR going? I find very little info and would appreciate some help. I suppose the main problem to solve is which addresses in the code segment are code and data and the bunch of SCI files have to do with that and naming symbols in the input/output EXE. Maybe someone can put together a rough guide for me to follow?