Open mkst opened 1 year ago
Hi!
The opcode you found is from a 386 instruction. Implementing the 386 instructions would be a significant effort, yes; it would certainly be easier to take an existing 386 CPU emulator and put it in place of the existing 286 one.
The 386 CPU added all the 32-bit instructions, 32-bit protected mode, paging, new exceptions, etc.
My original intention with EMU2 was to run old 16-bit DOS programs; those work on an 8088 or 8086 CPU and don't need newer opcodes.
Have Fun!
To explain further: 66h is not an opcode but an operand-size prefix. It changes the operand size of the next instruction from 16 bits to 32 bits, so to implement it you would need to implement a 32-bit version of every existing 16-bit instruction.
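To make the prefix point concrete, here is a minimal sketch in Python (not emu2 code; `decode_xor_reg` and the register tables are names made up for illustration). It handles only the register-to-register form of opcode 33h and shows the same two bytes decoding differently once 66h precedes them in 16-bit code.

```python
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
REG32 = ["E" + name for name in REG16]


def decode_xor_reg(code: bytes) -> str:
    """Decode 'XOR reg,reg' (opcode 33h, ModRM mod=11) in a 16-bit code segment."""
    pos = 0
    regs = REG16
    if code[pos] == 0x66:          # operand-size prefix: 16-bit -> 32-bit operands
        regs = REG32
        pos += 1
    if code[pos] != 0x33:
        raise ValueError("only opcode 33h is handled in this sketch")
    modrm = code[pos + 1]
    if modrm >> 6 != 0b11:
        raise ValueError("only register operands are handled in this sketch")
    dst = regs[(modrm >> 3) & 7]   # for opcode 33h the reg field is the destination
    src = regs[modrm & 7]
    return f"XOR {dst},{src}"


print(decode_xor_reg(bytes.fromhex("33ED")))    # XOR BP,BP
print(decode_xor_reg(bytes.fromhex("6633ED")))  # XOR EBP,EBP
```

A real decoder has to apply the same switch to every instruction that takes a 16-bit operand, which is where the effort goes.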
Hello @dmsc,
For context, it seems there is a version of this CC1PSX.EXE at https://archive.org/details/psyq-sdk .
I am a bit surprised that the program is executing a 32-bit instruction, apparently (?) without checking beforehand that it is running on a 32-bit-capable platform (alternatively, there might have been a check that somehow went wrong).
Thank you!
There are a few versions of the compiler. We are looking at versions 3.5 and 3.6, which only appear to exist as 16bit binaries.
For a little more context, this is part of a matching decompilation project, where specific compiler versions are required to produce the correct result.
Hello @mkst,
What are the last few (10 or so) instructions that are run before the unimplemented opcode 66? If you set an EMU2_DEBUG=cpu environment variable, then emu2 should save the sequence of (emulated) instructions into a file.
It is possible that CC1PSX.EXE actually requires 32-bit capability to run. I believe the archive.org version uses a 32-bit DOS extender and switches to 32-bit mode, even though it starts up in 16-bit mode.
Thank you!
Here's the last 20 commands (I've also attached the whole log file should that be of interest). It's all Greek to me!
$ tail -20 /tmp/CC1PSX.EXE-cpu.0.log
AX=0000 BX=0049 CX=0AFC DX=01E0 SP=FEC6 BP=FFD6 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=3BB2 NV UP EI PL ZR NA PE NC 0097:3BB2 7406 JZ 3BBA
AX=0000 BX=0049 CX=0AFC DX=01E0 SP=FEC6 BP=FFD6 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=3BBA NV UP EI PL ZR NA PE NC 0097:3BBA 837EF600 CMP WORD PTR [BP-0A],00
AX=0000 BX=0049 CX=0AFC DX=01E0 SP=FEC6 BP=FFD6 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=3BBE NV UP EI PL ZR NA PE NC 0097:3BBE 752F JNZ 3BEF
AX=0000 BX=0049 CX=0AFC DX=01E0 SP=FEC6 BP=FFD6 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=3BC0 NV UP EI PL ZR NA PE NC 0097:3BC0 8D46EE LEA AX,[BP-12]
AX=FFC4 BX=0049 CX=0AFC DX=01E0 SP=FEC6 BP=FFD6 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=3BC3 NV UP EI PL ZR NA PE NC 0097:3BC3 50 PUSH AX
AX=FFC4 BX=0049 CX=0AFC DX=01E0 SP=FEC4 BP=FFD6 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=3BC4 NV UP EI PL ZR NA PE NC 0097:3BC4 E86F13 CALL 4F36
AX=FFC4 BX=0049 CX=0AFC DX=01E0 SP=FEC2 BP=FFD6 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=4F36 NV UP EI PL ZR NA PE NC 0097:4F36 55 PUSH BP
AX=FFC4 BX=0049 CX=0AFC DX=01E0 SP=FEC0 BP=FFD6 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=4F37 NV UP EI PL ZR NA PE NC 0097:4F37 8BEC MOV BP,SP
AX=FFC4 BX=0049 CX=0AFC DX=01E0 SP=FEC0 BP=FEC0 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=4F39 NV UP EI PL ZR NA PE NC 0097:4F39 56 PUSH SI
AX=FFC4 BX=0049 CX=0AFC DX=01E0 SP=FEBE BP=FEC0 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=4F3A NV UP EI PL ZR NA PE NC 0097:4F3A 57 PUSH DI
AX=FFC4 BX=0049 CX=0AFC DX=01E0 SP=FEBC BP=FEC0 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=4F3B NV UP EI PL ZR NA PE NC 0097:4F3B 8C0EF00E MOV [0EF0],CS
AX=FFC4 BX=0049 CX=0AFC DX=01E0 SP=FEBC BP=FEC0 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=4F3F NV UP EI PL ZR NA PE NC 0097:4F3F 8C1EE80E MOV [0EE8],DS
AX=FFC4 BX=0049 CX=0AFC DX=01E0 SP=FEBC BP=FEC0 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=4F43 NV UP EI PL ZR NA PE NC 0097:4F43 8C16F40E MOV [0EF4],SS
AX=FFC4 BX=0049 CX=0AFC DX=01E0 SP=FEBC BP=FEC0 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=4F47 NV UP EI PL ZR NA PE NC 0097:4F47 B88716 MOV AX,1687
AX=1687 BX=0049 CX=0AFC DX=01E0 SP=FEBC BP=FEC0 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=4F4A NV UP EI PL ZR NA PE NC 0097:4F4A CD2F INT 2F
AX=1687 BX=0049 CX=0AFC DX=01E0 SP=FEB6 BP=FEC0 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0000 IP=002F NV UP DI PL ZR NA PE NC 0000:002F ?? IRET (EMU 2F)
AX=1687 BX=0049 CX=0AFC DX=01E0 SP=FEBC BP=FEC0 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=4F4C NV UP EI PL ZR NA PE NC 0097:4F4C 23C0 AND AX,AX
AX=1687 BX=0049 CX=0AFC DX=01E0 SP=FEBC BP=FEC0 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=4F4E NV UP EI PL NZ NA PE NC 0097:4F4E 7561 JNZ 4FB1
AX=1687 BX=0049 CX=0AFC DX=01E0 SP=FEBC BP=FEC0 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=4FB1 NV UP EI PL NZ NA PE NC 0097:4FB1 33C0 XOR AX,AX
AX=0000 BX=0049 CX=0AFC DX=01E0 SP=FEBC BP=FEC0 SI=0000 DI=A917 DS=1079 ES=0040 SS=1079 CS=0097 IP=4FB3 NV UP EI PL ZR NA PE NC 0097:4FB3 66 DB 66
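For anyone else reading such a trace: each line is the register state, then the flags, then CS:IP, the instruction bytes and the disassembly. Below is a small Python sketch that splits a line into those parts. The format is inferred from the lines above, not from emu2's source, and `parse_trace_line` is a hypothetical helper.

```python
import re

REG_RE = re.compile(r"\b([A-Z]{2})=([0-9A-F]{4})\b")
INSN_RE = re.compile(r"([0-9A-F]{4}):([0-9A-F]{4}) (\S+) (.+)$")


def parse_trace_line(line: str) -> dict:
    """Split one trace line into registers, CS:IP, raw bytes and disassembly."""
    regs = {name: int(value, 16) for name, value in REG_RE.findall(line)}
    match = INSN_RE.search(line)
    if match is None:
        raise ValueError("no CS:IP field found in line")
    seg, off, raw, text = match.groups()
    return {
        "regs": regs,
        "cs": int(seg, 16),
        "ip": int(off, 16),
        "bytes": raw,
        "asm": text.strip(),
        # In the log above, a byte the emulator could not decode shows up as "DB xx".
        "undecoded": text.startswith("DB "),
    }


last = ("AX=0000 BX=0049 CX=0AFC DX=01E0 SP=FEBC BP=FEC0 SI=0000 DI=A917 "
        "DS=1079 ES=0040 SS=1079 CS=0097 IP=4FB3 NV UP EI PL ZR NA PE NC "
        "0097:4FB3 66 DB 66")
info = parse_trace_line(last)
print(hex(info["ip"]), info["bytes"], info["undecoded"])
```

Filtering a whole log for `undecoded` entries is a quick way to find where the emulator stopped.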
Hi!
From that log, the instruction at that address is "XOR EBP, EBP", this is the disassembly of the function end:
0097:211F 33C0 XOR AX,AX
0097:2121 6633ED XOR EBP,EBP <= 80386 INSTRUCTION
0097:2124 5F POP DI
0097:2125 5E POP SI
0097:2126 5D POP BP
0097:2127 C3 RET
So the instruction has no side effects (it is followed by a POP BP); its only effect is to crash on any processor without 32-bit support, so this is intentional.
Note that just above there is an INT 2Fh call with AX=1687h, which checks whether a DPMI host (DOS extender) is loaded. If the function reports that none is present, or that 32-bit programs are not supported, the code also jumps to the faulting instruction:
0097:20B5 B88716 MOV AX,1687
0097:20B8 CD2F INT 2F
0000:002F ?? IRET (EMU 2F)
0097:20BA 23C0 AND AX,AX
0097:20BC 7561 JNZ 211F
0097:20BE 90 NOP
0097:20BF 90 NOP
0097:20C0 F7C30100 TEST BX,0001
0097:20C4 7459 JZ 211F
0097:211F 33C0 XOR AX,AX
0097:2121 6633ED XOR EBP,EBP
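Read as pseudocode, the check above does roughly the following. This is a Python sketch of one reading of the disassembly; `dpmi_check` and its argument names are invented, and the register meanings are those of the DPMI installation check (INT 2Fh, AX=1687h), where AX=0 on return means a host is present and bit 0 of BX means 32-bit programs are supported.

```python
def dpmi_check(ax_after_int2f: int, bx_after_int2f: int) -> str:
    """Mirror the two conditional jumps that follow INT 2Fh in the listing."""
    if ax_after_int2f != 0:               # AND AX,AX / JNZ 211F: no host answered
        return "probe_386"                # lands on XOR AX,AX / XOR EBP,EBP
    if (bx_after_int2f & 0x0001) == 0:    # TEST BX,0001 / JZ 211F: no 32-bit support
        return "probe_386"
    return "continue"                     # falls through to code not shown here


# Register values taken from the trace: the INT 2Fh handler left AX=1687, BX=0049.
print(dpmi_check(0x1687, 0x0049))
```

With the values from the trace, AX comes back unchanged, so the first branch is taken and execution reaches the 66h prefix.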
And finally, if the offending instruction is replaced with a NOP, this is the result:
~/psyq/psyq$ emu2 CC1PSX.EXE
CPU must be a 386 to run this program.
~/psyq/psyq$
Have Fun!
Does this suggest that the program could be patched to be treated as a 32bit app from the get go (and thus runnable via wine)? Or is this a fundamentally 16bit app with 32bit extension (and therefore 386 emulation is the only way to go)?
Thanks for your time & responses!
You probably need a DOS extender and an 80386 emulator. I don't know if it uses a DOS extender to work in protected mode or simply to access extended memory. Try running it in DOSBox with DPMI disabled.
Hello @dmsc,
I tried the archive.org version of cc1psx.exe, and it did work under DOSBox (though this was not initially obvious: it seemed to hang, but really it was just waiting for input from the keyboard).
Thank you!
I should have been clearer: we can run the CC1PSX compiler via dosemu/dosbox, BUT we are after a lighter solution. We are finding that the process takes about 5 seconds to run a compilation command (still trying to determine what exactly is going wrong, as it only takes 500ms when running it by hand) and I wondered if emu2 could be a lightweight alternative.
@mkst : what exactly do you mean by running cc1psx.exe "by hand"?
(I suppose if you run the program under different conditions, then yes, you are going to get some extra overhead in different places. So perhaps what you simply need to do is to figure out where the extra overhead is coming from when you start up cc1psx.exe via DOSEmu or DOSBox. Unfortunately it does not seem that emu2 can easily support running this program in the short term...)
Thank you!
Hah I've been trying to keep the details light to avoid derailing this Issue... but here goes.
We have a PR to our project (https://github.com/decompme/decomp.me/pull/651) that adds dosemu2 in order to support these old compilers that won't run under WINE.
When testing the PR locally I found that the execution of dosemu2 is taking ~5 seconds. strace throws out lots of output, but nothing is hanging; it's just doing a lot of work. When I call the same command from a simple Python script, execution only takes 500ms. So I need to do some further investigation to determine why a subprocess.run call in our app takes 5 seconds, but a subprocess.run in a standalone script takes 500ms (the same time as running it by hand, from the shell).
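The comparison I have in mind is something like the sketch below: the same small timing harness run once from the standalone script and once from inside the app, so that only the surrounding conditions differ. `time_command` is a made-up helper and the command shown is a placeholder, not our real dosemu2 invocation; the extra keyword arguments (env, cwd, stdin) are just guesses at what might differ between the two contexts.

```python
import subprocess
import sys
import time


def time_command(argv, repeats=3, **run_kwargs):
    """Run argv several times and return the wall-clock duration of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(argv, check=False, **run_kwargs)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Placeholder command; substitute the real emulator command line here.
    command = [sys.executable, "-c", "pass"]
    print(time_command(command))
    # Candidate differences to try one at a time, e.g. a closed stdin:
    print(time_command(command, stdin=subprocess.DEVNULL))
```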
As I was getting nowhere with figuring out the slowdown, I went on a hunt for alternatives to dosemu2, of which emu2 looked like a potential candidate - however as it only supports 16bit instructions (and this CC1PSX requires 32bit support but presents itself as a 16bit exe) it turns out it's not going to be a drop-in replacement.
As another tangent, we have our own lightweight alternative to WINE (https://github.com/decompals/wibo) which is only for cli applications (it has some magic to kick off the exe and then intercepts all DLL calls with our own replacements) - do you know if something similar would be possible for dos? i.e. intercepting systemcalls rather than trying to emulate an x86 processor?
Hi!
As another tangent, we have our own lightweight alternative to WINE (https://github.com/decompals/wibo) which is only for cli applications (it has some magic to kick off the exe and then intercepts all DLL calls with our own replacements) - do you know if something similar would be possible for dos? i.e. intercepting systemcalls rather than trying to emulate an x86 processor?
You will need to emulate at least an 80386, as DOS applications rely on x86 16-bit support and real mode, and that is very difficult to manage on current x86 processors. Also, by running natively, you can't run your program on other CPU architectures.
For example, cc1psx.exe accesses I/O ports 21 and A1, which will need port emulation. And many DOS programs write directly to the screen; emu2 has a text-screen emulator so it can keep DOS I/O, BIOS I/O and direct screen access synchronized.
I refrained from adding 80386 support to emu2 because it opens a can of worms: many DOS programs, on detecting a 386, try to start in protected mode, or even in unreal mode. I have a branch that tries to add 80286 protected mode support, and I could not make it work with many programs, as 286 protected mode is not that well documented.
Hello @dmsc,
this is the disassembly of the function
Nice analysis!
it only crashes on any processor without 32 bit support if the offending instruction is replaced with a NOP, this is the result
Out of curiosity, on real hardware there's no crash; instead an 80186 or 80286 CPU will generate an invalid opcode exception (INT 6), rather than ignore the 66h operand-size prefix like the 8086 does, correct? Does MSDOS (or the BIOS) just have a null (IRET only) handler for this vector, causing the program to resume execution just past the 66h byte for what will then be a XOR BP,BP instruction, after which the "CPU must be a 386 to run this program" message is displayed?
For example, the cc1psx.exe access I/O ports 21 and A1
Wow, so an MSDOS .exe talks to the PC programmable interrupt controller... Do you find this and other hardware port I/O is somewhat common in programs that attempt to manage XMS or APIs that came later in DOS?
Thank you!
Hi!
Nice analysis!
it only crashes on any processor without 32 bit support if the offending instruction is replaced with a NOP, this is the result
Out of curiosity, on real hardware there's no crash; instead an 80186 or 80286 CPU will generate an invalid opcode exception (INT 6), rather than ignore the 66h operand-size prefix like the 8086 does, correct?
Yes. emu2 does generate the INT 6 (actually a TRAP on an 80286), and the default handler terminates the application. On an 8086, the opcode 66h is an alias for 76h ("JBE"), so it consumes the next byte as a jump displacement and generally ends up executing bogus code.
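As a rough illustration of that aliasing, here is a Python sketch (not emu2 code; `decode_8086_short_jump` is a hypothetical name). It assumes the commonly described 8086/8088 behaviour in which opcodes 60h-6Fh decode like the short conditional jumps 70h-7Fh, and applies it to the bytes 66 33 at offset 2121 from the listing.

```python
JCC = {
    0x70: "JO", 0x71: "JNO", 0x72: "JB", 0x73: "JAE",
    0x74: "JZ", 0x75: "JNZ", 0x76: "JBE", 0x77: "JA",
    0x78: "JS", 0x79: "JNS", 0x7A: "JP", 0x7B: "JNP",
    0x7C: "JL", 0x7D: "JGE", 0x7E: "JLE", 0x7F: "JG",
}


def decode_8086_short_jump(code: bytes, ip: int) -> str:
    """Decode a 2-byte short conditional jump the way an 8086 is assumed to."""
    opcode = code[0]
    if 0x60 <= opcode <= 0x6F:     # assumed 8086 alias: 6xh behaves like 7xh
        opcode |= 0x10
    if opcode not in JCC:
        raise ValueError("not a short conditional jump")
    disp = int.from_bytes(code[1:2], "little", signed=True)
    target = (ip + 2 + disp) & 0xFFFF
    return f"{JCC[opcode]} {target:04X}"


# 66 33 ED at 0097:2121: the 33h byte becomes the jump displacement.
print(decode_8086_short_jump(bytes.fromhex("6633"), 0x2121))  # JBE 2156
```

Whether the jump is taken or not, decoding continues at an address the compiler never intended, which is why the result is bogus code rather than a clean error.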
Does MSDOS (or the BIOS) just have a null (IRET only) handler for this vector, causing the program to resume execution just past the 66h byte for what will then be a XOR BP,BP instruction, after which the "CPU must be a 386 to run this program" message is displayed?
No, the 80286 TRAP should set the CS:IP saved on the stack to the faulting instruction, so an IRET will return to the same instruction, causing the fault again, and the PC will lock up.
For example, the cc1psx.exe access I/O ports 21 and A1
Wow, so an MSDOS .exe talks to the PC programmable interrupt controller... Do you find this and other hardware port I/O is somewhat common in programs that attempt to manage XMS or APIs that came later in DOS?
Accessing the PIC and the keyboard controller is typical; this is needed to exit from protected mode on an 80286.
For a command line program, you can ignore all those accesses, most are there because they are part of the C standard library at initialization. Most C runtimes start by saving the interrupt handlers and accessing the PIC to ensure that software interrupts are properly handled.
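A minimal sketch of what "ignoring" those accesses could look like, in Python. This is not how emu2 implements its ports; the `IgnoredPorts` class is hypothetical, and returning FFh for unhandled reads is an assumption chosen for illustration. It remembers whatever mask the program writes to ports 21h and A1h so that a later read returns the same value, and simply logs everything else.

```python
class IgnoredPorts:
    """Port I/O stub: remembers the two PIC mask registers, ignores the rest."""

    PIC_MASK_PORTS = (0x21, 0xA1)

    def __init__(self):
        self.mask = {port: 0x00 for port in self.PIC_MASK_PORTS}
        self.ignored = []              # record of accesses that were dropped

    def outb(self, port: int, value: int) -> None:
        if port in self.mask:
            self.mask[port] = value & 0xFF
        else:
            self.ignored.append(("out", port, value & 0xFF))

    def inb(self, port: int) -> int:
        if port in self.mask:
            return self.mask[port]
        self.ignored.append(("in", port))
        return 0xFF                    # assumption: unhandled ports read as FFh


ports = IgnoredPorts()
saved = ports.inb(0x21)               # runtime saves the current mask...
ports.outb(0x21, 0xFF)                # ...masks everything...
ports.outb(0x21, saved)               # ...and restores it later
print(hex(ports.inb(0x21)), ports.ignored)
```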
Have Fun!
Hello @dmsc,
As another tangent, we have our own lightweight alternative to WINE (https://github.com/decompals/wibo) which is only for cli applications (it has some magic to kick off the exe and then intercepts all DLL calls with our own replacements) - do you know if something similar would be possible for dos? i.e. intercepting systemcalls rather than trying to emulate an x86 processor?
You will need to emulate at least an 80386, as DOS applications rely on x86 16-bit support and real mode, and that is very difficult to manage on current x86 processors. Also, by running natively, you can't run your program on other CPU architectures.
I think something like WiBo might be able to run cc1psx.exe, but as you pointed out, this approach will only work on x86 machines, at least initially. (And of course, someone will need to put in the effort to write such a thing.)
I think what a WiBo-like wrapper will need to care about is which services the actual 32-bit COFF program (the part that follows the DOS extender stub in the program binary) really needs. The DOS extender used seems to be go32 v1.x, so I guess the task will be to figure out the go32 ABI that is implemented.
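A first step could be locating that payload. The Python sketch below computes where the MZ stub ends from the standard MZ header fields and checks for the i386 COFF magic (014Ch) at that offset. That CC1PSX.EXE is laid out this way is my assumption, not something I have verified, and `mz_image_size` and `find_coff_payload` are made-up names.

```python
import struct

I386_COFF_MAGIC = 0x014C


def mz_image_size(header: bytes) -> int:
    """Size in bytes of the MZ image, from the page-count fields of its header."""
    magic, last_page_bytes, pages = struct.unpack_from("<2sHH", header, 0)
    if magic != b"MZ":
        raise ValueError("not an MZ executable")
    size = pages * 512
    if last_page_bytes:                # 0 means the last page is completely used
        size -= 512 - last_page_bytes
    return size


def find_coff_payload(data: bytes):
    """Return the offset of a COFF header right after the stub, or None."""
    offset = mz_image_size(data)
    if len(data) >= offset + 2:
        (magic,) = struct.unpack_from("<H", data, offset)
        if magic == I386_COFF_MAGIC:
            return offset
    return None


# Synthetic example: a 544-byte stub followed by a fake COFF header.
stub = (b"MZ" + struct.pack("<HH", 0x20, 2)).ljust(544, b"\0")
image = stub + struct.pack("<H", I386_COFF_MAGIC) + b"\0" * 18
print(find_coff_payload(image))
```

If nothing is found at that offset in the real binary, the payload may use a different format or be located some other way, and the stub itself would need to be disassembled.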
Thank you!
Hello @dmsc,
I have a branch that tries to add 80286 protected mode support, and I could not make it work with many programs, as 286 protected mode is not that documented.
I am curious — what problems did you encounter in particular?
(And, I do not quite believe that 80286 protected mode is "not that documented". Intel's Software Developer's Manual is still very much around. And the source code listings of IBM's PC AT BIOS, including its protected mode services, are available. But one needs to read these really closely.)
Thank you!
We're looking for a lightweight alternative to dosemu2 for running some old dos-based compilers, however instantly hit a hurdle with emu2:
I can see from the code that this is explicit behaviour:
.. is this because it's a significant amount of work to implement? where would one even start if I wanted to try? or should I throw in the towel now :)