joncampbell123 / dosbox-x

DOSBox-X fork of the DOSBox project
GNU General Public License v2.0

Future "next generation" DOSBox-X (idea dump and organization) #1184

Open joncampbell123 opened 5 years ago

joncampbell123 commented 5 years ago

I've decided to open an issue about a possible future rewrite of DOSBox-X, so that ideas, designs, programming practices, and anything else can be collected together into a feature and design list. Of course not all ideas here will be implemented, but it helps to write them down anyway.

I'll start:

I have a lot of ideas, may post more. Not all will be implemented but it helps to list them. Development may not start for a few years.

joncampbell123 commented 5 years ago

Development process ideas:

joncampbell123 commented 5 years ago

God-like control layer:

joncampbell123 commented 5 years ago

Save state support (of course). Exactly what will be saved should be clarified. For example, the state of files or disk images might be difficult to save, and therefore infeasible when, for example, running a disk image of Windows 95 that provides its own disk and filesystem drivers.

Periodic snapshotting of state, if the user desires the ability to rewind as they can now in other popular emulators. This is where the "dirty page" marking idea comes into play: Only dirty pages need be saved in the snapshot so that rewind need only put the changed pages back.
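A rough sketch of the dirty-page idea, assuming a flat guest RAM split into 4 KiB pages. `GuestMemory`, `snapshot()` and `rewind()` are hypothetical names for illustration, not existing DOSBox-X code. This variant keeps the old contents of each page the first time it is written after a snapshot, so rewind only has to put the changed pages back:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical guest RAM with per-page dirty tracking (illustrative only).
class GuestMemory {
public:
    static constexpr std::size_t kPageSize = 4096;
    using Delta = std::map<std::size_t, std::vector<std::uint8_t>>;  // page -> old bytes

    explicit GuestMemory(std::size_t pages)
        : ram_(pages * kPageSize, 0), dirty_(pages, false) {}

    std::uint8_t read(std::size_t addr) const {
        assert(addr < ram_.size());
        return ram_[addr];
    }

    void write(std::size_t addr, std::uint8_t value) {
        assert(addr < ram_.size());
        const std::size_t page = addr / kPageSize;
        if (!dirty_[page]) {
            // First write since the last snapshot: keep the old page contents.
            const auto first = ram_.begin() + static_cast<std::ptrdiff_t>(page * kPageSize);
            pending_[page].assign(first, first + static_cast<std::ptrdiff_t>(kPageSize));
            dirty_[page] = true;
        }
        ram_[addr] = value;
    }

    // Close the current delta; only pages touched since the last snapshot are stored.
    void snapshot() {
        history_.push_back(std::move(pending_));
        pending_.clear();
        std::fill(dirty_.begin(), dirty_.end(), false);
    }

    // Return to the most recent snapshot and consume it, so that calling
    // rewind() again steps back to the snapshot before that one.
    void rewind() {
        for (const auto& entry : pending_) {
            std::copy(entry.second.begin(), entry.second.end(),
                      ram_.begin() + static_cast<std::ptrdiff_t>(entry.first * kPageSize));
        }
        pending_.clear();
        std::fill(dirty_.begin(), dirty_.end(), false);
        if (!history_.empty()) {
            pending_ = std::move(history_.back());
            history_.pop_back();
            for (const auto& entry : pending_) dirty_[entry.first] = true;
        }
    }

private:
    std::vector<std::uint8_t> ram_;
    std::vector<bool> dirty_;
    Delta pending_;               // old contents of pages dirtied since the last snapshot
    std::vector<Delta> history_;  // one delta per earlier snapshot interval
};
```

The cost of a snapshot here is proportional to the number of pages written, not to the size of guest RAM. Device state, disk images and so on are left out entirely.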

joncampbell123 commented 5 years ago

Debugger assembler function. Not only should it decompile to show instructions, it should provide a means to type in new assembly language at a memory location as well (such as "MOV EAX,[ESI]"). Might be useful during development for live patching the executable in memory instead of recompiling and running again, or for first time programmers learning x86 assembly.

joncampbell123 commented 5 years ago

Assembly in the guest of BIOS and DOS kernel code from compiled OBJ files (generated by 16-bit development tools like NASM, Open Watcom C, etc.).

Instead of hard-coding bits of machine language as DOSBox SVN and DOSBox-X do now, use 16-bit development tools to write it in C or ASM that compile to OBJ (OMF) format. Then when DOSBox-X assembles the BIOS and DOS kernels, it links together the OBJ files as instructed. Variations in linking and OBJ files can be made for machine type, DOS versions and personalities, etc. OBJ files have symbols as well, which can be presented in the debugger.

joncampbell123 commented 5 years ago

DOS personality selection. The DOS kernel on startup can arrange itself depending on a specific version of DOS it's meant to emulate.

Possible versions in order of likelihood:

joncampbell123 commented 5 years ago

MS-DOS device driver emulation:

joncampbell123 commented 5 years ago

Other ideas provided by users here:

joncampbell123 commented 5 years ago

Keyboard input:

i30817 commented 5 years ago

Make it possible to mount multiple 'directory' drives as a CD-ROM with mount. This minimizes the space wasted by CD-ROM images and their many, many redundant or error-correction bytes. Most often individual CDs can be mounted like this, but then the game wants to swap them while using the same drive; and sometimes the trick of merging all the 'CDs' into a single directory doesn't work, which is a major waste on late DOS games like RAMA and others.

Another project should really get going on Windows 95/98/XP emulation that doesn't use actual Windows, like a combination of Wine and DOSBox, to get native directory mounting passthrough. Copy on write becomes more important, and more difficult to use (in the OS), as the disk image grows, and DOSBox has no disk image support with copy on write (well, there is a qcow2 patch, but using it for large images is suicide because it's implemented as a total copy to memory and overwrite, which doesn't work with images of 5 GB or more, to say the least).
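To illustrate the copy-on-write point, here is a minimal sketch of a sector-level overlay that never copies the base image: reads fall through to the base unless the sector was written, and writes only ever touch the overlay. `CowDisk` is a hypothetical name, and the base image is a `std::vector` purely for the sake of a self-contained example (a real one would read sectors from the image file on demand). This is not the qcow2 format, just the general idea:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical copy-on-write view over a read-only base image.
class CowDisk {
public:
    static constexpr std::size_t kSectorSize = 512;
    using Sector = std::array<std::uint8_t, kSectorSize>;

    // The base image must outlive this object; it is never modified or copied.
    explicit CowDisk(const std::vector<std::uint8_t>& base) : base_(base) {
        assert(base.size() % kSectorSize == 0);
    }

    Sector readSector(std::uint32_t lba) const {
        const auto it = overlay_.find(lba);
        if (it != overlay_.end()) return it->second;  // sector was modified
        assert((lba + 1) * kSectorSize <= base_.size());
        Sector s;
        std::copy_n(base_.begin() + static_cast<std::ptrdiff_t>(lba * kSectorSize),
                    kSectorSize, s.begin());
        return s;
    }

    // Writes land in the overlay only, so memory use grows with the number
    // of modified sectors rather than with the size of the image.
    void writeSector(std::uint32_t lba, const Sector& data) { overlay_[lba] = data; }

    std::size_t modifiedSectors() const { return overlay_.size(); }

private:
    const std::vector<std::uint8_t>& base_;
    std::unordered_map<std::uint32_t, Sector> overlay_;
};
```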

StrikerX3 commented 5 years ago

x86 kvm/HyperV core, for use with anything past about Windows 95 to run at a usable speed. Imagine being able to run Windows ME at a usable speed in DOSBox-X.

You might want to consider virt86 for virtualization, which supports KVM, HAXM and Windows Hypervisor Platform.

joncampbell123 commented 5 years ago

@i30817 The best way to implement the Windows API without Windows is to start with the oldest version. Windows 1.0 might be a good start, where that is all 16-bit real-mode stuff.

The best way to envision Windows 1.0 through Windows ME is code that builds on itself every version, either adding new code or wrapping existing code.

Windows 3.0 and higher could be thought of as first wrapping the 1.0/2.0 code from real mode under a protected mode kernel (and DPMI). Windows 3.1 "386 enhanced" for example is the 16-bit userland (now 16-bit protected mode) under a 32-bit kernel that "virtualizes" everything to make VMs get along. Windows 95 and later builds out the 32-bit kernel while minimizing the 16-bit part over time. Under all this is calls down to DOS.

There's a lot about Win16 that WINE has helped me understand including many key details old Microsoft MSDN documentation left out, such as how relocations are actually processed through a segment. I can provide an ISO of old MSDN CDs I have (from the early to mid 90s) if it helps.

joncampbell123 commented 5 years ago

@StrikerX3 I may have to borrow the virt86 code and adapt it, still using GCC 4.8 at the current time, unless GCC 5 to 7 become the new minimum in the future development. If that happens, then the future minimum C++ standard may have to be C++14 or C++17 instead of the current DOSBox-X C++11.

emendelson commented 5 years ago

Just a hope that you might implement scalable TrueType support (as in vDos) at some point - maybe even before the next generation comes along... Just a hope!

joncampbell123 commented 5 years ago

@emendelson Actually, yes, I have that in mind. The future version could ship with an open source TTF font like that Ubuntu font out there. You could have a nice high resolution DOS prompt that way. PC-98 mode could benefit from it to help make all the kanji more readable too, and no worries about custom bitmaps because most of the charmaps on PC-98 are fixed in ROM.

joncampbell123 commented 5 years ago

CGA/MDA are easy enough, the charmaps are in ROM, while EGA/VGA will need to check whether the font bitmap has been altered by the guest to decide between TTF font rendering and raster rendering.
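A small sketch of that EGA/VGA check, assuming the emulator keeps a copy of the ROM font and sees every guest write to font memory. `FontWatcher` and `TextRenderer` are made-up names, and the comparison is deferred until the renderer asks so that a burst of font writes costs one compare:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

enum class TextRenderer { TrueType, Raster };

// Hypothetical helper: tracks whether the guest has changed the text-mode font.
class FontWatcher {
public:
    explicit FontWatcher(std::vector<std::uint8_t> romFont)
        : rom_(std::move(romFont)), guest_(rom_) {}

    // Called for every guest write that lands in font memory.
    void write(std::size_t offset, std::uint8_t value) {
        assert(offset < guest_.size());
        guest_[offset] = value;
        stale_ = true;  // defer the comparison until the next frame
    }

    // Called by the text renderer, e.g. once per frame.
    TextRenderer renderer() {
        if (stale_) {
            custom_ = (guest_ != rom_);
            stale_ = false;
        }
        return custom_ ? TextRenderer::Raster : TextRenderer::TrueType;
    }

private:
    std::vector<std::uint8_t> rom_;    // font as shipped in ROM
    std::vector<std::uint8_t> guest_;  // font memory as the guest sees it
    bool stale_ = false;
    bool custom_ = false;
};
```

If the guest restores the original glyphs, this falls back to TTF rendering again. A finer-grained version could compare per glyph and only rasterize the altered characters, but that is beyond this sketch.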

joncampbell123 commented 5 years ago

Another idea: DOS utility wrapper mode.

emendelson commented 5 years ago

This is excellent news. If there's any hope that you might be able to implement this in the current code, that would be really terrific. As monitors get larger and larger, DOSBox's bitmap fonts look worse and worse...

joncampbell123 commented 5 years ago

Future DOSBox BIOS logo:

joncampbell123 commented 5 years ago

@emendelson This is great news, DOSBox-X has accumulated a lot of "off the cuff" development and a "reset" of the code to build a fresh architecture is a great way to clean it up and give it a new design.

However please understand the intent is to collect ideas here to form a specification, development may not happen for a few years, not only because the redesign needs to be carefully thought out, but also because of my professional work which does take a good portion of my time. I am going to need help with this, and the best way for others to help is to provide a reasonably solid spec on what needs to come together and develop.

emendelson commented 5 years ago

Understood!

phire commented 5 years ago

CPU core with two threads: one acts as the x86 "prefetch and decode" stage and the other acts on the tokens generated by the first thread. Most systems today are at least dual core, even embedded systems like the Raspberry Pi.

The idea is cool, but my gut says it wouldn't result in worthwhile performance. Synchronizing between the two CPU cores is just too expensive, especially on ARM, and would negate any performance gains.

I have been pondering a similar approach with a decoder and a uop interpreter interleaved on the same thread. Something like: loop over the decoder for 8 instructions, then loop over the uop interpreter for 8 instructions.

This would eliminate any cross-core synchronisation issues, and on out-of-order CPUs this approach should result in parallel decoding of instructions. Still, I somewhat doubt it would lead to improved performance over a traditional interpreter.
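Roughly what I mean, as a toy sketch. The three-opcode instruction set (`0x01 imm` = add, `0x02 imm` = subtract, `0x00` = halt) is invented for the example and has nothing to do with x86. The structure is the point: a decode pass over up to 8 instructions, then an execute pass over the decoded uops, all on one thread:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decoded form of one toy instruction.
struct Uop {
    enum Kind { Add, Sub, Halt } kind;
    std::int32_t imm;
};

// Runs a toy program and returns the final accumulator value.
std::int32_t runInterleaved(const std::vector<std::uint8_t>& code) {
    constexpr std::size_t kBatch = 8;
    std::array<Uop, kBatch> batch{};
    std::size_t pc = 0;
    std::int32_t acc = 0;
    bool halted = false;

    while (!halted) {
        // Decode pass: fill the batch without executing anything.
        std::size_t n = 0;
        while (n < kBatch) {
            if (pc >= code.size() || code[pc] == 0x00) {
                batch[n++] = Uop{Uop::Halt, 0};
                break;
            }
            assert(code[pc] == 0x01 || code[pc] == 0x02);
            assert(pc + 1 < code.size());
            batch[n++] = Uop{code[pc] == 0x01 ? Uop::Add : Uop::Sub, code[pc + 1]};
            pc += 2;
        }
        // Execute pass: a tight loop over already-decoded uops.
        for (std::size_t i = 0; i < n; ++i) {
            switch (batch[i].kind) {
            case Uop::Add:  acc += batch[i].imm; break;
            case Uop::Sub:  acc -= batch[i].imm; break;
            case Uop::Halt: halted = true; break;
            }
        }
    }
    return acc;
}
```

A real version would have to cut the decode pass short at branches, and at anything that might write into the bytes it has already decoded.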

joncampbell123 commented 5 years ago

@phire Would it provide any worthwhile performance boost if the prefetch thread were to recognize common combinations of instructions and provide a token for that common combination?

joncampbell123 commented 5 years ago

@phire Perhaps the uop interpreter in the same thread might be able to do the same, common combinations of instructions?

phire commented 5 years ago

Theoretically, if you could synchronise the decode and execute threads across two cores without any performance penalty, then that would be a worthwhile optimisation. But you can't. When two cores are reading/writing the same cache lines, you hit huge latency penalties of around 200 cycles (on Coffee Lake) each time a dirty cache line needs to be shuffled from one core to another.

That's way longer than a branch misprediction and comparable with an L3 cache miss.
If you want to multi-thread things, you want to pick workloads that can operate independently for long stretches of time before needing synchronisation.


I think the second you start considering such optimisations, you should just start work on some form of dynamic recompiler or jit.

Even the simplest form of dynamic recompiler where you just barf a series of function calls into a buffer should be worthwhile performance wise.

A sequence of instructions like this:

mov ah, 09h
lea dx, [200h]
int 21h
mov ax, 4C00h
int 21h

could be translated as:

call    mov_imm_reg8
call    lea_simple
call    int21
call    mov_imm_reg16
call    int21
ret

The functions you call would be responsible for decoding the rest of the instruction, extracting the registers, addresses and immediates before executing the operation.
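A portable stand-in for the same idea, as a sketch only: instead of emitting native `call` instructions into a buffer, the trace is a cached vector of function pointers, which keeps the example in plain C++. The handler names come from the listing above. `Cpu`, `Mem` and `TraceCache` are hypothetical, only the five opcodes of the example are covered, and the `lea` handler assumes the `lea dx, [200h]` direct-address encoding (`8D 16 00 02`):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Mem = std::vector<std::uint8_t>;

struct Cpu {
    std::uint16_t ax = 0, dx = 0, ip = 0;
    std::vector<std::uint8_t> int21Calls;  // AH value seen at each INT 21h
};

using Handler = void (*)(Cpu&, const Mem&);

inline std::uint16_t imm16(const Mem& m, std::size_t at) {
    return static_cast<std::uint16_t>(m[at] | (m[at + 1] << 8));
}

// Each handler decodes its own operands and advances IP.
void mov_imm_reg8(Cpu& c, const Mem& m) {   // B4 ib: MOV AH, imm8 only
    c.ax = static_cast<std::uint16_t>((c.ax & 0x00FF) | (m[c.ip + 1] << 8));
    c.ip += 2;
}
void mov_imm_reg16(Cpu& c, const Mem& m) {  // B8 iw: MOV AX, imm16 only
    c.ax = imm16(m, c.ip + 1);
    c.ip += 3;
}
void lea_simple(Cpu& c, const Mem& m) {     // 8D 16 iw: LEA DX, [disp16] only
    c.dx = imm16(m, c.ip + 2);
    c.ip += 4;
}
void int21(Cpu& c, const Mem&) {            // CD 21: just records AH here
    c.int21Calls.push_back(static_cast<std::uint8_t>(c.ax >> 8));
    c.ip += 2;
}

class TraceCache {
public:
    // Translate on first use, then replay the cached sequence of calls.
    void run(Cpu& cpu, const Mem& mem, std::uint16_t start) {
        auto it = traces_.find(start);
        if (it == traces_.end()) {
            it = traces_.emplace(start, translate(mem, start)).first;
            ++translations;
        }
        cpu.ip = start;
        for (Handler h : it->second) h(cpu, mem);
    }

    // Must be called when the guest writes to translated code.
    void invalidate() { traces_.clear(); }

    int translations = 0;

private:
    // Only looks at the opcode byte; operand decoding is left to the handlers.
    static std::vector<Handler> translate(const Mem& mem, std::size_t pc) {
        std::vector<Handler> trace;
        while (pc < mem.size()) {
            switch (mem[pc]) {
            case 0xB4: trace.push_back(mov_imm_reg8);  pc += 2; break;
            case 0xB8: trace.push_back(mov_imm_reg16); pc += 3; break;
            case 0x8D: trace.push_back(lea_simple);    pc += 4; break;
            case 0xCD: trace.push_back(int21);         pc += 2; break;
            default:
                assert(false && "opcode not covered by this sketch");
                return trace;
            }
        }
        return trace;
    }

    std::unordered_map<std::uint16_t, std::vector<Handler>> traces_;
};
```

The indirect calls through the vector are not the same as a run of direct `call` instructions as far as the host's branch predictor is concerned, so this shows the structure, not the performance.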

Most modern CPUs are good at predicting calls/returns thanks to a call stack, though there are many CPUs out there which I suspect would benefit from stuffing nops between calls so there is no more than two calls per 16 bytes.

Such a recompiler would be easy to port, you only need to implement basic call and return instructions for each architecture.

You could even apply decoding based optimisations to this scheme, like fusing common pairs of instructions and following branches. For emulating x86, the ability to pre-decode the different types of modrm byte would be very useful. Though there is probably an art to picking function complexity to avoid blowing out the instruction cache.
Because you will cache these sequences of calls for multiple executions, your decoding code runs rarely and can be somewhat more complex.

The one disadvantage of this approach over a traditional interpreter is that you need to correctly handle self-modifying code and invalidate the cache of instruction traces. But I'll point out that your split-core interpreter idea also has issues handling self-modifying code.


Once you have this style of subroutine threaded interpreter, you have a good base to work towards a full recompiler. You can slowly replace the calls to interpreter subroutines with proper recompiled code based on bottlenecks.

joncampbell123 commented 5 years ago

@phire Then I will follow your suggestion instead. If synchronizing across CPU cores has that much of a performance penalty then I'll cross that off the list.

One concern is the need to periodically interrupt the CPU core for cases where ISA DMA or PCI bus mastering conflicts with the CPU. When it doesn't matter, DMA and bus mastering could occur during the 1ms timer tick handling, but when the memory conflicts with the CPU (or at least the same page) it would be necessary to interrupt the CPU core on time to emulate the transfer. As commonly referenced, the Sound Blaster "goldplay" trick is one example where ISA DMA must be accurate to what the CPU has last written to the same memory region, or else it doesn't work. This is what the specific chained memory handler idea is about: ISA DMA can insert a read/write handler that triggers processing DMA up to that time if that page of memory is touched.
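A sketch of how such a chained handler might look, assuming the CPU core passes the current cycle count along with every write. `PageHandler`, `Memory` and `DmaChannel` are hypothetical names, and the DMA model is reduced to re-reading one byte at a fixed rate, which is the shape of the "goldplay" case:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical hook that is called before the CPU touches a page.
struct PageHandler {
    virtual ~PageHandler() = default;
    virtual void beforeAccess(std::uint64_t nowCycles) = 0;
};

class Memory {
public:
    static constexpr std::size_t kPageSize = 4096;

    explicit Memory(std::size_t pages) : ram_(pages * kPageSize, 0), chains_(pages) {}

    void addHandler(std::size_t page, PageHandler* handler) {
        assert(page < chains_.size());
        chains_[page].push_back(handler);
    }

    // CPU write: every handler chained on the page runs first.
    void write(std::size_t addr, std::uint8_t value, std::uint64_t nowCycles) {
        assert(addr < ram_.size());
        for (PageHandler* h : chains_[addr / kPageSize]) h->beforeAccess(nowCycles);
        ram_[addr] = value;
    }

    // Device-side read that bypasses the handler chain.
    std::uint8_t peek(std::size_t addr) const {
        assert(addr < ram_.size());
        return ram_[addr];
    }

private:
    std::vector<std::uint8_t> ram_;
    std::vector<std::vector<PageHandler*>> chains_;
};

// Toy DMA channel that re-reads a single byte every cyclesPerByte cycles.
class DmaChannel : public PageHandler {
public:
    DmaChannel(Memory& mem, std::size_t addr, std::uint64_t cyclesPerByte)
        : mem_(mem), addr_(addr), cyclesPerByte_(cyclesPerByte) {
        assert(cyclesPerByte > 0);
        mem_.addHandler(addr / Memory::kPageSize, this);
    }

    void beforeAccess(std::uint64_t nowCycles) override { catchUp(nowCycles); }

    // Perform every transfer that was due up to and including nowCycles.
    void catchUp(std::uint64_t nowCycles) {
        while (lastTransfer_ + cyclesPerByte_ <= nowCycles) {
            lastTransfer_ += cyclesPerByte_;
            output.push_back(mem_.peek(addr_));
        }
    }

    std::vector<std::uint8_t> output;  // bytes handed to the device so far

private:
    Memory& mem_;
    std::size_t addr_;
    std::uint64_t cyclesPerByte_;
    std::uint64_t lastTransfer_ = 0;
};
```

When nothing touches the page, `catchUp()` would only be called from the 1ms tick, so the cost is paid only in the conflicting case.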

Tetsuya00X commented 5 years ago

Great news from this post. I love this version of DOSBox. Sadly I'm not a programmer, so I don't understand most of your new technical improvements. If I may, here are a couple of ideas:

1.- Is it possible to remake the keyboard mapper editor so the configuration can be changed on the fly (inside DOSBox, using a command in DOS)? None of the DOSBox versions I have found has this feature.

2.- This is a personal DOS-lover idea. Is it possible to make a special command to open the DOSBox config inside the program? I don't know, some version of reset and "enter the BIOS", or an MS-DOS config menu.

Maybe they are stupid ideas, but it's free to say them, hahaha. Thanks for your work.

joncampbell123 commented 5 years ago

@Tetsuya00X

  1. Do you mean built-in shell commands to change the mapper?

  2. There is a graphical configuration editor inherited from DOSBox DAUM that can be brought up from the menu or by typing SHOWGUI on the command line. The new version would have a better UI framework than the GUITK crap currently in the code though.

joncampbell123 commented 5 years ago

@Tetsuya00X I do like the idea of a configuration editor though that resembles a classic BIOS!

Tetsuya00X commented 5 years ago

@joncampbell123

  1. Do you mean built-in shell commands to change the mapper?

Exactly.

I do like the idea of a configuration editor though that resembles a classic BIOS!

An old blue configuration screen would be wonderful.

My thought is that if I want to use a DOS machine emulator, it's logical for me to use boring long commands to change the configuration, not the abomination of the MOUSE! (joke) (I don't like using frontends).

joncampbell123 commented 5 years ago

@Tetsuya00X Configuration screen ideas:

IBM PC emulation: Blue background, AMI BIOS type two vertical partitions. One on the left is the settings, the other on the right is the help.

Alternative for IBM PC emulation: Resembles the "Graphical" BIOS configuration menu seen on mid 1990s 486 motherboards with mouse support. User can choose which one.

PC-98 emulation: Black background, two vertical partitions with border and white text resembling the BIOS configuration screen on PC-98 systems.

FM Towns: Not sure... whatever their BIOSes look like. Have yet to get one.

joncampbell123 commented 5 years ago

By "graphical" I mean this BIOS:

[Image: winbios]

Tetsuya00X commented 5 years ago

@Tetsuya00X Configuration screen ideas:

IBM PC emulation: Blue background, AMI BIOS type two vertical partitions. One on the left is the settings, the other on the right is the help.

That's the bios I can remember (from my 486DX)

Alternative for IBM PC emulation: Resembles the "Graphical" BIOS configuration menu seen on mid 1990s 486 motherboards with mouse support. User can choose which one.

For me, this was an OS BIOS (I'm probably quite wrong). But if the user can choose which BIOS to use, it'll be perfect.

Also, a BIOS-like DOSBox config menu opens the possibility of adding a section for changing the DOS selection (the idea you described above) or the kind of machine to emulate, without editing dosbox.conf. Maybe add a section in the main tab that shows the time, virtual HDD size (nostalgic stuff) and machine model / DOS version (or other OS).

joncampbell123 commented 5 years ago

It's a normal BIOS, except that it has a graphical setup menu, even if they call it "WinBIOS". In the same way there was a 486 (or Pentium?) clone called the WinChip that could run MS-DOS just fine.

It's possible they only called it WinBIOS because it's graphical, not because there's Windows in the BIOS :)

joncampbell123 commented 5 years ago

The intent is to have an overall "emulator management mode" that sits atop the guest and completely controls it, including shutting it down and fully reinitializing the emulation without restarting DOSBox-X when you change system type, as well as support for user-loadable code that can control the guest for whatever reason (patching, debugging, tool-assisted speedruns, etc.)

Tetsuya00X commented 5 years ago

I didn't know that. My first PC was a 486DX with a normal DOS-like AMIBIOS. My second PC was a Pentium III (same BIOS). Only this year (with my i7-8700K) could I use a mouse in the BIOS.

Tetsuya00X commented 5 years ago

The intent is to have an overall "emulator management mode" that sits atop the guest and completely controls it, including shutting it down and fully reinitializing the emulation without restarting DOSBox-X when you change system type, as well as support for user-loadable code that can control the guest for whatever reason (patching, debugging, tool-assisted speedruns, etc.)

That was my main idea. I just imagined the menu layout. I really love the idea.

phire commented 5 years ago

I see you have some nice documentation on the topic.

The way I would go about timing is to design the CPU thread so it always knows the minimum number of cycles before it might need to be interrupted. There should never be an external thread injecting interrupts into the CPU thread, as that plays hell with determinism.

On any writes to hardware, the CPU thread should be doing the minimum amount of work necessary to calculate how many cycles out any interrupts might be, before handing off any hard work to external threads. If the exact number of cycles can't be calculated without doing the actual work, a lower bound can be estimated instead and the CPU thread can poll the thread later for the exact interrupt cycle.

This might lead to a system where you dynamically decide to do work on the CPU thread or an external worker thread depending on how long the work might take. With the long cross-thread delays, there is no point in sending it to another thread if you need the results back on the CPU thread within a few thousand cycles.
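A minimal sketch of the "CPU thread always knows how far it can run" part, with made-up names (`Scheduler`, `Event`). Instruction execution is replaced by simply advancing the cycle counter, and devices are reduced to integer IDs. What matters is that nothing outside the CPU thread ever injects an interrupt: the thread asks for the distance to the next event, runs that far, then dispatches:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <queue>
#include <vector>

struct Event {
    std::uint64_t cycle;  // absolute cycle at which the event fires
    int id;               // stand-in for "which device / which interrupt"
};

// Orders the queue so the earliest event is on top; ties break on id so
// that the order is deterministic.
struct FiresLater {
    bool operator()(const Event& a, const Event& b) const {
        return a.cycle > b.cycle || (a.cycle == b.cycle && a.id > b.id);
    }
};

class Scheduler {
public:
    // Called on the CPU thread, e.g. from an I/O port write handler.
    void schedule(std::uint64_t delayCycles, int id) {
        queue_.push(Event{now_ + delayCycles, id});
    }

    // The number of cycles the CPU may run before anything can interrupt it.
    std::uint64_t cyclesUntilNextEvent() const {
        if (queue_.empty()) return std::numeric_limits<std::uint64_t>::max();
        return queue_.top().cycle <= now_ ? 0 : queue_.top().cycle - now_;
    }

    // Runs for `budget` cycles and returns the event ids in firing order.
    std::vector<int> run(std::uint64_t budget) {
        assert(budget <= std::numeric_limits<std::uint64_t>::max() - now_);
        const std::uint64_t end = now_ + budget;
        std::vector<int> fired;
        while (now_ < end) {
            const std::uint64_t slice = std::min(end - now_, cyclesUntilNextEvent());
            now_ += slice;  // stand-in for executing `slice` cycles uninterrupted
            while (!queue_.empty() && queue_.top().cycle <= now_) {
                fired.push_back(queue_.top().id);
                queue_.pop();
            }
        }
        return fired;
    }

    std::uint64_t now() const { return now_; }

private:
    std::priority_queue<Event, std::vector<Event>, FiresLater> queue_;
    std::uint64_t now_ = 0;
};
```

Work handed off to a worker thread would still show up here as an event with a lower-bound cycle, and the CPU thread would poll the worker when that cycle arrives.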

bastetfurry commented 5 years ago

Haven't read the whole thread so sorry if it is already in there, but I would fancy a memory freezer. Think Action Replay, GameWizard, ArtMoney and friends but built into the emulator. It should also be able to search for values to freeze, of course.

joncampbell123 commented 5 years ago

@bastetfurry I've heard that Action Replay and Game Genie intercept reads (writes?) to certain addresses and replace the byte value if they match, correct?

A memory handler linked list design would allow for that, I think.

That's easy enough for 16-bit real mode programs, and possibly 32-bit flat protected mode. I don't think such a system would be directly possible from the emulator (at least without guest addition "drivers") if you want to do that in, say, Windows 3.1 or Windows 95.

What do you anticipate the interface should look like?

EDIT: I want to point out that unlike the NES and SNES, code and data are not at fixed addresses in memory. However in most simple DOS games variables are normally at fixed memory addresses relative to the base segment the EXE was loaded into memory. The "Action Replay" interface will need to consider that. It may look something like "JILL.EXE+22033h = 44h" to say that byte 22033h relative to the PSP segment base of JILL.EXE should be frozen to value 44h.
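To make the "JILL.EXE+22033h = 44h" notation concrete, here is a rough sketch of parsing such a line and resolving it against a PSP segment. The syntax is only the one example above, and `Cheat`, `parseCheat`, `linearAddress` and `applyFreeze` are names invented for illustration. The 1 MB limit on the offset is an assumption for real-mode programs:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

struct Cheat {
    std::string module;    // e.g. "JILL.EXE"
    std::uint32_t offset;  // byte offset relative to the PSP segment base
    std::uint8_t value;    // value to freeze the byte to
};

// Parses "NAME+<hex>h = <hex>h". Returns false on anything malformed.
bool parseCheat(const std::string& text, Cheat& out) {
    const std::size_t plus = text.find('+');
    const std::size_t eq = text.find('=');
    if (plus == std::string::npos || eq == std::string::npos || plus == 0 || eq < plus) {
        return false;
    }
    try {
        // std::stoul skips leading spaces and stops at the trailing 'h'.
        const unsigned long offset =
            std::stoul(text.substr(plus + 1, eq - plus - 1), nullptr, 16);
        const unsigned long value = std::stoul(text.substr(eq + 1), nullptr, 16);
        if (offset > 0xFFFFFul || value > 0xFFul) return false;
        out = Cheat{text.substr(0, plus), static_cast<std::uint32_t>(offset),
                    static_cast<std::uint8_t>(value)};
    } catch (const std::exception&) {
        return false;  // no hex digits, or the number was out of range
    }
    return true;
}

// Real-mode linear address: segment * 16 + offset.
std::uint32_t linearAddress(const Cheat& cheat, std::uint16_t pspSegment) {
    return static_cast<std::uint32_t>(pspSegment) * 16u + cheat.offset;
}

// Would be re-applied periodically (or from a memory write handler) to
// keep the byte frozen.
void applyFreeze(std::vector<std::uint8_t>& ram, const Cheat& cheat,
                 std::uint16_t pspSegment) {
    const std::uint32_t addr = linearAddress(cheat, pspSegment);
    assert(addr < ram.size());
    ram[addr] = cheat.value;
}
```

Finding the PSP segment for a given EXE name is the emulator's job (it loaded the program) and is not shown.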

EDIT: Another potential problem of course will be self-decompressing/decrypting code on startup, including EXEs compressed with Microsoft's EXEPACK compression, or PKZip compressed EXEs.

joncampbell123 commented 5 years ago

@phire That gets complicated when the CPU issues I/O that triggers another interrupt (such as sending an EOI to the PIC to acknowledge an interrupt). It might be something to try. It could be an optional core that the user could use if they care more for performance than accuracy and are willing to accept reduced interrupt and DMA precision to accomplish it.

Sort of like in the current DOSBox-X where the dynamic core provides better performance at the cost of accuracy, or even the ability to single-step instruction by instruction in the debugger.

bastetfurry commented 5 years ago

@bastetfurry I've heard that Action Replay and Game Genie intercept reads (writes?) to certain addresses and replace the byte value if they match, correct?

Yes, these work like that or they simply force a value to be set in stone. Sometimes they even patch the code so that a DEC $lives gets turned into a bunch of NOPs.

The working ones for DOS, though, needed to be trained on a game every time the game is started. So you start, for example, Sim City, start a new game, and tell your cheat tool to scan the memory for 10000. It will create a list of all addresses that hold that value. Then you go back to the game and buy something, search for the addresses whose value changed to the new one, and rinse and repeat until you have the address where the in-game money is stored.
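The scan-and-narrow loop is simple enough to sketch. This assumes 16-bit little-endian values and a flat view of guest memory, and `scanFor` and `narrowTo` are made-up names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads a 16-bit little-endian value from guest memory.
inline std::uint16_t read16(const std::vector<std::uint8_t>& ram, std::size_t addr) {
    assert(addr + 1 < ram.size());
    return static_cast<std::uint16_t>(ram[addr] | (ram[addr + 1] << 8));
}

// First pass: every address that currently holds `value`.
std::vector<std::size_t> scanFor(const std::vector<std::uint8_t>& ram, std::uint16_t value) {
    std::vector<std::size_t> hits;
    for (std::size_t addr = 0; addr + 1 < ram.size(); ++addr) {
        if (read16(ram, addr) == value) hits.push_back(addr);
    }
    return hits;
}

// Later passes: keep only the candidates that now hold the new value.
std::vector<std::size_t> narrowTo(const std::vector<std::size_t>& candidates,
                                  const std::vector<std::uint8_t>& ram,
                                  std::uint16_t value) {
    std::vector<std::size_t> hits;
    for (std::size_t addr : candidates) {
        if (read16(ram, addr) == value) hits.push_back(addr);
    }
    return hits;
}
```

So in the Sim City example it would be `scanFor(ram, 10000)` once, then `narrowTo(...)` with the new balance after each purchase until one address is left.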

joncampbell123 commented 5 years ago

Floppy, hard disk, and CD-ROM emulation:

joncampbell123 commented 5 years ago

CD-ROM emulation could provide (if available from the CD image) the full 2352 bytes/sector, the various modes available when reading. If only an ISO is provided, the ECC data around the data sector could be synthesized on the fly. Raw reading could provide random "bit flips" to emulate the low error rate of CD-ROM which could trigger firmware retry, or if read raw with "C2 correction data", provide the bit flips and bit fields indicating sector errors.
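As a starting point for synthesizing raw sectors from a plain ISO, a sketch of the Mode 1 framing only: the 12-byte sync pattern, the BCD minute/second/frame header and the 2048 data bytes. The EDC and the P/Q ECC parity are deliberately left as zero placeholders here, so this is not yet a valid raw sector. `frameMode1Sector` is a hypothetical name:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kRawSectorSize = 2352;
constexpr std::size_t kUserDataSize = 2048;

inline std::uint8_t toBcd(unsigned v) {
    assert(v < 100);
    return static_cast<std::uint8_t>(((v / 10) << 4) | (v % 10));
}

// Layout: 12 sync + 4 header + 2048 data + 4 EDC + 8 reserved + 276 ECC.
std::array<std::uint8_t, kRawSectorSize> frameMode1Sector(
        std::uint32_t lba, const std::array<std::uint8_t, kUserDataSize>& data) {
    std::array<std::uint8_t, kRawSectorSize> raw{};  // zero-filled

    // Sync pattern: 00 FF FF FF FF FF FF FF FF FF FF 00
    for (std::size_t i = 1; i <= 10; ++i) raw[i] = 0xFF;

    // Header: absolute address in BCD (LBA 0 is 00:02:00), then the mode byte.
    const std::uint32_t frames = lba + 150;
    raw[12] = toBcd(frames / (75 * 60));
    raw[13] = toBcd((frames / 75) % 60);
    raw[14] = toBcd(frames % 75);
    raw[15] = 0x01;

    std::copy(data.begin(), data.end(), raw.begin() + 16);

    // Bytes 2064..2351 (EDC, reserved, ECC) stay zero in this sketch; a
    // full version has to compute them from the bytes above.
    return raw;
}
```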

Allow CD-ROM emulation to report different optical media types. User could mount an ISO with a flag indicating that the ISO represents a CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-R, DVD-RW, etc. and emulation would behave as such.

Allow CD-ROM to mount an empty CD-R/CD-RW image by filename, where the file does not exist, but if the guest OS runs a CD burning application, the CD "burn" creates the file and TOC indicated. CD-RW would allow rewriting the image.

joncampbell123 commented 5 years ago

CD-ROM could provide emulation of "multi-session" CD/CD-R/CD-RW/DVD-R discs, perhaps as an extension of the BIN/CUE format already supported by DOSBox and DOSBox-X.

joncampbell123 commented 5 years ago

DVD-ROM emulation could include emulation of the CSS authentication protocol (DVD video). If requested, it could provide fake CSS encryption on the fly for the guest to decrypt. An extension to BIN/CUE could be provided to provide a per-sector map of the CMI byte to direct that as well.

joncampbell123 commented 5 years ago

Perhaps CD-ROM emulation could provide non-IDE emulation, including SCSI and the proprietary CD-ROM controllers that were once common on early 1990s sound cards (Panasonic, Mitsumi, Sony, etc.) I have no documentation on those interfaces at this time.

joncampbell123 commented 5 years ago

PC-98 video emulation should be rewritten to better emulate how the master and slave GDCs interact, including the scrambled video display that can occur if the graphics (slave) GDC is given different video timing than the text (master) GDC, as witnessed on real PC-9821 hardware when running a game that was unaware of the 31 kHz VGA-compatible display mode yet reprogrammed the timings. Other games are known to shorten the active display area, and some change the blanking intervals to reposition graphics relative to text (Dragon Buster).

joncampbell123 commented 5 years ago

Filesystem I/O:

Disk support:

All disk image and filesystem drivers should have an API so that the user at any time can look at internal state, examine the disk, filesystem structure, for debugging purposes. Providing a built-in tool on drive Z: or builtinfs for that is fine. Perhaps for fun, an API could be provided for Lua/Squirrel scripts to talk to the FAT filesystem driver for hacking purposes.

Built-in filesystem tools:

joncampbell123 commented 5 years ago

Built-in tools to examine, use, and modify the partition table of disk images.

Someone installing Windows 95 or setting up a guest VM could use these tools to set up the partition table the way they want, even in ways the guest OS's tools do not allow.

EDIT: This is important as well for PC-98 emulation setups because the partition table for PC-98 is entirely different from IBM PC partition tables, and the user may want to examine what is there and possibly modify or correct the table.
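For the IBM PC case, the examining half of such a tool is short. A sketch of reading the four primary entries out of a classic MBR sector follows. `Partition` and `parseMbr` are illustrative names, and the PC-98 table, which has a different layout, is not covered:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Partition {
    std::uint8_t status;        // 0x80 = bootable/active
    std::uint8_t type;          // partition type ID, 0x00 = unused entry
    std::uint32_t firstLba;     // first sector of the partition
    std::uint32_t sectorCount;  // size in sectors
};

inline std::uint32_t le32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) | (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Parses sector 0 of a disk image. Returns false if the 55 AA boot
// signature is missing.
bool parseMbr(const std::vector<std::uint8_t>& sector, std::array<Partition, 4>& out) {
    assert(sector.size() >= 512);
    if (sector[510] != 0x55 || sector[511] != 0xAA) return false;
    for (std::size_t i = 0; i < 4; ++i) {
        // Four 16-byte entries start at offset 446; the CHS fields at
        // bytes 1-3 and 5-7 of each entry are skipped in this sketch.
        const std::uint8_t* e = sector.data() + 446 + i * 16;
        out[i] = Partition{e[0], e[4], le32(e + 8), le32(e + 12)};
    }
    return true;
}
```

Writing the table back is the same layout in reverse, which is where the "set it up in ways the guest OS's tools do not allow" part would come in.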