Open vadosnaprimer opened 2 years ago
Yes, this is a good idea. I've thought a bit about Mesa-based rendering. LLVMpipe may be too advanced to support in the waterbox, but Mesa has other choices as well. I was always worried about speed; if it's glacial, then is it worth it?
I'd love to see speed as the only problem for some things that are otherwise outright impossible to get into hawk.
From a TAS perspective, speed is of no concern, as we tend to slow the game down anyways.
If it was like 1 frame every minute then it would be impractical.
If we're talking about arbitrary 3D cores like Dolphin or Citra, the speed may very well be that low. We can't assume anything about orders of magnitude here without research.
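To put the "1 frame every minute" figure in perspective, here is a rough back-of-the-envelope sketch. The 60 fps native frame rate and the 10-minute movie length are assumptions picked for illustration, not measurements of any core.

```python
SECONDS_PER_DAY = 86400

def wall_clock_days(movie_frames, seconds_per_frame):
    """Days of real time needed to emulate movie_frames at the given pace."""
    return movie_frames * seconds_per_frame / SECONDS_PER_DAY

native_fps = 60                      # assumed native frame rate
frames = 10 * 60 * native_fps        # a 10-minute movie is 36000 frames
slowdown = 60 * native_fps           # 1 frame per minute vs 60 fps is 3600x slower
days = wall_clock_days(frames, 60)   # 36000 frames * 60 s each = 25.0 days

print(f"{slowdown}x slowdown, {days} days for one playthrough")
```

So at that pace a single uninterrupted playback of a short movie takes weeks, which is where slow stops being a mere inconvenience.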
Both Dolphin and Citra have software renderers anyways. They would probably be faster than Mesa-based rendering (assuming said renderers are optimized, which Dolphin's definitely isn't).
Ruffle (as mentioned) is the main thing that would greatly benefit from something like this, given that it has no savestates and no true software renderer. There is a separate question of whether Rust code can be waterboxed, but I imagine the answer is probably yes with some work.
Rust is in principle waterboxable. How fast is regular Ruffle with Mesa?
Ruffle's speed is game dependent, but in libTAS with forced software rendering, I get about 0.5x-1x speed with Meat Boy and Super Mario Flash can get up to around 2x speed. Both Vulkan and GL run at around the same speed. This is not on a great computer either; I have Ubuntu running natively on an old laptop, it has these specs:
CPU: Intel® Core™ i5-4210M CPU @ 2.60GHz × 4
GPU: Intel® HD Graphics 4600 (HSW GT2)
LLVMpipe could be used for the BlastEm core if you don't want to disable OpenGL?
Something to consider here (from LLVMpipe's page):
Also, the driver is multithreaded to take advantage of multiple CPU cores (up to 32 at this time).
Threads are junk in the box, so whatever speed measurements you end up getting with libTAS + forced software rendering are likely going to be way better than what the box will give you.
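For a measurement closer to what the box might give, one option is to benchmark LLVMpipe with its worker threads turned off. The sketch below only builds the environment for such a run. `LIBGL_ALWAYS_SOFTWARE`, `GALLIUM_DRIVER` and `LP_NUM_THREADS` are Mesa environment variables as I understand its documentation; the helper name and the `ruffle game.swf` command line are hypothetical.

```python
import os

def single_threaded_llvmpipe_env(base_env=None):
    """Return a copy of the environment that forces LLVMpipe without worker threads."""
    env = dict(os.environ if base_env is None else base_env)
    env["LIBGL_ALWAYS_SOFTWARE"] = "1"   # Mesa: use a software rasterizer for GL
    env["GALLIUM_DRIVER"] = "llvmpipe"   # Mesa: pick LLVMpipe among the software drivers
    env["LP_NUM_THREADS"] = "0"          # LLVMpipe: 0 turns rendering threads off
    return env

# Hypothetical usage; substitute the real program and arguments to benchmark:
#   import subprocess
#   subprocess.run(["ruffle", "game.swf"], env=single_threaded_llvmpipe_env())
```

This still runs outside the box, so it only removes the multithreading advantage; any other waterbox overhead is not modelled.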
As done in the melonDS core, you can waterbox code that interacts with OpenGL (possibly Vulkan too, but I haven't looked deep enough into it). It does require some very liberal use of ECL_INVISIBLE and some magic to handle the runtime ABI differences: the core is SysV ABI, but GL function pointers may be MS ABI or SysV ABI. And yes, this means committing the evil of calling function pointers directly instead of going through the standard callback wrapping, so there is some minor potential compromise to determinism since there is no stack marshalling. The sheer number of GL function pointers is what makes this necessary. If there are readbacks, expect determinism to be GPU-specific, and maybe some potential non-determinism with savestates (readbacks are possible in Flash, so Ruffle's case will suffer from this). If there are no readbacks, this should generally be safe for determinism.
Also, LLVMpipe would internally use a JIT here. In principle a JIT can work in the box, but it generally needs modifications to work in the box at all (as some techniques will not work), and without modifications it will generally balloon the state size. Some JIT designs might not be usable at all within the box.
~If we want JIT-in-the-box, it would be best to start with the simplest smallest JIT-capable core or lib we can find that does something useful. That would let us prove the potential value of the method before going too deep on something that's not clear to provide benefit.~ Edit: Never mind, guess I don't follow this much anymore, didn't know we already had working JIT.
@zeromus what do you think?
JIT-in-the-box already exists in ares64. Again, it really depends on the JIT design. If the JIT just uses a small, fixed, preallocated pool of memory (not growing arbitrarily), and doesn't invoke any fastmem tricks (probably not relevant for LLVMpipe anyways, just other emulator JITs), then a JIT could work. "Small" means whatever you're comfortable putting into a savestate (it's 8 MiB total for ares64, 4 MiB for each of the 2 JITs). If you're comfortable making it invisible, I guess it could be rather large (JIT invalidation might not work without desyncs for some JITs).
Since @nattthebear is non-trivial to catch on IRC while I remember about this, I guess we need a ticket.
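The "small fixed preallocated pool" idea can be sketched as a toy model. This is not how ares64 or the waterbox actually implement it; the class and method names are made up for illustration. The point is only that the pool is allocated once, gets flushed instead of growing, and therefore adds a bounded, known amount to every savestate.

```python
class FixedCodeCache:
    """Toy JIT code cache backed by one preallocated, fixed-size pool."""

    def __init__(self, size):
        self.pool = bytearray(size)  # allocated once, never grows
        self.used = 0                # bump-allocation cursor
        self.blocks = {}             # guest address -> (offset, length)

    def emit(self, guest_addr, code):
        """Store a compiled block and return its offset in the pool."""
        if len(code) > len(self.pool):
            raise ValueError("block larger than the whole pool")
        if self.used + len(code) > len(self.pool):
            self.flush()             # pool full: drop everything, recompile lazily
        offset = self.used
        self.pool[offset:offset + len(code)] = code
        self.used += len(code)
        self.blocks[guest_addr] = (offset, len(code))
        return offset

    def lookup(self, guest_addr):
        """Return the compiled bytes for guest_addr, or None if not cached."""
        entry = self.blocks.get(guest_addr)
        if entry is None:
            return None
        offset, length = entry
        return bytes(self.pool[offset:offset + length])

    def flush(self):
        self.used = 0
        self.blocks.clear()

    def save_state(self):
        # The pool contributes exactly len(self.pool) bytes to every savestate.
        return (bytes(self.pool), self.used, dict(self.blocks))

    def load_state(self, state):
        pool, used, blocks = state
        self.pool[:] = pool          # same length, so the pool keeps its size
        self.used = used
        self.blocks = dict(blocks)


cache = FixedCodeCache(16)
cache.emit(0x1000, b"\x90" * 10)
snapshot = cache.save_state()
cache.emit(0x2000, b"\xc3" * 10)     # does not fit: flushes, then reuses offset 0
cache.load_state(snapshot)           # the 0x1000 block is back, pool still 16 bytes
```

A design that grows its code cache on demand has no such bound, which is the state-size ballooning mentioned earlier in the thread.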
libTAS uses LLVMpipe to simulate hardware graphics stuff in software and make savestates of that.
We know that waterbox can't snapshot GPU state, but would it be feasible to incorporate LLVMpipe into waterboxhost so cores that require a GPU (like Ruffle) could be ported?