MatthewCroughan / NixThePlanet

Run macOS, Windows and more via a single Nix command, or simple nixosModules
MIT License
534 stars 12 forks source link

Implement makeWin98Image #9

Closed io12 closed 8 months ago

io12 commented 8 months ago

I tested this only on x86_64-linux. Sorry if some things weren't done the best way.

MatthewCroughan commented 8 months ago

You should add a repeatability test, since it will prove it works 100 times like the rest, I'm pretty impressed though.

io12 commented 8 months ago

You should add a repeatability test, since it will prove it works 100 times like the rest, I'm pretty impressed though.

Done. Do I need to run it myself or does the Hercules CI do it?

MatthewCroughan commented 8 months ago

@io12 I have to bring your PR into the repo as a branch, which I just did, now the CI will start running.

https://hercules-ci.com/github/MatthewCroughan/NixThePlanet/jobs/351

io12 commented 8 months ago

It seems to not be repeatable unfortunately due to keypresses dropping. I'll try to fix it.

MatthewCroughan commented 8 months ago

@io12 Yeah, one of the todos is perhaps to try xdotool to get the keypresses to stop dropping, I've found vncdo to be terrible.

io12 commented 8 months ago

Okay, I think it's more reliable now. Can you pull from the branch to run the CI again?

MatthewCroughan commented 8 months ago

Looks like it has worked 21 times so far, waiting for the 100, and then we can merge and document it :)

https://hercules-ci.com/accounts/github/MatthewCroughan/derivations/%2Fnix%2Fstore%2F05m71z705izxvwc4xg6sz6sxkxbsy89s-win98.img-linkfarm-100.drv?via-job=44e50687-782f-42f5-aadc-95bb9f7274da

MatthewCroughan commented 8 months ago

@io12 It appears that 10 of the instances have gotten stuck at the following step

image

Whereas two others are stuck at this without continuing image

MatthewCroughan commented 8 months ago

Windows/msdos releases in the repo that I've tested, only 1 in 1000 would fail due to what seems like runtime bugs with dosbox, which is fine. But the other failed states can be worked around. I'm happy with 1 in 1000 failure rate, maybe even 1 in 100 is good enough for the purposes of the repo.

io12 commented 8 months ago

Kind of frustrating that it's stuck at the final step 🙃. I'm not sure whether it's not fully booted yet and isn't ready to open the start menu or some other issue. Also, how did you get the screenshots?

MatthewCroughan commented 8 months ago

Nix does its work in /tmp, so I just go in there and extract the screenshots, same way as the macos repeatability test defined in this repo works.

io12 commented 8 months ago

I made more repeatability improvements. Can you run the CI again?

MatthewCroughan commented 8 months ago

@io12 some fail like this, maybe dosbox's fault?

image

Some seem to still get stuck at the end like this

image

MatthewCroughan commented 8 months ago

@io12 Statistically, your change did have an impact, as now at least 68 in 100 attempts work, as opposed to the previous 22.

MatthewCroughan commented 8 months ago

Maybe we could just kill -9 the process when it gets to the start menu instead of trying to shut down properly?

io12 commented 8 months ago

Maybe we could just kill -9 the process when it gets to the start menu instead of trying to shut down properly?

That would work but then there's a ScanDisk improper shutdown message on the next boot. I think the issue is sometimes when the start menu is open before the "welcome to windows" window, the start menu can close. I think maybe the failing cases try to select the shutdown option after this happens. I'm trying to fix this by repeating the ctrl-esc u enter combination until the desktop icons and start menu disappear.

MatthewCroughan commented 8 months ago

@io12 rarely, this failure happens

image

io12 commented 8 months ago

Okay, I think I fixed both issues and it repeats locally 16 times. Hopefully it works all 100 times now.

MatthewCroughan commented 8 months ago

@io12

Wow, the failure rate now seems to be 3 in 100.

The three that have failed, have failed in a strange way though, as they have gotten to the ### OMG DID IT WORK???!!!! ### log line. Here are their logs for you to inspect.

There are the last three screenshots that were taken by the expect script, but the screnshots are no longer being taken by the process, as their last modified time is no longer updating periodically.

image image image

io12 commented 8 months ago

Thanks! Such a rare issue is pretty tricky to debug, but I removed a code path left over from a different approach that triggers in all three and doesn't trigger normally. Even if it doesn't fix the issue it should at least give more up-to-date screenshots.

io12 commented 8 months ago

Also, I wonder how hard it would be to patch dosbox-x to fix the non-repeatability issues. If the issue comes from the timing of keypresses, maybe we can make dosbox-x take an input file that automatically triggers keypresses at specified cycle counts? I'm not sure if this would actually make things fully deterministic, but maybe worth a shot? And I don't think this is the first time Nix packages have patched build tools to make them more deterministic?

MatthewCroughan commented 8 months ago

1 in 100 seems end up like this LOL!

image

Whereas there are one other stuck like this

image

MatthewCroughan commented 8 months ago

@io12 Have you got a matrix or discord we can talk more actively on?