taviso / 123elf

A native port of Lotus 1-2-3 to Linux.
1.17k stars 59 forks source link

Cleaner and more verbose extract.sh; drop bash dependency #67

Closed vrza closed 2 years ago

vrza commented 2 years ago

Tidying up this script a bit as well.

taviso commented 2 years ago

Looks good to me, what do you think about using set -e?

vrza commented 2 years ago

I prefer explicit error handling and almost never use set -e in scripts.

In this particular case, while set -e would exit the script early, it would still not perform clean up (removing root/ and orig/) which might be necessary in order to re-run the script.

We could add a cleanup function, either to be invoked by the user manually (extract clean), or we can run it and exit on any error.

While in theory cleanup could be triggered by using set -e with trap ERR, in practice it's fiddly, as some commands might exit a non-zero status during normal script operation.

https://mywiki.wooledge.org/BashFAQ/105

Somewhat related thoughts, for your consideration:

vrza commented 2 years ago

Made the script more portable, #62 is now partially fixed, but libarchive cpio aka bsdcpio failing to extract 123UNIX4.IMG is something that still needs looking into.

vrza commented 2 years ago

Fully fixed #62.

vrza commented 2 years ago

Begginings of123UNIX2.IMG and 123UNIX4.IMG seem to be the same:

$ cmp 123UNIX2.IMG 123UNIX4.IMG
123UNIX2.IMG 123UNIX4.IMG differ: byte 36865, line 489 

This is where in 123UNIX4.IMG, 123_sysV.hlp.z seems to be corrupt (gunzip detects corruption in compressed data).

123.o.z_2 is found deep in 123UNIX4.IMG, 550kB in, in cpio format. GNU cpio is able to skip to this file automatically, but BSD (libarchive) cpio is not.

It's hard to say how and when, but it almost seems like 123UNIX4.IMG was at some point partially overwritten with data from 123UNIX2.IMG.

taviso commented 2 years ago

Thanks this looks good to me - I'm tempted to wait to merge until the keymap tool is done #65 (hopefully tomorrow), because right now it just uses xterm for everything. I think I can just pull all the right sequences from termcap and it will be a pretty good default.

taviso commented 2 years ago

Never mind, I changed my mind, let's just commit it for now.

dtwilliamson commented 2 years ago

I recommend a habit of avoiding all-caps variable names in shell scripts to reduce the chance of collisions with shell and environment variable names.

http://mywiki.wooledge.org/BashGuide/InputAndOutput#The_Environment

vrza commented 2 years ago

We should definitely be careful not to clobber any exported variables, lest they be exported to our child processes inadvertently modified.

Regarding naming style, I try to stick to an old convention of using ALL_CAPS for constants (vars not modified after assignment) and exported vars. As a reference, Google shell scripting style guide also recommends this:

https://google.github.io/styleguide/shellguide.html#s7.3-constants-and-environment-variable-names

taviso commented 2 years ago

I'm hearing about issues with the new extract script - I'm tempted to revert it for now until I can test it more thoroughly.

(I just heard it hangs under WSL2, I don't really know why)

taviso commented 2 years ago

I figured out the problem, it wasn't hanging, disk i/o on wsl2 is just ridiculously slow - so the dd seeking with bs=1 was just taking forever.

I'll re-merge this tomorrow.

vrza commented 2 years ago

@taviso Try inverting the parameters (skip=1 bs=550536) to see if that helps improve performance?

The root cause of why 123UNIX4.IMG seems to be corrupt still needs investigating -- dd seek is a workaround for data corruption. Finding a pristine copy of that floppy disk image would be nice.

In the meantime, you could also just upload the trimmed version of 123UNIX4.IMG, so that the script doesn' have to skip corrupt data.

taviso commented 2 years ago

Yeah, It's really unfortunate - the original files I recovered were TD0 files, so I thought maybe the tool I used to recover the images were buggy - but I've tried three independent implementations and they all produce the same file.

I guess it's possible this is some form of copy protection, I know that the 123 DOS and Agenda disk images had some unusual properties that made them difficult to copy.