Closed vrza closed 2 years ago
Looks good to me, what do you think about using set -e?
I prefer explicit error handling and almost never use set -e
in scripts.
In this particular case, while set -e
would exit the script early, it would still not perform clean up (removing root/
and orig/
) which might be necessary in order to re-run the script.
We could add a cleanup function, either to be invoked by the user manually (extract clean
), or we can run it and exit on any error.
While in theory cleanup could be triggered by using set -e
with trap ERR
, in practice it's fiddly, as some commands might exit a non-zero status during normal script operation.
https://mywiki.wooledge.org/BashFAQ/105
Somewhat related thoughts, for your consideration:
.sh
extension for shell scripts if they are meant to be used as a library, by invoking . lib.sh
from another scriptMade the script more portable, #62 is now partially fixed, but libarchive cpio aka bsdcpio failing to extract 123UNIX4.IMG is something that still needs looking into.
Fully fixed #62.
Begginings of123UNIX2.IMG
and 123UNIX4.IMG
seem to be the same:
$ cmp 123UNIX2.IMG 123UNIX4.IMG
123UNIX2.IMG 123UNIX4.IMG differ: byte 36865, line 489
This is where in 123UNIX4.IMG
, 123_sysV.hlp.z
seems to be corrupt (gunzip detects corruption in compressed data).
123.o.z_2
is found deep in 123UNIX4.IMG
, 550kB in, in cpio format. GNU cpio is able to skip to this file automatically, but BSD (libarchive) cpio is not.
It's hard to say how and when, but it almost seems like 123UNIX4.IMG
was at some point partially overwritten with data from 123UNIX2.IMG
.
Thanks this looks good to me - I'm tempted to wait to merge until the keymap tool is done #65 (hopefully tomorrow), because right now it just uses xterm for everything. I think I can just pull all the right sequences from termcap and it will be a pretty good default.
Never mind, I changed my mind, let's just commit it for now.
I recommend a habit of avoiding all-caps variable names in shell scripts to reduce the chance of collisions with shell and environment variable names.
http://mywiki.wooledge.org/BashGuide/InputAndOutput#The_Environment
We should definitely be careful not to clobber any exported variables, lest they be exported to our child processes inadvertently modified.
Regarding naming style, I try to stick to an old convention of using ALL_CAPS for constants (vars not modified after assignment) and exported vars. As a reference, Google shell scripting style guide also recommends this:
https://google.github.io/styleguide/shellguide.html#s7.3-constants-and-environment-variable-names
I'm hearing about issues with the new extract script - I'm tempted to revert it for now until I can test it more thoroughly.
(I just heard it hangs under WSL2, I don't really know why)
I figured out the problem, it wasn't hanging, disk i/o on wsl2 is just ridiculously slow - so the dd seeking with bs=1 was just taking forever.
I'll re-merge this tomorrow.
@taviso Try inverting the parameters (skip=1 bs=550536
) to see if that helps improve performance?
The root cause of why 123UNIX4.IMG
seems to be corrupt still needs investigating -- dd
seek is a workaround for data corruption. Finding a pristine copy of that floppy disk image would be nice.
In the meantime, you could also just upload the trimmed version of 123UNIX4.IMG
, so that the script doesn' have to skip corrupt data.
Yeah, It's really unfortunate - the original files I recovered were TD0 files, so I thought maybe the tool I used to recover the images were buggy - but I've tried three independent implementations and they all produce the same file.
I guess it's possible this is some form of copy protection, I know that the 123 DOS and Agenda disk images had some unusual properties that made them difficult to copy.
Tidying up this script a bit as well.