Projeto-Pindorama / heirloom-ng

A collection of standard Unix utilities that is intended to provide maximum compatibility with traditional Unix while incorporating additional features necessary today.
http://heirloom-ng.pindorama.dob.jp
Other
24 stars 7 forks source link

tar(1): lzma, lzop, zstd support; fix infinite loop #10

Closed tch69 closed 1 year ago

tch69 commented 1 year ago

When tar couldn't find the appropriate command to decompress (e.g, xz for a .tar.xz tarball), the program is told to re-execute it over and over again, leading to a infinite loop.

However you have to press Enter to return to shell; which I don't know how to work around that

tch69 commented 1 year ago

EDIT. This breaks gzip and xz support

takusuman commented 1 year ago

When tar couldn't find the appropriate command to decompress (e.g, xz for a .tar.xz tarball), the program is told to re-execute it over and over again, leading to a infinite loop.

However you have to press Enter to return to shell; which I don't know how to work around that

I think it could just print a error message (something like "error: xz not found"), and then exit with an error code. What you think about?

EDIT: If it's not inconvenient, how does it break? Could you figure out why? Maybe we can fix it.

tch69 commented 1 year ago

There must be something wrong with the tarball detection part. I tried patching the infinite loop part only and apparently everything worked fine.

For the latter problem, turns out that I didn't have to actually press Enter at all. Somehow the prompt was just not showing up. That's even stranger. Sadly I only have little knowledge of C; hopefully you can fix it.

takusuman commented 1 year ago

There must be something wrong with the hardware the detection part. I tried patching the infinite loop part only and apparently everything worked fine.

For the latter problem, turns out that I didn't have to actually press Enter at all. Somehow the prompt was just not showing up. That's even stranger. Sadly I only have little knowledge of C; hopefully you can fix it.

Well, I will be compiling your branch here and see if this isn't a host-specific problem.

I will also let this pull request open; if I get any positive results, I will be using it for reporting.

takusuman commented 1 year ago

Strange. When I try to open a xz file now, it returns "tar: cannot read archive". Seeing what again was before we switched it for done(1), it seems like it throws back to the start of the if-statement:

static int
readtape(char *buffer)
{
    static int rd;
    int i = -1, j;

again:  if (recno >= nblock || first == 0) {
        if (first == 0 && nblock == 1 && bflag == 0)
            j = NBLOCK;
        else
            j = nblock;
        if (volsize % TBLOCK*j) {
            fprintf(stderr, "%s: Volume size not a multiple "
                    "of block size.\n", progname);
            done(1);
        }
        if ((rd = i = mtread(tbuf, TBLOCK*j)) < 0) {
            fprintf(stderr, "%s: tape read error\n", progname);
            done(3);
        }
        if (maybezip && checkzip(tbuf[0].dummy, i) == 1)
            goto again;
        if (first == 0 || i == 0) {
            if ((i % TBLOCK) != 0 || i == 0) {
            tbe:    fprintf(stderr, "%s: tape blocksize error\n",
                        progname);
                done(3);
            }
            i /= TBLOCK;
            if (i != nblock) {
                if ((mtstat.st_mode&S_IFMT) == S_IFCHR &&
                        (i != 1 || bflag >= B_DEFN))
                    fprintf(stderr, "%s: blocksize = %d\n",
                        progname, i);
                nblock = i;
            }
            if (tapeblock == 0)
                tapeblock = i;
        }
        recno = 0;
    }
    first = 1;
    if ((rd -= TBLOCK) < 0)
        goto tbe;
    memcpy(buffer, &tbuf[recno++], TBLOCK);
    return(TBLOCK);
}

What?

tch69 commented 1 year ago

I haven't revert the new commit on the tarball detection part yet. Somehow that's the problem.

takusuman commented 1 year ago

I haven't revert the new commit on the tarball detection part yet. Somehow that's the problem.

Strange enough, even when I revert done(1) to goto again, the error persists. I will try your latest commit and, if we got any positive results, there's a merge.

takusuman commented 1 year ago

Well, maybe this is a temporary fix?

--- heirloom-ng.old/tar/tar.c   2022-10-24 22:28:22.711853994 -0300
+++ heirloom-ng/tar/tar.c   2022-10-24 22:30:26.418057555 -0300
@@ -2101,7 +2101,7 @@
            done(3);
        }
        if (maybezip && checkzip(tbuf[0].dummy, i) == 1)
-           done(1);
+           goto again;
        if (first == 0 || i == 0) {
            if ((i % TBLOCK) != 0 || i == 0) {
            tbe:    fprintf(stderr, "%s: tape blocksize error\n",
@@ -2737,7 +2737,7 @@
 {
    if (rd <= TBLOCK || (memcmp(&bp[257], "ustar", 5) &&
            memcmp(&bp[257], "\0\0\0\0\0", 5))) {
-       iif (bp[0] == 'B' && bp[1] == 'Z' && bp[2] == 'h')
+       if (bp[0] == 'B' && bp[1] == 'Z' && bp[2] == 'h')
            return redirect("bzip2", "-cd", rd);
        else if (bp[0] == '\37' && bp[1] == '\235')
            return redirect("zcat", NULL, rd);
@@ -2745,8 +2745,10 @@
            return redirect("lzip", "-cd", rd);
        else if (bp[0] == '\37' && bp[1] == '\213')
            return redirect("gzip", "-cd", rd);
-       else if (bp[0] == '\xFD'&& bp[2] == '7' && bp[3] == 'z' && bp[4] == 'X' && bp[5] == 'Z');
+       else if (bp[0] == '\xFD'&& bp[2] == '7' && bp[3] == 'z' && bp[4] == 'X' && bp[5] == 'Z')
            return redirect("xz", "-cd", rd);
+       else if (bp[0] == '\x28' && bp[1] == '\xB5' && bp[2] == '\x2F' && bp[3] == '\xFD');
+           return redirect("zstd", "-cd", rd);
    }
    maybezip = 0;
    return -1;

I think this still have the issue of the infinite loop, but at least we've got back support for zstd and xz.

Maybe this entire code needs refactor, who knows?

tch69 commented 1 year ago

I think it's working now (maybe?)

Damn it, I wish I actually learnt how to write C codes and use GitHub properly. I can't really understand how stuff worked.

takusuman commented 1 year ago

I think it's working now (maybe?)

Damn it, I wish I actually learnt how to write C codes and use GitHub properly. I can't really understand how stuff worked.

Neither I. I don't know a lot about C, only the basics for doing small hacks and fixes. Well, for now, I think we can merge.

EDIT: It seems like when we remove goto again and replace it with done(), it stops working again for compressed files.

takusuman commented 1 year ago

EDIT: It seems like when we remove goto again and replace it with done(), it stops working again for compressed files.

I really want to find a fix for this. I'm a little bit busy this week, but I will be reading the code in my free time to see how this checkzip() function is called. I already have an idea of what can be happening, one minute.

EDIT²: I think this function working by "wrapping" the tarball. I mean that, if it's just a plain tarball, it will extract it as-is. But, if it is compressed, it will first decompress and then extract the tarball that is expected to be inside the compressed file.

arthurbacci commented 1 year ago

I think it's working now (maybe?) Damn it, I wish I actually learnt how to write C codes and use GitHub properly. I can't really understand how stuff worked.

Neither I. I don't know a lot about C, only the basics for doing small hacks and fixes. Well, for now, I think we can merge.

EDIT: It seems like when we remove goto again and replace it with done(), it stops working again for compressed files.

In my view, removing the extra semicolon fixes it. The "infinite loop" actually seems to happen only when the checkzip function is not working properly, which was the case with the typo. I think the "temporary fix" is actually a fix :)

takusuman commented 1 year ago

That's what I call "Computer Alchemy". 🤣

takusuman commented 1 year ago

@tch69 Thanks a lot for the patches. I will be closing this pull request, but I will be adding your patches on the code.

takusuman commented 1 year ago

https://github.com/Projeto-Pindorama/heirloom-ng/commit/14a2ca0fd8bb881d7089b7c4f6c805d9ef16fcf8