sfjro / aufs-standalone


kernel 5.16.12 & 5.15.25 & 5.10.103 issue with /aufs/fsctx.c #9

Closed peabee closed 2 years ago

peabee commented 2 years ago

An issue has arisen with kernel 5.16.12 patched with aufs and used in Puppy Linux. The issue was not present in kernel 5.16.7. The system crashes at the first shutdown, when a new persistent user save area should be created. The crash message mentions au_fsctx_parse_monolithic, which is a function in aufs/fsctx.c.

IMG2_20220306

peabee commented 2 years ago

Kernel 5.15.25 also crashes on shutdown.

sfjro commented 2 years ago

Hello PB,

I've got an empty message from you via github. But it's OK since I've checked https://github.com/sfjro/aufs5-standalone/issues/9 and saw the problem. I cannot reproduce the problem but I can guess the cause. Here is a patch for you. Please test.

J. R. Okajima

sfjro commented 2 years ago


And next time when you report a problem, please describe the first line of the kernel message, such as "NULL pointer is accessed".

J. R. Okajima

sfjro commented 2 years ago


Ah, github ate the patch attached to my previous messages. Here I include it as plain text. Please test.

J. R. Okajima

diff --git a/fs/aufs/fsctx.c b/fs/aufs/fsctx.c
index aa9b444438c20..739531e2d2908 100644
--- a/fs/aufs/fsctx.c
+++ b/fs/aufs/fsctx.c
@@ -1091,7 +1091,7 @@ static int au_fsctx_parse_monolithic(struct fs_context *fc, void *data)

 	str = data;
 	AuDbg("str %s\n", str);

peabee commented 2 years ago

My apologies for the blank message - I accidentally hit the button before completing the entry - sorry. I believe that I have done a build of 5.16.12 with the patch you have suggested, but sadly the crash still occurs. The crash report says:

BUG: kernel NULL pointer dereference

PF: supervisor read access in kernel mode

PF: error_code(0x0000) - not-present page

sfjro commented 2 years ago

PB:

I believe that I have done a build of 5.16.12 with the patch you have suggested, but sadly the crash still occurs. The crash report says:

BUG: kernel NULL pointer dereference

Hmm, am I going in totally the wrong direction? (I'm talking to myself.) To confirm this, would you try this patch and test again?

J. R. Okajima

diff --git a/fs/aufs/fsctx.c b/fs/aufs/fsctx.c
index e5622fc17..23c3fa19a 100644
--- a/fs/aufs/fsctx.c
+++ b/fs/aufs/fsctx.c
@@ -1100,8 +1100,12 @@ static int au_fsctx_parse_monolithic(struct fs_context *fc, void *data)
 	int err;
 	unsigned int u;
 	char *str;

+out:
 	return err;
 }

peabee commented 2 years ago

Yes - thank you - with the 2nd patch the crashes stop on kernel 5.16.12 :-)) I can't apply the patch to the 5.15.25 kernel due to the way it is built. I also didn't check whether 5.10 is affected, but can do so if needed.

static int au_fsctx_parse_monolithic(struct fs_context *fc, void *data)
{
    int err;
    unsigned int u;
    char *str;
    struct au_fsctx_opts *a;

    err = 0;
    if (!fc || !data)
        goto out;
    a = fc->fs_private;
    str = data;
    AuDbg("str %s\n", str);
    while (str) {
        u = is_colonopt(str);
        if (u)
            str[u] = '=';
        str = strchr(str, ',');
        if (!str)
            break;
        str++;
    }
    str = data;
    AuDbg("str %s\n", str);

    err = generic_parse_monolithic(fc, str);
    AuTraceErr(err);
    au_fsctx_dump(&a->opts);

out:
    return err;
}

peabee commented 2 years ago

Have now checked - 5.10.103 also needs the patch

sfjro commented 2 years ago

PB:

Yes - thank you - with the 2nd patch the crashes stop......... kernel 5.16.12 :-))

Thanx for testing. But I am not yet fully convinced. Did you enable CONFIG_AUFS_DEBUG in the kernel configuration? And did you set the module parameter "debug=1" or run "AuDebug 1" (which is a shell function defined in /etc/default/aufs)? Otherwise the problem should not happen, I am afraid.

The file fs/aufs/fsctx.c was introduced in aufs5.10, and every version since then potentially has this bug. The reproducible conditions are

If I send you another patch, kindly would you test it?

J. R. Okajima

peabee commented 2 years ago

# cat DOTconfig-5.16.12-lxpup64 | grep AUFS
CONFIG_AUFS_FS=y
CONFIG_AUFS_BRANCH_MAX_127=y
# CONFIG_AUFS_BRANCH_MAX_511 is not set
# CONFIG_AUFS_BRANCH_MAX_1023 is not set
# CONFIG_AUFS_BRANCH_MAX_32767 is not set
CONFIG_AUFS_SBILIST=y
CONFIG_AUFS_HNOTIFY=y
CONFIG_AUFS_HFSNOTIFY=y
CONFIG_AUFS_EXPORT=y
CONFIG_AUFS_INO_T_64=y
CONFIG_AUFS_XATTR=y
# CONFIG_AUFS_FHSM is not set
# CONFIG_AUFS_RDU is not set
# CONFIG_AUFS_DIRREN is not set
# CONFIG_AUFS_SHWH is not set
# CONFIG_AUFS_BR_RAMFS is not set
# CONFIG_AUFS_BR_FUSE is not set
# CONFIG_AUFS_BR_HFSPLUS is not set
CONFIG_AUFS_BDEV_LOOP=y
# CONFIG_AUFS_DEBUG is not set

CONFIG_AUFS_DEBUG is not set

If I send you another patch, kindly would you test it?

Certainly

sfjro commented 2 years ago

PB:

CONFIG_AUFS_DEBUG is not set

OK, thanx.

If I send you another patch, kindly would you test it?

Certainly

I was going to send you another debugging patch, but I still don't understand why my first patch didn't solve the problem. It was a one-liner patch, replacing "while (1)" by "while (str)". I have simulated your case with chroot, and succeeded in reproducing the problem. And the one-liner patch solved the problem. But not on your side. I can't understand the situation. So, shamelessly, I'd ask you to try the first one-liner patch again, please.

J. R. Okajima
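
For readers following along, here is a minimal userspace sketch of why the loop condition matters. It mirrors the option-rewriting loop in au_fsctx_parse_monolithic, but it is not the aufs source: `demo_is_colonopt` and `demo_rewrite_colonopts` are hypothetical names, and the list of option names that the stand-in helper recognises is an assumption made purely for illustration. The point is only that with "while (1)" a NULL option string would be handed to the helper and dereferenced, while "while (str)" skips the loop entirely.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the aufs helper is_colonopt(). It returns the
 * offset of the ':' when the option starts with "br:" or "dirs:", else 0.
 * The real helper is not shown in this thread, so both the recognised
 * names and this implementation are assumptions.
 */
static unsigned int demo_is_colonopt(const char *str)
{
	static const char *const names[] = { "br", "dirs" };
	size_t i, len;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		len = strlen(names[i]);
		if (!strncmp(str, names[i], len) && str[len] == ':')
			return (unsigned int)len;
	}
	return 0;
}

/*
 * Rewrite "name:value" options to "name=value" in place and return how
 * many options were rewritten. `data` may be NULL (no mount options).
 */
static int demo_rewrite_colonopts(char *data)
{
	int rewritten = 0;
	unsigned int u;
	char *str = data;

	/* "while (1)" here would pass NULL to demo_is_colonopt() and crash. */
	while (str) {
		u = demo_is_colonopt(str);
		if (u) {
			str[u] = '=';
			rewritten++;
		}
		/* Advance to the next comma-separated option, if any. */
		str = strchr(str, ',');
		if (!str)
			break;
		str++;
	}
	return rewritten;
}
```

Calling `demo_rewrite_colonopts(NULL)` simply returns 0, which corresponds to the mount-without-options case that the one-liner patch is meant to survive.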

peabee commented 2 years ago

And the one-liner patch solved the problem. But not on your side. I can't understand the situation. So, shamelessly, I'd ask you to try the first one-liner patch again, please.

Dear @sfjro, my sincere apologies - I do not know what happened, but I must not have applied the one-liner patch correctly somehow!

I can confirm, following a rebuild, that 5.16.12, with the one-liner patch, does NOT crash.

Very sorry to have confused matters. Regards, @peabee

sfjro commented 2 years ago

PB:

I can confirm, following a rebuild, that 5.16.12, with the one-liner patch, does NOT crash.

Very sorry to have confused matters

Thank you very much for testing several times. Now I can see the bug scenario clearly, and my sleepless nights end today. :-)

J. R. Okajima

sfjro commented 2 years ago

------- Blind-Carbon-Copy

From: "J. R. Okajima"
Subject: aufs5 GIT release (v5.17-rc7)
Date: Mon, 14 Mar 2022 12:33:48 +0900

o bugfix

J. R. Okajima


------- End of Blind-Carbon-Copy