tytso / xfstests-bld

Creates a file system / storage test appliance which can be run using KVM, GCE, and Android
GNU General Public License v2.0

Touch is generating unexpected output #34

Open alexandrasp opened 2 years ago

alexandrasp commented 2 years ago

I've noticed a problem that could be related to the touch version while running xfstests with kvm-xfstests.

For test cases generic/634 and generic/635, the .bad files contain the following output:

QA output created by 634
touch: invalid date format 'Feb 22 22:22:22 UTC 2222'
Silence is golden.
QA output created by 635
touch: invalid date format 'Feb 22 22:22:22 UTC 2222'

Maybe updating touch in the VM could solve this.

tytso commented 2 years ago

It's not failing for me, and I've double-checked the test appliance which I uploaded to kernel.org.

Did you rebuild the test appliance? If so, what build environment did you use? Did you use a build chroot, as documented here[1]? The test appliance VM image I currently distribute is built on Debian Bullseye (the current Debian Stable):

blktests 4e07b0c (Fri, 15 Jul 2022 14:40:03 +0900)
fio fio-3.31 (Tue, 9 Aug 2022 14:41:25 -0600)
fsverity v1.5 (Sun, 6 Feb 2022 10:59:13 -0800)
ima-evm-utils v1.3.2 (Wed, 28 Oct 2020 13:18:08 -0400)
nvme-cli v1.16 (Thu, 11 Nov 2021 13:09:06 -0800)
quota v4.05-43-gd2256ac (Fri, 17 Sep 2021 14:04:16 +0200)
util-linux v2.38.1 (Thu, 4 Aug 2022 11:06:21 +0200)
xfsprogs v5.19.0 (Fri, 12 Aug 2022 13:45:01 -0500)
xfstests v2022.08.21-8-g289f50f8 (Sun, 21 Aug 2022 15:21:34 -0400)
xfstests-bld bb566bcf (Wed, 24 Aug 2022 23:07:24 -0400)
zz_build-distro bullseye

[1] https://github.com/tytso/xfstests-bld/blob/master/Documentation/building-xfstests.md

If you are using some other build OS image, then there might be issues caused by the version of various distro-provided binaries, yes. That's ultimately a matter for upstream xfstests to deal with. Look at common/filter to see how we accommodate output changes, which are the most common way unstable upstream dependencies make life hard for us. For example:

commit 3b878b60a77e01656101df07e2e91a6600001903
Author: Gabriel Krisman Bertazi <krisman@collabora.com>
Date: Fri May 20 13:18:52 2022 -0400

generic/556: Filter touch error message

Coreutils commit d435cfc0bc55 ("touch: fix wrong
diagnostic (Bug#48106)"), released in coreutils v9.0, changed the error
reported by the tool when openat() fails with EINVAL.  Instead of
reporting a generic message for the failure of either openat() or the
following utimensat(), it now differentiates both failures with
different messages.

This change breaks generic/556, which relied on the parsing of that
message.  This test was originally developed by me on a Debian
Buster (coreutils v8.x), so I used the generic error message.  Now that
I tried to run it on a more modern distro, it reports a different error
message, which fails the test, as shown below:

  output mismatch (see /tmp/results/generic/556.out.bad)
      --- tests/generic/556.out 2022-05-20 13:15:00.447525770 -0400
      +++ /tmp/results/generic/556.out.bad      2022-05-20 13:15:24.988167427 -0400
      @@ -12,5 +12,5 @@
       # file: SCRATCH_MNT/xattrs/x/f1
       user.foo="bar"

      -touch: setting times of 'SCRATCH_MNT/strict/corac'$'\314\247\303': Invalid argument
      -touch: setting times of 'SCRATCH_MNT/strict/cora'$'\303\247\303': Invalid argument
      +touch: cannot touch 'SCRATCH_MNT/strict/corac'$'\314\247\303': Invalid argument
      +touch: cannot touch 'SCRATCH_MNT/strict/cora'$'\303\247\303': Invalid argument
      ...

The fix filters out the touch-specific parts of the touch error
messages, to prevent breakage from future changes, but preserves the
return code information, which is actually useful (and more stable).

There is no change in behavior on the kernel side, just a broken test.
On both older and new distros, the kernel correctly rejects this invalid
sequence with -EINVAL, as shown in the strace hunk below:

  [...]
  openat(AT_FDCWD, "/scratch_mnt/strict/corac\314\247\303", ...) = -1 EINVAL
  utimensat(AT_FDCWD, "/scratch_mnt/strict/corac\314\247\303", ...) = -1 EINVAL
  [...]

Tested on Debian sid (coreutils v8.32) and Fedora (coreutils 9.0).

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Zorro Lang <zlang@kernel.org>
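
The filtering idea described in that commit message can be sketched roughly as follows. This is only an illustration in Python, not the actual xfstests change (the real filters are shell/sed helpers); the name `filter_touch_error` and the regular expression are assumptions built from the two message variants quoted in the diff above.

```python
import re

# Hypothetical sketch: the pattern is an assumption based only on the two
# touch messages quoted in the commit message, not the upstream filter.
_TOUCH_ERROR = re.compile(
    r"^touch: (?:setting times of|cannot touch) (?P<path>.+): (?P<reason>[^:]+)$"
)


def filter_touch_error(line: str) -> str:
    """Drop the version-specific wording, keep the path and the errno text."""
    match = _TOUCH_ERROR.match(line)
    if match is None:
        return line  # not a recognized touch failure; pass through unchanged
    return f"touch: {match.group('path')}: {match.group('reason')}"


# The coreutils 8.x and 9.0 messages normalize to the same line.
old = r"touch: setting times of 'SCRATCH_MNT/strict/corac'$'\314\247\303': Invalid argument"
new = r"touch: cannot touch 'SCRATCH_MNT/strict/corac'$'\314\247\303': Invalid argument"
print(filter_touch_error(old) == filter_touch_error(new))
```

With a filter along these lines, the golden output no longer depends on which coreutils release produced the message, while the "Invalid argument" part that reflects the kernel's EINVAL is still checked.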

In any case, I can't reproduce your problem with the version of the test appliance I am currently using, or the version of the test appliance which I uploaded to kernel.org. Can you say a bit more about the version of the test appliance you are using, and how it was built?

gwendalcr commented 1 year ago

I see the same issue with the latest KVM image from https://www.kernel.org/pub/linux/kernel/people/tytso/kvm-xfstests/root_fs.img.i386:

FSTESTVER: blktests 676d42c (Thu, 2 Mar 2023 15:25:44 +0900)
FSTESTVER: e2fsprogs archive/debian/1.47.0-1 (Mon, 6 Feb 2023 22:36:16 -0500)
FSTESTVER: fio fio-3.31 (Tue, 9 Aug 2022 14:41:25 -0600)
FSTESTVER: fsverity v1.5-6-g5d6f7c4 (Mon, 30 Jan 2023 23:22:45 -0800)
FSTESTVER: ima-evm-utils v1.3.2 (Wed, 28 Oct 2020 13:18:08 -0400)
FSTESTVER: nvme-cli v1.16 (Thu, 11 Nov 2021 13:09:06 -0800)
FSTESTVER: quota v4.05-53-gd90b7d5 (Tue, 6 Dec 2022 12:59:03 +0100)
FSTESTVER: util-linux v2.38.1 (Thu, 4 Aug 2022 11:06:21 +0200)
FSTESTVER: xfsprogs v6.1.1 (Fri, 13 Jan 2023 19:06:37 +0100)
FSTESTVER: xfstests v2023.02.26-8-g821ef488 (Thu, 2 Mar 2023 10:23:51 -0500)
FSTESTVER: xfstests-bld 35650073 (Mon, 6 Mar 2023 20:48:08 -0500)
FSTESTVER: zz_build-distro bullseye
FSTESTCFG: 4k
FSTESTSET: generic/634
FSTESTOPT: aex

root@kvm-xfstests:~# touch --version
touch (GNU coreutils) 8.32

root@kvm-xfstests:~# touch -d 'Feb 22 22:22:22 UTC 2222' /tmp/test
touch: invalid date format ‘Feb 22 22:22:22 UTC 2222’

It is not easy to reproduce: I could not get it to fail again after rebooting the VM and rerunning the smoke test.

Enclosed are the strace outputs from inside KVM, from my workstation, and from a run where it works.

In the bad case, touch is working in 64bit mode and wants to open "/usr/share/zoneinfo/XXX+00".

tytso commented 1 year ago

I can't replicate what you are reporting at all. Do you have TZ set to something unusual/strange? Either in ~/.config/kvm-xfstests or in your environment, perhaps? Specifically, what is getting passed to the guest VM as fstesttz? If you run "kvm-xfstests --no-action shell", you should see something like this at the end of the command line invoking qemu:

.... cmd=maint fstesttz=America/New_York fstesttyp=ext4 fstestapi=1.5 ...
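
As a quick way to check that value, here is a minimal sketch that pulls `fstesttz` out of such a parameter string. It assumes the parameters are whitespace-separated `key=value` tokens, as in the excerpt above; `parse_cmdline` is a hypothetical helper, not part of kvm-xfstests.

```python
def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel-style command line into a dict of key=value parameters."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore bare flags that carry no '='
            params[key] = value
    return params


# Sample taken from the excerpt above; a real command line has more tokens.
sample = "cmd=maint fstesttz=America/New_York fstesttyp=ext4 fstestapi=1.5"
params = parse_cmdline(sample)
print(params.get("fstesttz", "<not set>"))
```

An unexpected or missing `fstesttz` value here would point toward an environmental cause rather than the touch binary itself.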

When I run your experiment, it works just fine:

root@kvm-xfstests:~# touch -d 'Feb 22 22:22:22 UTC 2222' /tmp/test
root@kvm-xfstests:~# stat /tmp/test
  File: /tmp/test
  Size: 0               Blocks: 0          IO Block: 4096   regular empty file
Device: 1ch/28d Inode: 71          Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2222-02-22 17:22:22.000000000 -0500
Modify: 2222-02-22 17:22:22.000000000 -0500
Change: 2023-03-28 19:35:10.424249371 -0400
 Birth: 2023-03-28 19:35:10.424249371 -0400
root@kvm-xfstests:~# date
Tue Mar 28 19:35:16 EDT 2023
root@kvm-xfstests:~# touch --version
touch (GNU coreutils) 8.32
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Paul Rubin, Arnold Robbins, Jim Kingdon,
David MacKenzie, and Randy Smith.
root@kvm-xfstests:~# dpkg -l coreutils
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Architecture Description
+++-==============-============-============-=================================
ii  coreutils      8.32-4+b1    amd64        GNU core utilities

So I don't know what to tell you. When you tell me that "it's not easy to reproduce", that suggests that it's not a fundamental issue with the version of touch, but something environmental --- for example, if you haven't set TZ in ~/.config/kvm-xfstests, and TZ is set to something weird in your environment.

From the root shell in kvm-xfstests, what does cat /etc/timezone and ls -l /etc/localtime report?

tytso commented 1 year ago

Ah.... I figured out what is happening. If you build a 32-bit kernel, then we will use a 32-bit root filesystem image for the OS.

For example:

% install-kernel --arch i386
...
% kbuild --arch i386
...
make[1]: Leaving directory '/build/ext4-32'
...
% kvm-xfstests --kernel /build/ext4-32 --no-action shell
...  -drive file=/usr/projects/xfstests-bld/build-64/test-appliance/root_fs.img.i386,if=virtio,snapshot=on ...

What's happening is that kvm-xfstests is auto-detecting that the kernel is a 32-bit i386 kernel, so it's using the i386 root_fs.img. (This is relatively new behavior, and was added as part of adding support for arm64. So you can now do install-kconfig --arch arm64 ; kbuild --arch arm64 ; kvm-xfstests shell and that will build an arm64 kernel and run it in qemu emulating an arm64 --- and at that point the "kvm" in kvm-xfstests isn't really accurate since kvm refers to an x86 specific hardware-assisted virtualization, but oh, well.)
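
To illustrate the selection step, here is a hedged sketch of mapping a detected architecture to an image file. It is not the kvm-xfstests implementation (the real detection logic is not shown in this thread); `rootfs_image` and the alias table are hypothetical, and the only naming evidence used is the `root_fs.img.i386` file name seen above.

```python
from pathlib import Path

# Assumption: machine names as reported by `uname -m`, mapped to
# Debian-style architecture names. This table is illustrative only.
ARCH_ALIASES = {"x86_64": "amd64", "i686": "i386", "aarch64": "arm64"}


def rootfs_image(appliance_dir: str, arch: str) -> Path:
    """Return the per-architecture root image path, e.g. root_fs.img.i386."""
    arch = ARCH_ALIASES.get(arch, arch)
    return Path(appliance_dir) / f"root_fs.img.{arch}"


# Hypothetical directory; a 32-bit kernel ends up with the i386 image.
print(rootfs_image("test-appliance", "i686").name)
```

The point is only that a 32-bit kernel silently brings a 32-bit userspace with it, which is what matters for the touch failure.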

In any case, the problem is that 'Feb 22 22:22:22 UTC 2222' can't be expressed using a 32-bit time_t. It requires a 64-bit time_t, and this is something that Debian doesn't currently support on i386. See the slides[1] from a talk given at the 2023 FOSDEM conference and the Debian wiki page discussing a future release goal[2].

[1] http://wookware.org/talks/yr2038-fosdem.pdf
[2] https://wiki.debian.org/ReleaseGoals/64bit-time
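
The arithmetic behind that claim can be checked with a short sketch. It assumes a signed 32-bit time_t counting seconds since the Unix epoch; the helper name `fits_in_time32` is made up for this illustration.

```python
from datetime import datetime, timezone

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_in_time32(seconds: int) -> bool:
    """True if the value is representable in a signed 32-bit time_t."""
    return INT32_MIN <= seconds <= INT32_MAX


# The date used by generic/634 and generic/635, as seconds since the epoch.
target = int(datetime(2222, 2, 22, 22, 22, 22, tzinfo=timezone.utc).timestamp())

# The last instant a signed 32-bit time_t can represent.
last32 = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)

print("target seconds:", target)
print("fits in 32-bit time_t:", fits_in_time32(target))
print("32-bit time_t ends at:", last32.isoformat())
```

The target value is several times larger than the 32-bit limit, which falls in January 2038, so a touch binary built with a 32-bit time_t has no way to represent the requested date.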

I'm not sure whether there is going to be volunteer effort available to address this for 32-bit x86. From the slides[1] it appears there is more interest in addressing it for the 32-bit armhf build. The problem with using a 64-bit time_t is that this breaks all shared libraries, so that means you need to handle this the same way we handled the libc5 to libc6 transition (and that was painful, and happened in the late 1990's --- so if you are younger than 25 years old, it happened before you were born :-).

In any case, I'm not sure there's much to do here. Ubuntu is about to drop 32-bit i386 support (after a huge outcry from the gaming community, they built a subset of the packages for 32-bit x86)[3] and said next release for sure, it'll be gone. Android is experimenting with dropping 32-bit support starting with the Pixel 7 devices[4]. Aside from radiation-hardened x86 chips for space, you can't even buy 32-bit-only x86 chips from Intel any more (and I'm not sure those are actually still available in 2023); ARM Cortex-A processors will be 64-bit only starting this year[5]. So from both the hardware and distribution point of view, 32-bit, if not dead, is definitely "pining for the fjords"[6].

[3] https://www.theregister.com/2019/06/19/ubuntu_axes_i386_support/
[4] https://www.androidpolice.com/pixel-7-and-pixel-7-pro-32-bit-android-apps/
[5] https://www.xda-developers.com/arm-future-chips-32-bit-2023/
[6] https://www.youtube.com/watch?v=vZw35VUBdzo

So I'm inclined to close this as a WONTFIX. If there is someone who wants to volunteer to help Debian shepherd a new shared library ABI which supports a 64-bit time_t for i386, that's best done in Debian. Given that Debian Bookworm (Debian 12) is currently frozen for release, the earliest this could happen would be Debian Trixie (Debian 13), which will probably be released in 2025. And the effort in doing the 64-bit time_t migration for i386 is such that you had better start now, if you want to have a chance of it happening....