kdave / btrfs-progs

Development of userspace BTRFS tools
GNU General Public License v2.0
530 stars 240 forks source link

mkfs.btrfs fails on s390, possible endianness error #639

Closed cmurf closed 1 year ago

cmurf commented 1 year ago

Downstream Bug 2214522 - Fedora 37: mkfs.btrfs fails on s390 with btrfs-progs 6.3.1 release

# mkfs.btrfs /dev/dasdd1
btrfs-progs v6.3.1
See https://btrfs.readthedocs.io/ for more information.

ERROR: superblock magic doesn't match
NOTE: several default settings have changed in version 5.15, please make sure
      this does not affect your deployments:
      - DUP for metadata (-m dup)
      - enabled no-holes (-O no-holes)
      - enabled free-space-tree (-R free-space-tree)

Invalid mapping for 1081344-1097728, got 17592186044416-17592190238720
Couldn't map the block 1081344
ERROR: cannot read chunk root
ERROR: open ctree failed

System: Fedora 37 Fails on both 6.3 and 6.4 kernels.

cmurf commented 1 year ago

We didn't build btrfs-progs 6.3 in Fedora so it could be a regression in 6.3 or 6.3.1.

sharkcz commented 1 year ago

I am going to bisect it, easily reproduced with make test on a s390x machine. Unless someone beats me being faster :-)

sharkcz commented 1 year ago

looks to me 6.3 is OK

sharkcz commented 1 year ago

c979ffd7872257beeae9ddc930f2be2423597e07 is the first bad commit and 164bc10dfc08d8754d23ef0d6d7281e139d6cd53 is the last good

sharkcz commented 1 year ago

If access to a s390x system is needed, please let me know, I can arrange that.

adam900710 commented 1 year ago

Mind to give me the access to an s390x system? You can contact me through wqu@suse.com.

adam900710 commented 1 year ago

Great thanks to @sharkcz for the s390x machine.

The direct cause is pretty straightforward, by somehow preprocessor didn't define __BIG_ENDIAN__, so we got incorrect keys and screws up everything.

Still looking into what's the proper macro to use to determine the endian.

sharkcz commented 1 year ago

You might use AC_C_BIGENDIAN if autotools are being used or rely on the compiler and __BYTE_ORDER__ being __ORDER_LITTLE_ENDIAN__ or __ORDER_BIG_ENDIAN__. There are some guides online about the right detection procedures.

adam900710 commented 1 year ago

I found <bits/endian.h> from glibc providing all the endianness checks, thus I would go __BYTE_ORDER checks against __LITTLE_ENDIAN to send the proper fix.

The synced accessor.h is using the kernel endianness check, which doesn't work with user space headers.

sharkcz commented 1 year ago

There is already an endianness check in https://github.com/kdave/btrfs-progs/blob/master/configure.ac#L45, so I would probably go with it, see eg. https://github.com/chrislim2888/IP2Location-C-Library/pull/58/files for usage. Rather than introducing another dependency and method ...

kdave commented 1 year ago

The header is the copy from kernel so we need to use a check that's compatible on the source level, so the autoconf WORD_BIGENDIAN is not suitable here.

kdave commented 1 year ago

kerncompat.h or kernel-lib/bitops.h use bitops.h:#if __BYTE_ORDER == __BIG_ENDIAN

kdave commented 1 year ago

Qu's fix applied to devel.

sharkcz commented 1 year ago

The header is the copy from kernel so we need to use a check that's compatible on the source level, so the autoconf WORD_BIGENDIAN is not suitable here.

yes, it makes sense, thanks