openzfs / zfs

ZFS whole filesystem/volume corruption (ZFS-8000-8A) after receiving raw encrypted snapshots #9811

Closed PaulGrandperrin closed 3 years ago

PaulGrandperrin commented 4 years ago

Sender: up-to-date macOS Mojave with openZFS 1.9.3.1,64, kernel Darwin macbookpro2018-perso.local 19.2.0 Darwin Kernel Version 19.2.0: Sat Nov 9 03:47:04 PST 2019; root:xnu-6153.61.1~20/RELEASE_X86_64 x86_64

Receiver: up-to-date Debian Buster 10.2 with zfsonlinux 0.8.2-3~bpo10+1, kernel Linux debianas 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux

Summary

I sent a tree of raw encrypted incremental snapshots to a remote server and it corrupted all the corresponding remote volumes. I created a checkpoint and tried to roll back to the last snapshot but the volumes are still corrupted and data is still inaccessible. Rebooting didn't help.

There are no logs in dmesg about ZFS or IO errors.

I'm available to help as much as I can. I'm a huge fan of your work :-)

Here are more details:

Details

Volumes on the sender:

$ zfs list -t all
NAME                                          USED  AVAIL  REFER  MOUNTPOINT
backup                                       1.29T   474G   548K  /Volumes/backup
backup/encrypted                             1.29T   474G  1.99M  /Volumes/backup/encrypted
backup/encrypted@2019:08:24-13:07             160K      -   224K  /Volumes/backup/encrypted/.zfs/snapshot/2019:08:24-13:07
backup/encrypted@2019:09:19-13:27            1.92M      -  2.84M  /Volumes/backup/encrypted/.zfs/snapshot/2019:09:19-13:27
backup/encrypted@2020:01:05-18:46                0      -  1.99M  /Volumes/backup/encrypted/.zfs/snapshot/2020:01:05-18:46
backup/encrypted/paulg                       1.24T   474G  1.24T  /Volumes/backup/encrypted/paulg
backup/encrypted/paulg@2019:08:20-19:35       336K      -   904G  /Volumes/backup/encrypted/paulg/.zfs/snapshot/2019:08:20-19:35
backup/encrypted/paulg@2019:08:24-13:07       352K      -  1.18T  /Volumes/backup/encrypted/paulg/.zfs/snapshot/2019:08:24-13:07
backup/encrypted/paulg@2019:09:19-13:27      5.25M      -  1.22T  /Volumes/backup/encrypted/paulg/.zfs/snapshot/2019:09:19-13:27
backup/encrypted/paulg@2020:01:05-18:46       332K      -  1.24T  /Volumes/backup/encrypted/paulg/.zfs/snapshot/2020:01:05-18:46
backup/encrypted/veronique                   58.4G   474G  58.4G  /Volumes/backup/encrypted/veronique
backup/encrypted/veronique@2019:08:24-13:07   224K      -  58.4G  /Volumes/backup/encrypted/veronique/.zfs/snapshot/2019:08:24-13:07
backup/encrypted/veronique@2019:09:19-13:27   620K      -  58.4G  /Volumes/backup/encrypted/veronique/.zfs/snapshot/2019:09:19-13:27
backup/encrypted/veronique@2020:01:05-18:46      0      -  58.4G  /Volumes/backup/encrypted/veronique/.zfs/snapshot/2020:01:05-18:46

Volumes on the receiver before the send/receive operation:

$ zfs list -t all
NAME                                           USED  AVAIL     REFER  MOUNTPOINT
storage                                       7.61T  50.4G     33.3G  /storage
storage/debian                                55.4G  50.4G     29.7G  /
storage/debian@2019:07:29-02:00                374M      -     21.6G  -
storage/debian@2019:08:05-02:00                326M      -     21.6G  -
storage/debian@2019:08:12-02:00                318M      -     20.6G  -
storage/debian@2019:08:19-02:00                375M      -     21.1G  -
storage/debian@2019:08:26-02:00                144M      -     18.9G  -
storage/debian@2019:09:02-02:00                166M      -     19.0G  -
storage/debian@2019:09:09-02:00                167M      -     19.3G  -
storage/debian@2019:09:23-02:00                124M      -     19.2G  -
storage/debian@2019:09:30-02:00                116M      -     19.2G  -
storage/debian@2019:10:07-02:00                128M      -     19.3G  -
storage/debian@2019:10:14-02:00                144M      -     19.4G  -
storage/debian@2019:10:21-02:00                112M      -     19.5G  -
storage/debian@2019:10:28-03:00                108M      -     19.6G  -
storage/debian@2019:11:04-03:00                107M      -     20.1G  -
storage/debian@2019:11:11-03:00               4.72G      -     31.2G  -
storage/debian@2019:11:18-03:00                668M      -     27.0G  -
storage/debian@2019:11:25-03:00               48.2M      -     27.1G  -
storage/debian@snapme                         25.4M      -     27.2G  -
storage/debian@2019:12:02-03:00               52.5M      -     27.2G  -
storage/debian@2019:12:09-03:00               97.1M      -     29.9G  -
storage/debian@2019:12:16-03:00               51.7M      -     29.9G  -
storage/debian@2019:12:23-03:00               84.4M      -     29.8G  -
storage/debian@2019:12:30-03:00                336M      -     29.8G  -
storage/encrypted                             1.28T  50.4G     3.11M  /storage/encrypted
storage/encrypted@2019:08:24-13:07             209K      -      326K  -
storage/encrypted@2019:09:19-13:27               0B      -     3.11M  -
storage/encrypted/paulg                       1.22T  50.4G     1.22T  /storage/encrypted/paulg
storage/encrypted/paulg@2019:08:20-19:35       418K      -      907G  -
storage/encrypted/paulg@2019:08:24-13:07       407K      -     1.19T  -
storage/encrypted/paulg@2019:09:19-13:27         0B      -     1.22T  -
storage/encrypted/veronique                   58.3G  50.4G     58.3G  /storage/encrypted/veronique
storage/encrypted/veronique@2019:08:24-13:07   267K      -     58.3G  -
storage/encrypted/veronique@2019:09:19-13:27     0B      -     58.3G  -
storage/paulg                                 14.3G  50.4G     8.66G  /storage/paulg
storage/paulg@2019:08:19-02:00                 117M      -     3.51G  -
storage/paulg@2019:08:26-02:00                 587K      -     3.39G  -
storage/paulg@2019:09:02-02:00                 593K      -     3.39G  -
storage/paulg@2019:09:09-02:00                 692K      -     3.39G  -
storage/paulg@2019:09:23-02:00                 599K      -     3.39G  -
storage/paulg@2019:09:30-02:00                 587K      -     3.39G  -
storage/paulg@2019:10:07-02:00                 668K      -     3.39G  -
storage/paulg@2019:10:14-02:00                 674K      -     3.39G  -
storage/paulg@2019:10:21-02:00                 604K      -     3.39G  -
storage/paulg@2019:10:28-03:00                 604K      -     3.39G  -
storage/paulg@2019:11:04-03:00                 884K      -     4.39G  -
storage/paulg@2019:11:11-03:00                 483M      -     9.12G  -
storage/paulg@2019:11:18-03:00                1.30G      -     9.94G  -
storage/paulg@2019:11:25-03:00                2.74G      -     11.9G  -
storage/paulg@2019:12:02-03:00                1.73M      -     9.21G  -
storage/paulg@2019:12:09-03:00                 634K      -     8.65G  -
storage/paulg@2019:12:16-03:00                 634K      -     8.65G  -
storage/paulg@2019:12:23-03:00                 639K      -     8.65G  -
storage/paulg@2019:12:30-03:00                 924K      -     8.66G  -
storage/shared                                6.23T  50.4G     5.39T  /storage/shared
storage/shared@2019:08:12-02:00                  0B      -     5.40T  -
storage/shared@2019:08:19-02:00                  0B      -     5.40T  -
storage/shared@2019:08:26-02:00                  0B      -     5.36T  -
storage/shared@2019:09:02-02:00                  0B      -     5.36T  -
storage/shared@2019:09:09-02:00                  0B      -     5.36T  -
storage/shared@2019:09:23-02:00                  0B      -     5.36T  -
storage/shared@2019:09:30-02:00                  0B      -     5.36T  -
storage/shared@2019:10:07-02:00                  0B      -     5.36T  -
storage/shared@2019:10:14-02:00                  0B      -     5.36T  -
storage/shared@2019:10:21-02:00                  0B      -     5.36T  -
storage/shared@2019:10:28-03:00                  0B      -     5.36T  -
storage/shared@2019:11:04-03:00                  0B      -     5.36T  -
storage/shared@2019:11:11-03:00               11.6K      -     5.36T  -
storage/shared@2019:11:18-03:00                116K      -     5.37T  -
storage/shared@2019:11:25-03:00                  0B      -     6.16T  -
storage/shared@2019:12:02-03:00                  0B      -     6.16T  -
storage/shared@2019:12:09-03:00                  0B      -     6.16T  -
storage/shared@2019:12:16-03:00                  0B      -     6.16T  -
storage/shared@2019:12:23-03:00                  0B      -     6.16T  -
storage/shared@2019:12:30-03:00                  0B      -     5.39T  -
storage/snapme                                7.79M  50.4G     27.1G  /storage/snapme

The command to send the snapshots (finished without errors):

zfs send -DRwh -I @2019:09:19-13:27 backup/encrypted@2020:01:05-18:46 | ssh root@nas.paulg.fr zfs receive -sF storage/encrypted
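
An incremental send with -I only works if the base snapshot (here @2019:09:19-13:27) exists on both sides for every dataset in the tree. The following is a minimal sketch of how that could be checked beforehand, not something I ran at the time. It assumes the snapshot names were saved to two text files with `zfs list -H -t snapshot -o name -r <root>`; the function name `common_snapshots` and the file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the snapshots that exist under both roots,
# relative to the root (e.g. "/paulg@2019:09:19-13:27").
#   $1 = file with sender snapshot names    $2 = sender root dataset
#   $3 = file with receiver snapshot names  $4 = receiver root dataset
common_snapshots() {
    awk -v a="$2" -v b="$4" '
        # Return the part of name after root, or "" if name is not under root.
        function rel(name, root,    s, c) {
            if (index(name, root) != 1) return ""
            s = substr(name, length(root) + 1)
            c = substr(s, 1, 1)
            return (c == "@" || c == "/") ? s : ""
        }
        # First file: remember every sender snapshot relative to its root.
        NR == FNR { s = rel($0, a); if (s != "") seen[s] = 1; next }
        # Second file: print receiver snapshots that the sender also has.
        { s = rel($0, b); if (s != "" && (s in seen)) print s }
    ' "$1" "$3"
}

# Assumed usage (file names are placeholders):
#   zfs list -H -t snapshot -o name -r backup/encrypted > sender.txt
#   ssh root@nas.paulg.fr zfs list -H -t snapshot -o name -r storage/encrypted > receiver.txt
#   common_snapshots sender.txt backup/encrypted receiver.txt storage/encrypted
```

With the listings above, every dataset under the two roots should show @2019:09:19-13:27 in the output, which matches what I see by eye.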

Volumes on the receiver after the send/receive operation:

$ zfs list -t all  
NAME                                           USED  AVAIL     REFER  MOUNTPOINT
storage                                       7.62T  31.7G     33.3G  /storage
storage/debian                                55.4G  31.7G     29.7G  /
storage/debian@2019:07:29-02:00                374M      -     21.6G  -
storage/debian@2019:08:05-02:00                326M      -     21.6G  -
storage/debian@2019:08:12-02:00                318M      -     20.6G  -
storage/debian@2019:08:19-02:00                375M      -     21.1G  -
storage/debian@2019:08:26-02:00                144M      -     18.9G  -
storage/debian@2019:09:02-02:00                166M      -     19.0G  -
storage/debian@2019:09:09-02:00                167M      -     19.3G  -
storage/debian@2019:09:23-02:00                124M      -     19.2G  -
storage/debian@2019:09:30-02:00                116M      -     19.2G  -
storage/debian@2019:10:07-02:00                128M      -     19.3G  -
storage/debian@2019:10:14-02:00                144M      -     19.4G  -
storage/debian@2019:10:21-02:00                112M      -     19.5G  -
storage/debian@2019:10:28-03:00                108M      -     19.6G  -
storage/debian@2019:11:04-03:00                107M      -     20.1G  -
storage/debian@2019:11:11-03:00               4.72G      -     31.2G  -
storage/debian@2019:11:18-03:00                668M      -     27.0G  -
storage/debian@2019:11:25-03:00               48.2M      -     27.1G  -
storage/debian@snapme                         25.4M      -     27.2G  -
storage/debian@2019:12:02-03:00               52.5M      -     27.2G  -
storage/debian@2019:12:09-03:00               97.1M      -     29.9G  -
storage/debian@2019:12:16-03:00               51.7M      -     29.9G  -
storage/debian@2019:12:23-03:00               84.4M      -     29.8G  -
storage/debian@2019:12:30-03:00                336M      -     29.8G  -
storage/encrypted                             1.30T  31.7G     2.29M  /storage/encrypted
storage/encrypted@2019:08:24-13:07             209K      -      326K  -
storage/encrypted@2019:09:19-13:27            2.04M      -     3.11M  -
storage/encrypted@2020:01:05-18:46               0B      -     2.29M  -
storage/encrypted/paulg                       1.24T  31.7G     1.24T  /storage/encrypted/paulg
storage/encrypted/paulg@2019:08:20-19:35       418K      -      907G  -
storage/encrypted/paulg@2019:08:24-13:07       407K      -     1.19T  -
storage/encrypted/paulg@2019:09:19-13:27      5.82M      -     1.22T  -
storage/encrypted/paulg@2020:01:05-18:46         0B      -     1.24T  -
storage/encrypted/veronique                   58.3G  31.7G     58.3G  /storage/encrypted/veronique
storage/encrypted/veronique@2019:08:24-13:07   267K      -     58.3G  -
storage/encrypted/veronique@2019:09:19-13:27   738K      -     58.3G  -
storage/encrypted/veronique@2020:01:05-18:46     0B      -     58.3G  -
storage/paulg                                 14.3G  31.7G     8.66G  /storage/paulg
storage/paulg@2019:08:19-02:00                 117M      -     3.51G  -
storage/paulg@2019:08:26-02:00                 587K      -     3.39G  -
storage/paulg@2019:09:02-02:00                 593K      -     3.39G  -
storage/paulg@2019:09:09-02:00                 692K      -     3.39G  -
storage/paulg@2019:09:23-02:00                 599K      -     3.39G  -
storage/paulg@2019:09:30-02:00                 587K      -     3.39G  -
storage/paulg@2019:10:07-02:00                 668K      -     3.39G  -
storage/paulg@2019:10:14-02:00                 674K      -     3.39G  -
storage/paulg@2019:10:21-02:00                 604K      -     3.39G  -
storage/paulg@2019:10:28-03:00                 604K      -     3.39G  -
storage/paulg@2019:11:04-03:00                 884K      -     4.39G  -
storage/paulg@2019:11:11-03:00                 483M      -     9.12G  -
storage/paulg@2019:11:18-03:00                1.30G      -     9.94G  -
storage/paulg@2019:11:25-03:00                2.74G      -     11.9G  -
storage/paulg@2019:12:02-03:00                1.73M      -     9.21G  -
storage/paulg@2019:12:09-03:00                 634K      -     8.65G  -
storage/paulg@2019:12:16-03:00                 634K      -     8.65G  -
storage/paulg@2019:12:23-03:00                 639K      -     8.65G  -
storage/paulg@2019:12:30-03:00                 924K      -     8.66G  -
storage/shared                                6.23T  31.7G     5.39T  /storage/shared
storage/shared@2019:08:12-02:00                  0B      -     5.40T  -
storage/shared@2019:08:19-02:00                  0B      -     5.40T  -
storage/shared@2019:08:26-02:00                  0B      -     5.36T  -
storage/shared@2019:09:02-02:00                  0B      -     5.36T  -
storage/shared@2019:09:09-02:00                  0B      -     5.36T  -
storage/shared@2019:09:23-02:00                  0B      -     5.36T  -
storage/shared@2019:09:30-02:00                  0B      -     5.36T  -
storage/shared@2019:10:07-02:00                  0B      -     5.36T  -
storage/shared@2019:10:14-02:00                  0B      -     5.36T  -
storage/shared@2019:10:21-02:00                  0B      -     5.36T  -
storage/shared@2019:10:28-03:00                  0B      -     5.36T  -
storage/shared@2019:11:04-03:00                  0B      -     5.36T  -
storage/shared@2019:11:11-03:00               11.6K      -     5.36T  -
storage/shared@2019:11:18-03:00                116K      -     5.37T  -
storage/shared@2019:11:25-03:00                  0B      -     6.16T  -
storage/shared@2019:12:02-03:00                  0B      -     6.16T  -
storage/shared@2019:12:09-03:00                  0B      -     6.16T  -
storage/shared@2019:12:16-03:00                  0B      -     6.16T  -
storage/shared@2019:12:23-03:00                  0B      -     6.16T  -
storage/shared@2019:12:30-03:00                  0B      -     5.39T  -
storage/snapme                                7.79M  31.7G     27.1G  /storage/snapme

Trying to mount the newly received volumes on the receiver:

$ zfs mount -a -l                                                                                               
Enter passphrase for 'storage/encrypted':
filesystem 'storage/encrypted' can not be mounted: Input/output error
cannot mount 'storage/encrypted': Invalid argument
filesystem 'storage/encrypted/paulg' can not be mounted: Input/output error
cannot mount 'storage/encrypted/paulg': Invalid argument
filesystem 'storage/encrypted/veronique' can not be mounted: Input/output error
cannot mount 'storage/encrypted/veronique': Invalid argument

I think my heart skipped a beat at that point, but yes, of course, I do have offline backups (not fully up-to-date though...)

Checking the health of the receiver zpool:

$ zpool status -xv 
  pool: storage
 state: ONLINE
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 0 days 19:05:09 with 0 errors on Sun Dec  8 19:29:13 2019
config:

    NAME                              STATE     READ WRITE CKSUM
    storage                           ONLINE       0     0     0
      raidz1-0                        ONLINE       0     0     0
        sde1                          ONLINE       0     0     0
        sdf1                          ONLINE       0     0     0
        sdc1                          ONLINE       0     0     0
        sdd1                          ONLINE       0     0     0
    logs
      mirror-1                        ONLINE       0     0     0
        wwn-0x55cd2e404b6ea37d-part1  ONLINE       0     0     0
        wwn-0x55cd2e404b6ea368-part1  ONLINE       0     0     0
    cache
      wwn-0x55cd2e404b6ea37d-part2    ONLINE       0     0     0
      wwn-0x55cd2e404b6ea368-part2    ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        storage/encrypted/paulg:<0x0>
        storage/encrypted/veronique:<0x0>
        storage/encrypted:<0x0>
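
The entries above have the form dataset:<object>, so the affected datasets can be pulled out of the status output mechanically. This is a rough sketch that assumes the output layout shown above; the function name `affected_datasets` is hypothetical, and entries reported as plain file paths are deliberately skipped.

```shell
#!/bin/sh
# Hypothetical helper: read `zpool status -v` output on stdin and print the
# dataset part of every "dataset:<0x...>" entry in the permanent-errors list.
affected_datasets() {
    awk '
        # Everything after this header line is the error list.
        /^errors: Permanent errors/ { in_list = 1; next }
        # Keep only entries that end in an object id such as :<0x0>.
        in_list && $1 ~ /:<0x[0-9a-fA-F]+>$/ {
            sub(/:<0x[0-9a-fA-F]+>$/, "", $1)
            print $1
        }
    '
}

# Assumed usage:
#   zpool status -v storage | affected_datasets
```

On the output above this would print storage/encrypted/paulg, storage/encrypted/veronique and storage/encrypted, which are exactly the three datasets that were the target of the receive.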

Properties on the sender:

$ zpool get all backup
NAME    PROPERTY                       VALUE                          SOURCE
backup  size                           1.81T                          -
backup  capacity                       71%                            -
backup  altroot                        -                              default
backup  health                         ONLINE                         -
backup  guid                           16408403743514808226           default
backup  version                        -                              default
backup  bootfs                         -                              default
backup  delegation                     on                             default
backup  autoreplace                    off                            default
backup  cachefile                      -                              default
backup  failmode                       wait                           default
backup  listsnapshots                  off                            default
backup  autoexpand                     off                            default
backup  dedupditto                     0                              default
backup  dedupratio                     1.00x                          -
backup  free                           532G                           -
backup  allocated                      1.29T                          -
backup  readonly                       off                            -
backup  ashift                         12                             local
backup  comment                        -                              default
backup  expandsize                     -                              -
backup  freeing                        0                              default
backup  fragmentation                  3%                             -
backup  leaked                         0                              default
backup  checkpoint                     696K                           -
backup  multihost                      off                            default
backup  autotrim                       off                            default
backup  feature@async_destroy          enabled                        local
backup  feature@empty_bpobj            active                         local
backup  feature@lz4_compress           active                         local
backup  feature@multi_vdev_crash_dump  enabled                        local
backup  feature@spacemap_histogram     active                         local
backup  feature@enabled_txg            active                         local
backup  feature@hole_birth             active                         local
backup  feature@extensible_dataset     active                         local
backup  feature@embedded_data          active                         local
backup  feature@bookmarks              enabled                        local
backup  feature@filesystem_limits      enabled                        local
backup  feature@large_blocks           enabled                        local
backup  feature@large_dnode            enabled                        local
backup  feature@sha512                 active                         local
backup  feature@skein                  enabled                        local
backup  feature@edonr                  enabled                        local
backup  feature@encryption             active                         local
backup  feature@device_removal         enabled                        local
backup  feature@obsolete_counts        enabled                        local
backup  feature@zpool_checkpoint       active                         local
backup  feature@spacemap_v2            active                         local
backup  feature@allocation_classes     enabled                        local
backup  feature@bookmark_v2            enabled                        local
backup  feature@resilver_defer         enabled                        local

$ zfs get all backup/encrypted 
NAME              PROPERTY               VALUE                      SOURCE
backup/encrypted  type                   filesystem                 -
backup/encrypted  creation               Sun Aug 25 11:01 2019      -
backup/encrypted  used                   1.29T                      -
backup/encrypted  available              474G                       -
backup/encrypted  referenced             1.99M                      -
backup/encrypted  compressratio          1.06x                      -
backup/encrypted  mounted                yes                        -
backup/encrypted  quota                  none                       default
backup/encrypted  reservation            none                       default
backup/encrypted  recordsize             128K                       default
backup/encrypted  mountpoint             /Volumes/backup/encrypted  default
backup/encrypted  sharenfs               off                        default
backup/encrypted  checksum               sha512                     received
backup/encrypted  compression            lz4                        received
backup/encrypted  atime                  off                        inherited from backup
backup/encrypted  devices                off                        received
backup/encrypted  exec                   on                         default
backup/encrypted  setuid                 on                         default
backup/encrypted  readonly               off                        default
backup/encrypted  zoned                  off                        default
backup/encrypted  snapdir                hidden                     default
backup/encrypted  aclmode                passthrough                default
backup/encrypted  aclinherit             restricted                 default
backup/encrypted  canmount               on                         default
backup/encrypted  xattr                  on                         default
backup/encrypted  copies                 1                          default
backup/encrypted  version                5                          -
backup/encrypted  utf8only               on                         -
backup/encrypted  normalization          none                       -
backup/encrypted  casesensitivity        sensitive                  -
backup/encrypted  vscan                  off                        default
backup/encrypted  nbmand                 off                        default
backup/encrypted  sharesmb               off                        default
backup/encrypted  refquota               none                       default
backup/encrypted  refreservation         none                       default
backup/encrypted  primarycache           all                        default
backup/encrypted  secondarycache         all                        default
backup/encrypted  usedbysnapshots        2.08M                      -
backup/encrypted  usedbydataset          1.99M                      -
backup/encrypted  usedbychildren         1.29T                      -
backup/encrypted  usedbyrefreservation   0                          -
backup/encrypted  logbias                throughput                 received
backup/encrypted  dedup                  off                        default
backup/encrypted  mlslabel               none                       default
backup/encrypted  sync                   standard                   default
backup/encrypted  dnodesize              legacy                     default
backup/encrypted  refcompressratio       2.53x                      -
backup/encrypted  written                0                          -
backup/encrypted  logicalused            1.37T                      -
backup/encrypted  logicalreferenced      3.94M                      -
backup/encrypted  filesystem_limit       none                       default
backup/encrypted  snapshot_limit         none                       default
backup/encrypted  filesystem_count       none                       default
backup/encrypted  snapshot_count         none                       default
backup/encrypted  snapdev                hidden                     default
backup/encrypted  com.apple.browse       on                         default
backup/encrypted  com.apple.ignoreowner  off                        default
backup/encrypted  com.apple.mimic_hfs    off                        default
backup/encrypted  com.apple.devdisk      poolonly                   default
backup/encrypted  shareafp               off                        default
backup/encrypted  redundant_metadata     all                        default
backup/encrypted  overlay                off                        default
backup/encrypted  encryption             aes-256-gcm                -
backup/encrypted  keylocation            prompt                     local
backup/encrypted  keyformat              passphrase                 -
backup/encrypted  pbkdf2iters            342K                       -
backup/encrypted  encryptionroot         backup/encrypted           -
backup/encrypted  keystatus              available                  -
backup/encrypted  special_small_blocks   0                          default

Properties on the receiver:

$ zpool get all storage
NAME     PROPERTY                       VALUE                          SOURCE
storage  size                           10.9T                          -
storage  capacity                       96%                            -
storage  altroot                        -                              default
storage  health                         ONLINE                         -
storage  guid                           10399772299636074206           -
storage  version                        -                              default
storage  bootfs                         storage/debian                 local
storage  delegation                     on                             default
storage  autoreplace                    off                            default
storage  cachefile                      -                              default
storage  failmode                       wait                           default
storage  listsnapshots                  off                            default
storage  autoexpand                     on                             local
storage  dedupditto                     0                              default
storage  dedupratio                     1.00x                          -
storage  free                           392G                           -
storage  allocated                      10.5T                          -
storage  readonly                       off                            -
storage  ashift                         12                             local
storage  comment                        -                              default
storage  expandsize                     -                              -
storage  freeing                        0                              -
storage  fragmentation                  53%                            -
storage  leaked                         0                              -
storage  multihost                      off                            default
storage  checkpoint                     25.7G                          -
storage  load_guid                      731974003834945038             -
storage  autotrim                       on                             local
storage  feature@async_destroy          enabled                        local
storage  feature@empty_bpobj            active                         local
storage  feature@lz4_compress           active                         local
storage  feature@multi_vdev_crash_dump  enabled                        local
storage  feature@spacemap_histogram     active                         local
storage  feature@enabled_txg            active                         local
storage  feature@hole_birth             active                         local
storage  feature@extensible_dataset     active                         local
storage  feature@embedded_data          active                         local
storage  feature@bookmarks              enabled                        local
storage  feature@filesystem_limits      enabled                        local
storage  feature@large_blocks           enabled                        local
storage  feature@large_dnode            enabled                        local
storage  feature@sha512                 active                         local
storage  feature@skein                  enabled                        local
storage  feature@edonr                  enabled                        local
storage  feature@userobj_accounting     active                         local
storage  feature@encryption             active                         local
storage  feature@project_quota          active                         local
storage  feature@device_removal         enabled                        local
storage  feature@obsolete_counts        enabled                        local
storage  feature@zpool_checkpoint       active                         local
storage  feature@spacemap_v2            active                         local
storage  feature@allocation_classes     enabled                        local
storage  feature@resilver_defer         enabled                        local
storage  feature@bookmark_v2            enabled                        local

$ zfs get all storage/encrypted
NAME               PROPERTY              VALUE                  SOURCE
storage/encrypted  type                  filesystem             -
storage/encrypted  creation              Mon Aug 12 14:17 2019  -
storage/encrypted  used                  1.28T                  -
storage/encrypted  available             24.7G                  -
storage/encrypted  referenced            3.11M                  -
storage/encrypted  compressratio         1.06x                  -
storage/encrypted  mounted               no                     -
storage/encrypted  quota                 none                   default
storage/encrypted  reservation           none                   default
storage/encrypted  recordsize            128K                   default
storage/encrypted  mountpoint            /storage/encrypted     default
storage/encrypted  sharenfs              off                    default
storage/encrypted  checksum              sha512                 received
storage/encrypted  compression           lz4                    received
storage/encrypted  atime                 off                    inherited from storage
storage/encrypted  devices               off                    received
storage/encrypted  exec                  on                     default
storage/encrypted  setuid                on                     default
storage/encrypted  readonly              off                    default
storage/encrypted  zoned                 off                    default
storage/encrypted  snapdir               hidden                 default
storage/encrypted  aclinherit            restricted             default
storage/encrypted  createtxg             37060671               -
storage/encrypted  canmount              on                     default
storage/encrypted  xattr                 on                     default
storage/encrypted  copies                1                      default
storage/encrypted  version               5                      -
storage/encrypted  utf8only              on                     -
storage/encrypted  normalization         none                   -
storage/encrypted  casesensitivity       sensitive              -
storage/encrypted  vscan                 off                    default
storage/encrypted  nbmand                off                    default
storage/encrypted  sharesmb              off                    default
storage/encrypted  refquota              none                   default
storage/encrypted  refreservation        none                   default
storage/encrypted  guid                  5275620463138722690    -
storage/encrypted  primarycache          all                    inherited from storage
storage/encrypted  secondarycache        all                    default
storage/encrypted  usedbysnapshots       209K                   -
storage/encrypted  usedbydataset         3.11M                  -
storage/encrypted  usedbychildren        1.28T                  -
storage/encrypted  usedbyrefreservation  0B                     -
storage/encrypted  logbias               throughput             received
storage/encrypted  objsetid              4519                   -
storage/encrypted  dedup                 off                    default
storage/encrypted  mlslabel              none                   default
storage/encrypted  sync                  standard               default
storage/encrypted  dnodesize             legacy                 default
storage/encrypted  refcompressratio      2.35x                  -
storage/encrypted  written               0                      -
storage/encrypted  logicalused           1.35T                  -
storage/encrypted  logicalreferenced     5.72M                  -
storage/encrypted  volmode               default                default
storage/encrypted  filesystem_limit      none                   default
storage/encrypted  snapshot_limit        none                   default
storage/encrypted  filesystem_count      none                   default
storage/encrypted  snapshot_count        none                   default
storage/encrypted  snapdev               hidden                 default
storage/encrypted  acltype               off                    default
storage/encrypted  context               none                   default
storage/encrypted  fscontext             none                   default
storage/encrypted  defcontext            none                   default
storage/encrypted  rootcontext           none                   default
storage/encrypted  relatime              off                    default
storage/encrypted  redundant_metadata    all                    default
storage/encrypted  overlay               off                    default
storage/encrypted  encryption            aes-256-gcm            -
storage/encrypted  keylocation           prompt                 local
storage/encrypted  keyformat             passphrase             -
storage/encrypted  pbkdf2iters           342K                   -
storage/encrypted  encryptionroot        storage/encrypted      -
storage/encrypted  keystatus             available              -
storage/encrypted  special_small_blocks  0                      default

lundman commented 4 years ago

I'm not aware of any send issues in 1.9.3.1 - and I don't think ZOL has had any raw send fixes "recently"?

PaulGrandperrin commented 4 years ago

I grepped through all the recent issues here and yes I guess it might be a new bug. Also, there are two more details that might help you:

behlendorf commented 4 years ago

I'm not aware of any outstanding raw send/recv issues either, but something abnormal clearly occurred. I also don't see anything clearly out of order in the properties you've posted, but perhaps @tcaputi will.

You mentioned you'd sent previous raw incrementals successfully. Do I understand correctly that was done between MacOS and Linux, and you were able to successfully mount them? Was there anything different you can think of with the latest incrementals?

PaulGrandperrin commented 4 years ago

Yes, I checked all the dates of the commands I typed in the past and I can confirm that the previous snapshots were successfully received with the exact same version of ZOL on the receiver (Debian). However, I now remember something that might be very important:

2 other things that might be of interest:

tcaputi commented 4 years ago

I'm a little confused about the timeline. Would you mind laying out everything relevant that happened as a chronological bullet list of events or something similar? The biggest things I'm looking for are:

It's alright if you don't have all of this information, but it would help me if you could lay things out chronologically as best you can.

h1z1 commented 4 years ago

zpool history might help.
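
A minimal sketch of how the relevant history records could be pulled out on the receiver, assuming the pool is named `storage` as in the output above. The `filter_history` helper and its pattern are illustrative assumptions, not part of ZFS; `-i` and `-l` are the `zpool history` options for internal events and long-format records.

```shell
#!/bin/sh
# Keep only history records that mention receive, rollback, destroy,
# or checkpoint operations. The helper name and pattern are assumptions
# chosen for this thread, not anything ZFS provides.
filter_history() {
    grep -E 'receive|recv|rollback|destroy|checkpoint'
}

# Intended usage on the receiver (requires ZFS, so it is left as a comment):
#   zpool history -il storage | filter_history
```

The filtered records should make it easier to line up each receive with the rollback and checkpoint commands that followed it.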

PaulGrandperrin commented 4 years ago

Yes, I haven't had the time to repost a reorganized post as @tcaputi asked but I'll do it :-) I already know about zpool history, I just hadn't had the time yet!

PaulGrandperrin commented 4 years ago

Hi again! The confinement gave me time to look into this issue again :-)

Source of the bug

@tcaputi, you asked me to summarize the history of what happened more clearly, but after looking at it, I really think the main takeaway is that the corruption happened when the send was executed on openZFS 1.9.3.1,64 and received on zfsonlinux 0.8.2-3~bpo10+1. All the other details I talked about don't really matter in the context of this bug.

Impossible to even delete the corrupted volumes

Anyway, I decided not to try to recover my corrupted volumes and instead to delete them and restore my backups.

I deleted the storage/encrypted/veronique volume (which you can see in my previous posts) a few weeks ago and it went well, but now trying to remove the remaining corrupted volumes triggers new errors, see below.

Moving forward

# zpool status -v
  pool: storage
 state: ONLINE
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 0 days 11:30:14 with 0 errors on Sun Feb  9 11:54:17 2020
config:

    NAME                              STATE     READ WRITE CKSUM
    storage                           ONLINE       0     0     0
      raidz1-0                        ONLINE       0     0     0
        sde1                          ONLINE       0     0     0
        sdf1                          ONLINE       0     0     0
        sdc1                          ONLINE       0     0     0
        sdd1                          ONLINE       0     0     0
    logs
      mirror-1                        ONLINE       0     0     0
        wwn-0x55cd2e404b6ea37d-part1  ONLINE       0     0     0
        wwn-0x55cd2e404b6ea368-part1  ONLINE       0     0     0
    cache
      sdb2                            ONLINE       0     0     0
      sda2                            ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        storage/encrypted/paulg:<0x0>
        storage/encrypted/paulg@2019:09:19-13:27:<0x0>
        <0x1077>:<0x0>

# zfs destroy storage/encrypted -r
cannot destroy snapshot storage/encrypted/paulg@2019:08:20-19:35: dataset is busy
cannot destroy 'storage/encrypted': dataset already exists

# zfs get all storage/encrypted/paulg@2019:08:20-19:35
NAME                                      PROPERTY              VALUE                  SOURCE
storage/encrypted/paulg@2019:08:20-19:35  type                  snapshot               -
storage/encrypted/paulg@2019:08:20-19:35  creation              Tue Aug 20 21:35 2019  -
storage/encrypted/paulg@2019:08:20-19:35  used                  418K                   -
storage/encrypted/paulg@2019:08:20-19:35  referenced            907G                   -
storage/encrypted/paulg@2019:08:20-19:35  compressratio         1.06x                  -
storage/encrypted/paulg@2019:08:20-19:35  devices               off                    inherited from storage/encrypted
storage/encrypted/paulg@2019:08:20-19:35  exec                  on                     default
storage/encrypted/paulg@2019:08:20-19:35  setuid                on                     default
storage/encrypted/paulg@2019:08:20-19:35  createtxg             37239631               -
storage/encrypted/paulg@2019:08:20-19:35  xattr                 on                     default
storage/encrypted/paulg@2019:08:20-19:35  version               5                      -
storage/encrypted/paulg@2019:08:20-19:35  utf8only              on                     -
storage/encrypted/paulg@2019:08:20-19:35  normalization         none                   -
storage/encrypted/paulg@2019:08:20-19:35  casesensitivity       sensitive              -
storage/encrypted/paulg@2019:08:20-19:35  nbmand                off                    default
storage/encrypted/paulg@2019:08:20-19:35  guid                  9917501289659667102    -
storage/encrypted/paulg@2019:08:20-19:35  primarycache          all                    inherited from storage
storage/encrypted/paulg@2019:08:20-19:35  secondarycache        all                    default
storage/encrypted/paulg@2019:08:20-19:35  defer_destroy         off                    -
storage/encrypted/paulg@2019:08:20-19:35  userrefs              1                      -
storage/encrypted/paulg@2019:08:20-19:35  objsetid              389                    -
storage/encrypted/paulg@2019:08:20-19:35  mlslabel              none                   default
storage/encrypted/paulg@2019:08:20-19:35  refcompressratio      1.06x                  -
storage/encrypted/paulg@2019:08:20-19:35  written               907G                   -
storage/encrypted/paulg@2019:08:20-19:35  clones                                       -
storage/encrypted/paulg@2019:08:20-19:35  logicalreferenced     958G                   -
storage/encrypted/paulg@2019:08:20-19:35  acltype               off                    default
storage/encrypted/paulg@2019:08:20-19:35  context               none                   default
storage/encrypted/paulg@2019:08:20-19:35  fscontext             none                   default
storage/encrypted/paulg@2019:08:20-19:35  defcontext            none                   default
storage/encrypted/paulg@2019:08:20-19:35  rootcontext           none                   default
storage/encrypted/paulg@2019:08:20-19:35  encryption            aes-256-gcm            -
storage/encrypted/paulg@2019:08:20-19:35  encryptionroot        storage/encrypted      -
storage/encrypted/paulg@2019:08:20-19:35  keystatus             unavailable            -
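
The `userrefs 1` row above suggests a user hold on this snapshot, and a held snapshot is one documented reason for `zfs destroy` to report "dataset is busy". Whether that is the only cause here is an assumption. A minimal sketch for finding held snapshots, where `held_snapshots` is a hypothetical helper name:

```shell
#!/bin/sh
# List snapshots whose userrefs value is greater than zero, reading
# "zfs get" style rows (NAME PROPERTY VALUE SOURCE) from stdin.
# held_snapshots is a hypothetical helper, not a ZFS command.
held_snapshots() {
    awk '$2 == "userrefs" && $3 + 0 > 0 { print $1 }'
}

# Intended usage on the receiver (requires ZFS, so it is left as comments):
#   zfs get -H -r -t snapshot userrefs storage/encrypted | held_snapshots
#   zfs holds storage/encrypted/paulg@2019:08:20-19:35      # show the hold tags
#   zfs release <tag> storage/encrypted/paulg@2019:08:20-19:35
```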

# zfs get all storage/encrypted
NAME               PROPERTY              VALUE                  SOURCE
storage/encrypted  type                  filesystem             -
storage/encrypted  creation              Mon Aug 12 14:17 2019  -
storage/encrypted  used                  1.22T                  -
storage/encrypted  available             81.6G                  -
storage/encrypted  referenced            3.11M                  -
storage/encrypted  compressratio         1.06x                  -
storage/encrypted  mounted               no                     -
storage/encrypted  quota                 none                   default
storage/encrypted  reservation           none                   default
storage/encrypted  recordsize            128K                   default
storage/encrypted  mountpoint            /storage/encrypted     default
storage/encrypted  sharenfs              off                    default
storage/encrypted  checksum              sha512                 received
storage/encrypted  compression           lz4                    received
storage/encrypted  atime                 off                    inherited from storage
storage/encrypted  devices               off                    received
storage/encrypted  exec                  on                     default
storage/encrypted  setuid                on                     default
storage/encrypted  readonly              off                    default
storage/encrypted  zoned                 off                    default
storage/encrypted  snapdir               hidden                 default
storage/encrypted  aclinherit            restricted             default
storage/encrypted  createtxg             37060671               -
storage/encrypted  canmount              on                     default
storage/encrypted  xattr                 on                     default
storage/encrypted  copies                1                      default
storage/encrypted  version               5                      -
storage/encrypted  utf8only              on                     -
storage/encrypted  normalization         none                   -
storage/encrypted  casesensitivity       sensitive              -
storage/encrypted  vscan                 off                    default
storage/encrypted  nbmand                off                    default
storage/encrypted  sharesmb              off                    default
storage/encrypted  refquota              none                   default
storage/encrypted  refreservation        none                   default
storage/encrypted  guid                  5275620463138722690    -
storage/encrypted  primarycache          all                    inherited from storage
storage/encrypted  secondarycache        all                    default
storage/encrypted  usedbysnapshots       0B                     -
storage/encrypted  usedbydataset         3.11M                  -
storage/encrypted  usedbychildren        1.22T                  -
storage/encrypted  usedbyrefreservation  0B                     -
storage/encrypted  logbias               throughput             received
storage/encrypted  objsetid              4519                   -
storage/encrypted  dedup                 off                    default
storage/encrypted  mlslabel              none                   default
storage/encrypted  sync                  standard               default
storage/encrypted  dnodesize             legacy                 default
storage/encrypted  refcompressratio      2.35x                  -
storage/encrypted  written               3.11M                  -
storage/encrypted  logicalused           1.29T                  -
storage/encrypted  logicalreferenced     5.72M                  -
storage/encrypted  volmode               default                default
storage/encrypted  filesystem_limit      none                   default
storage/encrypted  snapshot_limit        none                   default
storage/encrypted  filesystem_count      none                   default
storage/encrypted  snapshot_count        none                   default
storage/encrypted  snapdev               hidden                 default
storage/encrypted  acltype               off                    default
storage/encrypted  context               none                   default
storage/encrypted  fscontext             none                   default
storage/encrypted  defcontext            none                   default
storage/encrypted  rootcontext           none                   default
storage/encrypted  relatime              off                    default
storage/encrypted  redundant_metadata    all                    default
storage/encrypted  overlay               off                    default
storage/encrypted  encryption            aes-256-gcm            -
storage/encrypted  keylocation           prompt                 local
storage/encrypted  keyformat             passphrase             -
storage/encrypted  pbkdf2iters           342K                   -
storage/encrypted  encryptionroot        storage/encrypted      -
storage/encrypted  keystatus             unavailable            -
storage/encrypted  special_small_blocks  0                      default

I guess the storage/encrypted/veronique volume that I deleted a few weeks ago is also still there and is referred to as <0x1077>:<0x0> in the zpool status output above.
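
One way to check that guess, sketched under the assumption that the first hex number in such an entry is the dataset's object ID (the value the `objsetid` property reports in decimal). `entry_to_objsetid` is a hypothetical helper name.

```shell
#!/bin/sh
# Convert the dataset part of a "<0xDDD>:<0xOOO>" entry from
# "zpool status -v" to decimal so it can be compared with objsetid.
# entry_to_objsetid is a hypothetical helper, not a ZFS command.
entry_to_objsetid() {
    hex=$(printf '%s\n' "$1" | sed -n 's/^<\(0x[0-9a-fA-F]*\)>:<0x[0-9a-fA-F]*>$/\1/p')
    if [ -n "$hex" ]; then
        printf '%d\n' "$hex"
    fi
}

entry_to_objsetid '<0x1077>:<0x0>'   # prints 4215

# Intended comparison on the receiver (requires ZFS, so it is left as a comment):
#   zfs get -H -r -t all -o name,value objsetid storage | awk '$2 == 4215'
```

If no dataset reports that ID, the entry probably belongs to a dataset that is no longer in the namespace, which would be consistent with the deleted veronique volume.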

stale[bot] commented 3 years ago

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

PaulGrandperrin commented 3 years ago

I moved my data to the cloud and don't use zfs at the moment so I can't do anything to help. I'll close