Closed: pwolny closed this issue 8 years ago.
@AndCycle you'd need to do something like cp foo foo_1 && mv foo_1 foo, and this wouldn't fix it in any of the snapshots you have. If you applied #4833 and turned on the ignore_hole_birth tunable, then did a send|recv, the resulting copy should not have this particular flaw.
I'd probably wait until after illumos #7176 has a fix available+merged to bother, though.
@rincebrain thanks for your detailed explanation; it looks like there are more issues waiting to be discovered.
We'll get bc77ba7 from the master branch pulled in for 0.6.5.8.
Also recommend picking up https://github.com/zfsonlinux/zfs/commit/463a8cfe2b293934edd2ee79115b20c4598353d6 (Illumos 6844 - dnode_next_offset can detect fictional holes).
I found some time to finish testing.
It seems that with ZFS 0.6.5-329_g5c27b29 and SPL 0.6.5-63_g5ad98ad I get different behaviors when I send/receive the pool with the original problematic file ("/nas1FS/backup/.zfs/snapshot/20160618/samba_share/a/home/bak/aa/wx/wxWidgets-2.8.12/additions/lib/vc_lib/wxmsw28ud_propgrid.pdb") versus the pool generated with the test script.
So it seems that the symptom of the original problem solved itself with ZFS 0.6.5-329_g5c27b29 and SPL 0.6.5-63_g5ad98ad. That is somewhat baffling.
Maybe this is superfluous information now, but I have located the zfs versions at which the test script changes behavior and gives slightly different outputs. With "zfs 0.6.4-53 - 3df293404a102398445fc013b67250073db9004e - Fix type mismatch on 32-bit systems" I get this "test_matrix_output.txt":
oooooEEEEEE
oooooEEEEEE
oooooEEEEEE
oooooEEEEEE
oooooEEEEEE
With "zfs 0.6.4-54 - f1512ee61e2f22186ac16481a09d86112b2d6788 - Illumos 5027 - zfs large block support" I get this "test_matrix_output.txt":
EEEEEEEEEEE
oooooEEEEEE
oooooEEEEEE
oooooEEEEEE
oooooEEEEEE
(In the above matrix sleep time increases from left to right and pool size increases from top to bottom)
@bprotopopov I have also tested with sync commands added to the test script, and I did not see any change in behavior (the same error footprints, or the same lack of errors, for given versions of zfs/spl).
By the way, is there an easier method to obtain the ID of a single file for zdb than doing something like this:
zdb -dddd nas1FS/backup |grep -e "ZFS plain file" -e "path" | grep "home/bak/aa/wx/wxWidgets-2.8.12/additions/lib/vc_lib/wxmsw28ud_propgrid.pdb" -B 5
Sorry to ask that here, but maybe I am missing something obvious?
Thanks for the help; I really appreciate all the work that has been put into zfsonlinux.
Hi, @pwolny,
sorry, I don't know of an easier way to find the dnode number other than grepping for the name in the zdb output. I have noticed, however, that the first dnode number allocated for regular files in a ZFS filesystem is 7. So, if you have one file in a filesystem, it's #7. Also, if you know the order of file creation, the subsequent dnode numbers are going to be 8, 9, ...
I admit I am not quite following the description of the test results in your message. Do I understand it correctly that the datasets created by code with the fix for the metadata holes (illumos 6315) are OK when transmitted with zfs send (with or without the fix, it does not matter), but datasets created by code without that fix show errors (again, when transmitted using either version of the code)? That is expected, as the bug reflects itself in the on-disk data, not in the zfs send code: once the data is written out incorrectly, then unless you turn off the hole_birth optimization (the patch with the tunable discussed earlier in this thread), zfs send will not transmit the holes properly.
Is that the third case you are describing? That you were able to transmit broken on-disk data correctly with the patch that disabled the hole_birth optimization? If so, this is as expected.
Boris.
@pwolny at least in random sampling to confirm, object id <=> what is reported as "inode number" in stat() and friends, so you could just parse the output of "stat [path to file]" or call one of the stat family of functions.
Or did you mean something else with your question?
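A quick sketch of that (the /tmp path is purely illustrative; run stat against a file on the ZFS dataset you actually care about):

```shell
# The inode number reported by stat is the ZFS object (dnode) id, so there
# is no need to grep zdb output for the file path.
f=/tmp/example_file          # illustrative path; substitute your real file
touch "$f"
obj=$(stat -c %i "$f")
echo "object id: $obj"
# Then dump its block pointers directly (dataset name from this thread):
#   zdb -dddd nas1FS/backup "$obj"
```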
Well that's a good point - should be easy to see in the code if it is by design.
Typos courtesy of my iPhone
Yes! Thanks for the stat command insight. It was exactly what I was looking for.
@bprotopopov Sorry, I will try to clarify. Until now I have assumed that the file "wxmsw28ud_propgrid.pdb" had the same underlying problem (the hole_birth bug) as the files generated by the test script. Recently I used the same version of zfs (ZFS 0.6.5-329_g5c27b29 and SPL 0.6.5-63_g5ad98ad) to transfer three pools:
I have not applied any patches disabling the hole_birth optimization, at least to my knowledge. The test pool generated with an older (broken) version of zfs transferred incorrectly (using zfs 0.6.5-329_g5c27b29), so I assumed that this optimization was not present. Am I mistaken in that assumption?
I see. There were two bugs, Illumos 6513 and 6370, and the zfs version you are using to send/recv has the latter bug fixed. That is likely the reason for the positive test case you are asking about.
Boris.
My upgrade path for zfs was v0.6.5.3 -> 0.6.5.7 -> 0.6.5-329_g5c27b29. I created the test pool (which gives a broken checksum) and the original pool (containing the "wxmsw28ud_propgrid.pdb" file) on v0.6.5.3. I kept them both through the upgrades on the original hard drive set, and I was trying to transfer them to a new disk. Creation of those pools and files happened on v0.6.5.3, so I assumed the data on disk was corrupted. If I am not mistaken, the Illumos 6513 and 6370 patches appeared sometime after zfs v0.6.5.5, so I would still expect to get a checksum error on the "wxmsw28ud_propgrid.pdb" file when transferring using zfs 0.6.5-329_g5c27b29?
329 contains the fix for 6370
Ok, so that means the 6370 bug can be repaired after the fact and is not related to corrupted data on disk? If so, that would certainly explain the observed behavior and resolve this issue for me. Thanks for all the explanations; it really cleared things up for me.
Hi, I am testing this issue with the following patches based on 0.6.5.7, and there are still md5 mismatch issues.
I'm not sure if I missed any patches; if so, please let me know. I have modified the above-mentioned test script as follows, and it seems to reproduce the mismatch more easily.
#!/bin/bash
# cat zfs_hole_transmit_test.sh, modified by neil for testing

function zfs_snap_send_recv()
{
    zfs create -o recordsize=4k p1/hb
    zfs set compression=lz4 p1/hb
    truncate -s 1G /p1/hb/large_file
    #ls /p1/hb/large_file -als
    dd if=/dev/urandom of=/p1/hb/large_file bs=4k count=$((($RANDOM%100+3)*128)) seek=$((($RANDOM%100+1)*128)) 2> /dev/null
    zfs snapshot p1/hb@$test_count"_1"
    truncate -s $((($RANDOM%10+2)*128*4*1024)) /p1/hb/large_file
    dd if=/dev/urandom of=/p1/hb/large_file bs=4k count=$(($RANDOM%100+128)) seek=$((3*128)) conv=notrunc 2> /dev/null
    dd if=/dev/urandom of=/p1/hb/large_file bs=4k count=10 seek=$((($RANDOM%10+2)*128)) conv=notrunc 2> /dev/null
    zfs snapshot p1/hb@$test_count"_2"
    zfs send p1/hb@$test_count"_1" | zfs recv p1/backup
    zfs send -i p1/hb@$test_count"_1" p1/hb@$test_count"_2" | zfs recv p1/backup
}

function checkmd5()
{
    origin_md5=$(md5sum /p1/hb/large_file | awk '{print $1}')
    backup_md5=$(md5sum /p1/backup/large_file | awk '{print $1}')
    #echo $origin_md5 $backup_md5
    if [ "$origin_md5" = "$backup_md5" ]; then
        echo "MD5 match, $origin_md5, $backup_md5"
    else
        echo "MD5 mismatch, hole birth found, exit"
        echo "$origin_md5 /p1/hb/large_file"
        echo "$backup_md5 /p1/backup/large_file"
        exit 1
    fi
}

function zfs_clean()
{
    zfs destroy p1/backup -r
    zfs destroy p1/hb -r
}

zfs_clean
test_count=1
while true
do
    echo "Test count:$test_count"
    zfs_snap_send_recv
    checkmd5
    zfs_clean
    test_count=$(($test_count+1))
    #sleep 1
done
If you need any more information, please let me know.
Hi Neil, please double-check your test procedure, the module versions, etc.
Hi, Boris
This test procedure repeats the same pool create / file create / snapshot / transfer cycle in the script. Besides the dataset settings in the script, I create the pool with:
zpool create p1 sdb -f -o ashift=12 -o autoexpand=on
zpool set listsnapshots=on p1
And the testing environment & module version:
I failed to upload the corrupted files. The file is corrupted at 0030 0000 - 003F FFFF; the data in hb/large_file is all zeros there, while backup/large_file contains corrupted data in that range.
In my testing, at least two mismatches have been found, and more show up with further testing. If I have not missed any patches, this issue still exists and is not completely fixed yet. Besides ignoring hole_birth (https://github.com/zfsonlinux/zfs/pull/4833), I have no further ideas on this issue right now. I am glad to help fix it if any help is needed.
Neil, the reason I asked about the procedure was that I have definitely run my original script, as well as the scripts in the earlier related thread, and I have not observed any issues when the build included the fix for 6513.
Please attach the git log of the repo where you built ZFS, and the output of the version commands, from the system where your tests failed (I am assuming, of course, that it is still running the same version of ZFS/SPL).
Boris.
Boris, the git logs of spl and zfs are attached (gitlog.zip), and the versions are:
root@i-mruehtn9:~# cat /sys/module/zfs/version
0.6.5.7-1
root@i-mruehtn9:~# cat /sys/module/spl/version
0.6.5.7-1
Confirmed, with a 7 second delay between two writes: https://gist.github.com/Georgiy-Tugai/b7ae09767527d111ffb9f0a91b3e3bc4
Arch Linux x86-64, linux-lts 4.4.15-1, zfs-dkms 0.6.5.7-1, tested on a pool in a 3G file on tmpfs.
Strangely enough, the sparse files in my production data, which were sent/received once (from an almost virgin pool to another virgin pool, both with all features enabled), seem intact as far as I can tell.
I'd appreciate recommendations for mitigating the effect of this bug; should I recreate the pool with hole_birth disabled?
I also tested with the latest master branch, and the issue can still be reproduced within 0 to 6 hours. The master branch has already merged the patches mentioned above, so we need to keep looking at this issue beyond disabling hole_birth.
In order to help you, I need you to follow the build/test procedure I described earlier in this thread. I can walk you through the steps I outlined if you need help, but I need the info I requested: I have to be sure that you are running what you think you are running.
Boris.
Neil,
please provide the info I have requested earlier, for the test environment that uses the master branch.
Best regards,
Boris.
Hi, @pwolny,
could you please let me know if you still see any issues with missed holes for 0.6.5.7 amended with the fixes for illumos-6513 and illumos-6844 ?
I see @loyou and @Georgiy-Tugai suggest that there are still concerns, but I was not able to confirm so far that they are actually running the right code.
@bprotopopov, sorry for missing the latest information.
And the versions:
root@i-5mbc9uue:~/zfs_sf# cat /sys/module/zfs/version
0.6.5-329_g5c27b29
root@i-5mbc9uue:~/zfs_sf# cat /sys/module/spl/version
0.6.5-63_g5ad98ad
about the question:
I need you to follow the build/test procedure I have described earlier in this thread
I build spl & zfs with ./autogen.sh && ./configure && make && make install. I test with the script mentioned above, modified to create a filesystem, take a snapshot, create a file, truncate it to a random size, modify it with dd, take another snapshot, and then send/recv, looping over this.
@loyou
Can you please do the following: in your script, log the random offsets and counts that you generate. Then, when you see the failure, take those offsets and counts and plug them into your script.
My original script that was modified several times created a hole that covered an L1 region, and it used a combination of writes and truncates for that. Then that L1-size hole was partly overwritten, and the bug (zfs send does not know that the remaining L0 holes in the L1 ranges are new and need to be transmitted) was triggered.
This bug was totally deterministic, and no randomness was needed. When you see the failure with certain offsets, counts, you will be able to reproduce it 100% of time, including the first try.
If you can reproduce those, then please publish the non-random script. If you can't, I guess you might be dealing with something else.
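The logging suggestion can be sketched as follows (a sketch only; the variable names are illustrative, not from the original script, and a POSIX-friendly helper stands in for bash's $RANDOM):

```shell
# Log every randomly chosen dd/truncate parameter so a failing iteration can
# be replayed verbatim. rand16 is an assumed helper standing in for $RANDOM.
rand16() { od -An -N2 -tu2 /dev/urandom | tr -d ' '; }
log=/tmp/hb_params.log
count1=$((($(rand16) % 100 + 3) * 128))
seek1=$((($(rand16) % 100 + 1) * 128))
trunc=$((($(rand16) % 10 + 2) * 128 * 4 * 1024))
echo "count1=$count1 seek1=$seek1 trunc=$trunc" >> "$log"
# On failure, substitute the logged values for the random expressions, e.g.:
#   dd if=/dev/urandom of=/p1/hb/large_file bs=4k count=$count1 seek=$seek1 ...
```

With the parameters pinned, the failing sequence of writes and truncates can be rerun exactly, which is what makes the deterministic scripts later in this thread possible.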
Hi, @loyou, no, that's good info; your test procedure is OK. Here's what would be useful: log the offsets/sizes and truncate boundaries in your script, and change the script to stop and not clean up on failure (which it looks like it already does).
Then you could use the
#zdb -ddddd pool/fs object
as discussed above to dump the block pointers.
Also, for completeness, can you run the unmodified original script that I used (shown above as well) to see if you can reproduce the 6513 issue in a very simple way?
zfs create -o recordsize=4k tpool/test_fs
zfs set compression=on tpool/test_fs
truncate -s 1G /tpool/test_fs/large_file
dd if=/dev/urandom of=/tpool/test_fs/large_file bs=4k count=$((3*128)) seek=$((1*128)) oflag=direct
zfs snapshot tpool/test_fs@s1
truncate -s $((2*128*4*1024)) /tpool/test_fs/large_file
dd if=/dev/urandom of=/tpool/test_fs/large_file bs=4k count=128 seek=$((3*128)) conv=notrunc
dd if=/dev/urandom of=/tpool/test_fs/large_file bs=4k count=10 seek=$((2*128)) conv=notrunc
zfs snapshot tpool/test_fs@s2
zfs send tpool/test_fs@s1 | zfs recv tpool/test_fs_copy
zfs send -i tpool/test_fs@s1 tpool/test_fs@s2 | zfs recv tpool/test_fs_copy
md5sum /tpool/test_fs/large_file
md5sum /tpool/test_fs_copy/large_file
that would be very helpful.
@loyou I ran your script on my CentOS 6.7 VM for 7 hours; it went through 1110 iterations or so with no failures. I had the zfs/spl master:
[root@centos-6 zfs-github]# cat /sys/module/{zfs,spl}/version
0.6.5-330_g590c9a0
0.6.5-64_gd2f97b2
just one commit ahead of yours in each repo (seemingly unrelated).
@bprotopopov I have run your script, and it does not reproduce the 6513 issue.
I followed your suggestion and logged the offsets and sizes in the script, then tested with the non-random script. It is not a 100% reproducible issue; sometimes the checksums do match.
During the test, I found that with the default build ("./autogen.sh && ./configure && make && make install"), which does not create /etc/zfs/zpool.cache, the mismatch reproduces at a high rate. However, if /etc/zfs/zpool.cache is created, it is hard to reproduce (but still possible). I am not sure if that makes sense.
Because there is no zpool.cache, zdb -ddddd cannot be run. I have set up a VM server that still shows the issue; I will send a separate email sharing the IP/user/password for online checking.
I will keep investigating as well. Thank you!
@loyou, well, in that case I am not sure it is 100% the same issue, if it is not deterministic.
If you import the pool, you should be able to run zdb regardless, I believe.
OK, you can contact me offline at bprotopopov@hotmail.com regarding the test VM, and I will 'script' the terminal session for future reference. I will try to get to this over the weekend but cannot guarantee it.
@bprotopopov, thanks. This is one of the reproducing cases:
zfs create -o recordsize=4k p1/hb
truncate -s 1G /p1/hb/large_file
dd if=/dev/urandom of=/p1/hb/large_file bs=4k count=11264 seek=1152
zfs snapshot p1/hb@2_1
truncate -s 4194304 /p1/hb/large_file
dd if=/dev/urandom of=/p1/hb/large_file bs=4k count=152 seek=384 conv=notrunc
dd if=/dev/urandom of=/p1/hb/large_file bs=4k count=10 seek=1408 conv=notrunc
zfs snapshot p1/hb@2_2
zfs send p1/hb@2_1 | zfs recv p1/backup
zfs send -i p1/hb@2_1 p1/hb@2_2 | zfs recv p1/backup
MD5 mismatch, hole birth found, exit
d0226974fcab038c843250210b96d401 /p1/hb/large_file
f808cdf1979f0eabf19835ea3e09e8e3 /p1/backup/large_file
I also attached the block pointer dumps of p1/hb and p1/backup: p1_hb.zdb.ddddd.txt, p1_backup.zdb.ddddd.txt
Hi, @loyou,
just ran your script. It works fine on my system. The issue you are seeing on your system, based on the attached zdb output files, does seem like a 6513 problem:
the two L1 ranges at offsets 480000 and 500000 are supposed to be holes with non-zero birth epochs in your source and destination filesystems, yet they are filled with data in your backup, and it looks like they are treated as birth-epoch-0 holes in your source filesystem.
You can use
zdb -R p1 0:237e9c00:200:di
to dump the L2 metadata block of your source dataset and you will see for sure what the block pointers are.
For the correct L2 on my system, it looks like:
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
DVA[0]=<3:2f9ada00:1600> DVA[1]=<0:f1f3e00:1600> [L1 ZFS plain file] fletcher4 lz4 LE contiguous unique double size=4000L/1600P birth=111432L/111432P fill=128 cksum=1e14b0d63f2:54c26d5ac1e69:9ba0888976a4e85:59c1e2dcbc65f21a
DVA[0]=<2:e13e200:600> DVA[1]=<3:2fc86200:600> [L1 ZFS plain file] fletcher4 lz4 LE contiguous unique double size=4000L/600P birth=111432L/111432P fill=24 cksum=69551c757e:5c6ed46d7d58:2e7671432ae939:11145422d7e7f630
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L1 ZFS plain file] size=4000L birth=111432L
HOLE [L1 ZFS plain file] size=4000L birth=111432L
DVA[0]=<0:f1d3a00:400> DVA[1]=<1:7f7e600:400> [L1 ZFS plain file] fletcher4 lz4 LE contiguous unique double size=4000L/400P birth=111432L/111432P fill=10 cksum=34e9e21632:22aef29e5e10:c2175404e0c4b:2ff61fbf03444b6
...
with the 480000 and 500000 ranges showing
...
HOLE [L1 ZFS plain file] size=4000L birth=111432L
HOLE [L1 ZFS plain file] size=4000L birth=111432L
...
showing holes with non-zero birth epochs (every line covers an L1 range of size 0x80000).
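As a quick sanity check on that 0x80000 figure (assuming the geometry visible in the zdb output: 4 KiB data blocks, 16 KiB L1 indirect blocks, 128-byte block pointers):

```shell
# Each 16 KiB (0x4000) L1 indirect block holds 128 block pointers of 128
# bytes each; with recordsize=4k, one L1 therefore covers 0x80000 bytes.
recordsize=$((0x1000))   # 4 KiB data blocks (recordsize=4k)
indirect=$((0x4000))     # L1 indirect block size (size=4000L in zdb)
bp=128                   # bytes per block pointer
l1_range=$(( (indirect / bp) * recordsize ))
printf 'L1 range: 0x%X\n' "$l1_range"   # prints: L1 range: 0x80000
```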
@bprotopopov we can see the metadata of p1/{hb,backup}/large_file; p1/backup/large_file gets 2 more L1 indirect blocks than p1/hb/large_file, which is exactly the corrupted region we get.
L2 metadata of p1/hb/large_file, zdb -R p1 0:237e9c00:200:di
p1_hb_L2_0:237e9c00:200:di.txt
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
DVA[0]=<0:237c5c00:1600> DVA[1]=<0:4e0c6fc00:1600> [L1 ZFS plain file] fletcher4 lz4 LE contiguous unique double size=4000L/1600P birth=111640L/111640P fill=128 cksum=1f43e582eff:5d758297d1932:ac7dbe127589265:cc40479b7a6c1a02
DVA[0]=<0:237df200:600> DVA[1]=<0:4e0c71200:600> [L1 ZFS plain file] fletcher4 lz4 LE contiguous unique double size=4000L/600P birth=111640L/111640P fill=24 cksum=65f68de822:617218e5a156:33dad79632620d:13e30014e5ff9c77
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
DVA[0]=<0:237e9800:400> DVA[1]=<0:4e0c71800:400> [L1 ZFS plain file] fletcher4 lz4 LE contiguous unique double size=4000L/400P birth=111640L/111640P fill=10 cksum=3c70471c1a:2ab5b049d594:fc66313f88a2b:40d11c9085f6614
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
...
and L2 metadata of p1/backup/large_file, zdb -R p1 0:2655ec00:200:di
p1_backup_L2_0:2655ec00:200:di.txt
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
DVA[0]=<0:26545000:1600> DVA[1]=<0:4e0d28000:1600> [L1 ZFS plain file] fletcher4 lz4 LE contiguous unique double size=4000L/1600P birth=111651L/111651P fill=128 cksum=1ea63edcd43:5a811c96f8444:a6539e50ad7e670:4bb8a78beaab38b4
DVA[0]=<0:2655e600:600> DVA[1]=<0:4e0d29600:600> [L1 ZFS plain file] fletcher4 lz4 LE contiguous unique double size=4000L/600P birth=111651L/111651P fill=24 cksum=65446fb211:6076ba7f1844:3324c8e0e621bb:1389c7c570b9cccb
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
HOLE [L0 unallocated] size=200L birth=0L
DVA[0]=<0:2388e400:1600> DVA[1]=<0:4e0c8f400:1600> [L1 ZFS plain file] fletcher4 lz4 LE contiguous unique double size=4000L/1600P birth=111646L/111646P fill=128 cksum=1db560423dd:59cd5dadfa628:a7c689bdf8d691d:8db5c5a36c3f22e9
DVA[0]=<0:2390fa00:1600> DVA[1]=<0:4e0c90a00:1600> [L1 ZFS plain file] fletcher4 lz4 LE contiguous unique double size=4000L/1600P birth=111646L/111646P fill=128 cksum=1fece6089a8:5dbded8753784:aae5c4b54dd159f:8b6929104e0662ea
DVA[0]=<0:264c4c00:400> DVA[1]=<0:4e0d27c00:400> [L1 ZFS plain file] fletcher4 lz4 LE contiguous unique double size=4000L/400P birth=111651L/111651P fill=10 cksum=3a6257cc44:288124e7f014:ec3d8e9dd5e64:3c21fed6a53e702
HOLE [L1 ZFS plain file] size=4000L birth=111651L
HOLE [L1 ZFS plain file] size=4000L birth=111651L
HOLE [L1 ZFS plain file] size=4000L birth=111651L
HOLE [L1 ZFS plain file] size=4000L birth=111651L
...
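To help map these zdb entries back to file offsets: with recordsize=4k and 16 KiB indirect blocks (size=4000L in the dumps above) holding 128-byte block pointers, each L1 entry covers a fixed span of the file. A small sketch of that arithmetic (my own illustration for readers of this thread, not part of zdb):

```python
# Sketch: map an entry index in a ZFS indirect block to the file offset
# range it covers, given the dataset recordsize and indirect block size.
# Assumes the layout seen in this thread: 128-byte block pointers,
# 4K recordsize, 16K (0x4000) indirect blocks.

BLKPTR_SIZE = 128  # bytes per block pointer

def l1_entry_span(recordsize, indirect_size):
    """Bytes of file data covered by one L1 indirect block."""
    ptrs_per_indirect = indirect_size // BLKPTR_SIZE
    return ptrs_per_indirect * recordsize

def l2_entry_range(index, recordsize, indirect_size):
    """File offset range covered by entry `index` of an L2 indirect block."""
    span = l1_entry_span(recordsize, indirect_size)
    return (index * span, (index + 1) * span)

if __name__ == "__main__":
    # recordsize=4k, indirect blocks are 16 KiB (size=4000L in the zdb dump)
    span = l1_entry_span(4096, 0x4000)
    print("each L1 covers %d KiB" % (span // 1024))  # 512 KiB
    print("L2 entry 3 covers offsets %#x-%#x" % l2_entry_range(3, 4096, 0x4000))
```

So each fill=128 L1 entry above accounts for a fully populated 512 KiB region of the file, which is why two extra L1 entries in p1/backup/large_file represent a sizable corrupted span.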
We have a zpool which I think is affected by this issue and I'm trying to work out the correct commits to build to resolve this. I am working on top of the zfs-0.6.5.7 tag:
Illumos 6844 - dnode_next_offset can detect fictional holes - 463a8cfe2b293934edd2ee79115b20c4598353d6
OpenZFS 6513 - partially filled holes lose birth time - bc77ba73fec82d37c0b57949ec29edd9aa207677
Skip ctldir znode in zfs_rezget to fix snapdir issues - cbecb4fb228bbfedec02eb0a172c681b4338d08f
Want tunable to ignore hole_birth - 31b86019fd1c2a75548d0a3a8cfd0a29023d55b2
root# cat /sys/module/zfs/parameters/ignore_hole_birth
1
Originally we saw this issue because the incrementally received stream showed the content diverging between the zvol and the file. Doing some testing now, though, it seems that the file gets corrupted as soon as we take the snapshot in the source pool. WriteDiffs.py is a tool we developed to write only the changed blocks in a file, so we get copy-on-write benefits.
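For context, the core idea of such a tool can be sketched as below. This is a minimal illustration of "compare and rewrite only differing blocks"; the real WriteDiffs.py (linked later in this thread) differs in detail, e.g. it uses mmap:

```python
# Minimal sketch of a "write only changed blocks" copy, so unchanged
# blocks stay shared (copy-on-write) between ZFS snapshots.
# Illustration only - not the actual WriteDiffs.py.

def write_diffs(src_path, dst_path, blocksize=4096):
    """Copy src over dst, rewriting only blocks that differ.
    Returns (compared, rewritten) block counts."""
    compared = rewritten = 0
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        while True:
            offset = src.tell()
            sblock = src.read(blocksize)
            if not sblock:
                break
            dst.seek(offset)
            dblock = dst.read(len(sblock))
            compared += 1
            if sblock != dblock:
                # Only differing blocks are rewritten, so identical
                # blocks are never dirtied and remain shared on disk.
                dst.seek(offset)
                dst.write(sblock)
                rewritten += 1
        dst.truncate(src.tell())  # match the source length
    return compared, rewritten
```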
root# WriteDiffs.py -c -s /dev/zvol/ztank/reference/xvda2@ref4000 /ztank/reference/config.ext4
Source: /dev/zvol/ztank/reference/xvda2@ref4000; sha256: b1c4d74fe8f18fbdb0f58074d043f9ba818e26a67ec64a835057e763ac099f13
Destination: /ztank/reference/config.ext4; sha256: b1c4d74fe8f18fbdb0f58074d043f9ba818e26a67ec64a835057e763ac099f13
Start Time: 1470734255.66
End Time: 1470734256.41
Run Time: 0.75s
Total Compared Blocks: 31232
Matched Blocks: 31232; Unmatched Blocks: 0
Compared 41720.54 blocks/s; 162.97 Mbytes/s
# This execution of WriteDiffs.py did not make any change to the copy of the file at /ztank/reference/config.ext4
root# sha256sum /dev/zvol/ztank/reference/xvda2@ref4000 /ztank/reference/config.ext4
b1c4d74fe8f18fbdb0f58074d043f9ba818e26a67ec64a835057e763ac099f13 /dev/zvol/ztank/reference/xvda2@ref4000
b1c4d74fe8f18fbdb0f58074d043f9ba818e26a67ec64a835057e763ac099f13 /ztank/reference/config.ext4
root# zfs snapshot -r hapseed/reference/filesystems/imageslv@james
root# sha256sum /dev/zvol/ztank/reference/xvda2@ref4000 /ztank/reference/config.ext4 /ztank/reference/.zfs/snapshot/ref4000/config.ext4 /ztank/reference/.zfs/snapshot/james/config.ext4
b1c4d74fe8f18fbdb0f58074d043f9ba818e26a67ec64a835057e763ac099f13 /dev/zvol/ztank/reference/xvda2@ref4000
b1c4d74fe8f18fbdb0f58074d043f9ba818e26a67ec64a835057e763ac099f13 /ztank/reference/config.ext4
4b0906ecf1d4b4356acc7b20a05ed4744f38eede72fc1f96fe138932e37b3422 /ztank/reference/.zfs/snapshot/ref4000/config.ext4
4b0906ecf1d4b4356acc7b20a05ed4744f38eede72fc1f96fe138932e37b3422 /ztank/reference/.zfs/snapshot/james/config.ext4
When we send/receive the snapshot, both the @ref4000 snapshot and the file in the filesystem on the receiver are corrupted to the snapshot checksum:
root@other-host# sha256sum /dev/zvol/ztank/reference/xvda2@ref4000 /ztank/reference/config.ext4 /ztank/reference/.zfs/snapshot/ref4000/config.ext4
b1c4d74fe8f18fbdb0f58074d043f9ba818e26a67ec64a835057e763ac099f13 /dev/zvol/ztank/reference/xvda2@ref4000
4b0906ecf1d4b4356acc7b20a05ed4744f38eede72fc1f96fe138932e37b3422 /ztank/reference/config.ext4
4b0906ecf1d4b4356acc7b20a05ed4744f38eede72fc1f96fe138932e37b3422 /ztank/reference/.zfs/snapshot/ref4000/config.ext4
@JKDingwall Fascinating.
Unless I'm misrecalling how this works, you shouldn't be able to see this bug without using incremental send/recv or parsing zdb output. The only thing in ZFS that consults the incorrect birth times is the calculation of whether to send holes during an incremental send, and any data corruption only "happens" on the receiver of the snapshot, because some holes were never sent.
So, if you're seeing corruption happening before you do any send|recv, I don't think this is the bug in question?
I guess it depends on what WriteDiffs.py is doing.
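To make the mechanism above concrete, here is a toy model (my own simplification for this thread, not ZFS code) of why a hole whose birth time was wrongly recorded as 0 gets silently skipped by an incremental send:

```python
# Toy model of the hole_birth optimization in incremental send.
# Each block record is (kind, birth_txg). An incremental send from
# txg `from_txg` only transmits records with birth > from_txg; a hole
# punched after the base snapshot but recorded with birth=0 (the bug)
# is therefore never sent, and the receiver keeps stale data there.

def incremental_send(blocks, from_txg):
    """Return indices of block records an incremental send would emit."""
    return [i for i, (kind, birth) in enumerate(blocks) if birth > from_txg]

# File state at the later snapshot: block 1 became a hole at txg 120,
# but in the buggy case its birth was mis-recorded as 0.
correct = [("data", 100), ("hole", 120), ("data", 120)]
buggy   = [("data", 100), ("hole", 0),   ("data", 120)]

print(incremental_send(correct, 110))  # [1, 2] - the hole is sent
print(incremental_send(buggy, 110))    # [2] - the hole is silently skipped
```

The receiver of the buggy stream never learns block 1 became a hole, so its copy of that block keeps the old data, matching the divergence described in this thread.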
This zpool is on a master system for our CI system, which produces an incremental patch stream each night for that build against our previous release. We started seeing problems in the incremental stream on receive last week, and because of how we update the config.ext4 image we assumed it was related to this bug. It was only today that I looked at the checksums on the master zpool and saw that the file is corrupted in the snapshot before sending. What would the interesting tests or details about the system be to confirm whether this is a different issue?
@JKDingwall The biggest thing is that we need to know how WriteDiffs.py works. This bug is in a very specific set of code, and if you're encountering this issue by running userland scripts, there's only a couple ways that could be done.
Referencing https://github.com/openzfs/openzfs/pull/173 (7176 Yet another hole birth issue) [OpenZFS]
and its ZoL port https://github.com/zfsonlinux/zfs/pull/4950 (OpenZFS 7176 Yet another hole birth issue) [PR4950]
@kernelOfTruth, I fixed the build error of your pull request, but I can still reproduce with my script.
@pcd1193182 I have created a gist with the WriteDiffs.py code: https://gist.github.com/JKDingwall/7f7200a99f52fd3c61b734fbde31edee/49b86a26cb3b9cfd1819d7958e7c8808f6fd991e
On our system, after a reboot, the checksum of the file in the filesystem also changed to match the (corrupted) snapshot. I believe, however, that there was a bug in our code: the mmap for the output file was not flushed before it was closed. The HEAD of the gist contains the latest version, which does flush, and so far we have not seen the problem again. It would be useful to know whether you think this could have been the cause. If so, we have been lucky not to have been caught out before.
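For illustration, the mmap pitfall described above can be sketched as follows (a generic sketch, not the gist's actual code). The explicit flush() is an msync: it pushes the dirty pages to the file, so that an external observer such as a snapshot taken right afterwards sees the new contents rather than stale data still pending in the page cache:

```python
# Sketch: patch bytes in a file through a writable mmap, flushing
# before close. Without the flush()/msync, the write may still be
# sitting in dirty pages when something else (e.g. a ZFS snapshot)
# captures the file - the suspected cause discussed above.
import mmap

def patch_block(path, offset, data):
    """Overwrite `data` at `offset` in the file via a writable mmap."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:  # map the whole file, writable
            m[offset:offset + len(data)] = data
            m.flush()  # msync dirty pages to the file before close
```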
@JKDingwall so you are no longer able to reproduce the problem? Can you confirm whether the patch "Want tunable to ignore hole_birth - 31b8601", with the tunable set to 1, was the one that finally stopped the issue from happening?
@clopez - I can still reproduce the problem. Patches as given above, ignore_hole_birth=1.
Last night I got:
zvol sha256sum: db71ec877acc4f2abc6a9e01781df46e3d1e1416cc2561ccea51d25f348b65ba
file after WriteDiffs.py: db71ec877acc4f2abc6a9e01781df46e3d1e1416cc2561ccea51d25f348b65ba
snapshot of file after WriteDiffs.py: ebc8d6bd78578640839b0cc4f78695bd40eebce2582d28eb32979a712fe19047
I think if I reboot I will see the file corrupt to the snapshot checksum...
zvol: db71ec877acc4f2abc6a9e01781df46e3d1e1416cc2561ccea51d25f348b65ba
file: ebc8d6bd78578640839b0cc4f78695bd40eebce2582d28eb32979a712fe19047
file snapshot: ebc8d6bd78578640839b0cc4f78695bd40eebce2582d28eb32979a712fe19047
Could it be a cache consistency problem with holes?
@JKDingwall you wrote: "...It was only today I looked at the checksums on the master zpool and saw that the file is corrupted in the snapshot before sending. What would the interesting tests or details about the system be to confirm if this is a different issue?"
If you modify the pool on your master system using zfs send/recv (I mean, if you receive into that master pool), then this might be an issue with missed holes. If you do not receive into the master pool and still see corruption, then it is a different issue.
There are ways of examining the metadata (block pointers) of the suspect files using zdb, as discussed in this thread, that let you see whether holes were missed. But the prerequisite for this type of bug to occur is that the dataset is modified by zfs receive.
@bprotopopov - there is no zfs receive involved so I will try to gather further information with a reproducible test case and then open a new issue.
I do not know if this belongs here. I ran the test script from above and find the result very strange.
ZFSPOOL='DS1'

zfs destroy $ZFSPOOL/backup -r
zfs destroy $ZFSPOOL/hb -r
zfs create -o recordsize=4k $ZFSPOOL/hb
truncate -s 1G /$ZFSPOOL/hb/large_file
dd if=/dev/urandom of=/$ZFSPOOL/hb/large_file bs=4k count=11264 seek=1152
zfs snapshot $ZFSPOOL/hb@2_1
truncate -s 4194304 /$ZFSPOOL/hb/large_file
dd if=/dev/urandom of=/$ZFSPOOL/hb/large_file bs=4k count=152 seek=384 conv=notrunc

sleep 5 # With sleep >= 5 seconds, the checksums differ.

dd if=/dev/urandom of=/$ZFSPOOL/hb/large_file bs=4k count=10 seek=1408 conv=notrunc
zfs snapshot $ZFSPOOL/hb@2_2
zfs send $ZFSPOOL/hb@2_1 | zfs recv $ZFSPOOL/backup
zfs send -i $ZFSPOOL/hb@2_1 $ZFSPOOL/hb@2_2 | zfs recv $ZFSPOOL/backup
md5sum /$ZFSPOOL/hb/large_file /$ZFSPOOL/backup/large_file
I have run it on two different systems, more than 50 times. With sleep < 5 seconds everything is OK; with sleep >= 5 seconds the checksums differ.
System 1: Debian 8, ZFS 0.6.5.7-8-jessie
System 2: Ubuntu 16.04, ZFS 0.6.5.6-0ubuntu10
@datiscum Unless this is going to become an overarching bug for all hole_birth bugs, this isn't a case of the bug in illumos 6513, but instead of illumos 7176, which has an open PR for ZoL of #4981.
So in short, I think your problem isn't the problem being discussed in this bug, but is a known issue.
Hey @datiscum, while writing test scripts for all the hole_birth bugs found, I found that indeed, your reported bug isn't any of the ones currently in play! So I filed #5030 about it. You might want to subscribe to that to see when it's fixed.
@JKDingwall Presuming your issue hasn't been magically fixed in the interim, could you open a new issue about it, since I don't think your bug is the same bug this thread is about?
Closing. The hole_birth optimization has been disabled by default in the 0.6.5.8 release, 3a8e13688b4ee8c1799edb6484943eae5ea90dd2. Work to resolve the various underlying issues is ongoing in master.
@behlendorf can we also disable it for 0.7.0 and master?
Even though I have been following the discussion, I just noticed that I had forgotten to disable that setting at module load time.
I haven't used zfs send for the last few weeks but intend to do so again in the near future.
For e.g. Gentoo or other bleeding-edge users who aren't aware of the details, this could cause some headaches if there are still issues dormant in the code.
Thanks
I have experienced a snapshot silently corrupted by send/receive: the checksum of a single file on source and target does not match, both on the filesystem and in the snapshots. Repeated scrubs of the tested pools do not show any errors. Trying to replicate the source filesystem results in the modification of a single file on the target pool, in the received filesystem and all its snapshots. The source pool is a 6x1TB RAIDZ2 array on Debian 8.2.0 (Jessie, kernel 3.16.0-4-amd64, installed from DVDs, no additional updates) with version 0.6.5.3 of ZFS/SPL built from source (standard configure, no patches).
My source pool (“nas1FS”) was created on a non-ECC RAM machine and, after being filled with data through a samba share (standard samba server; on the source fs sharesmb=off), was moved to a different computer with ECC RAM (I used the same operating system on both machines). I could understand non-ECC RAM in the first computer causing permanent corruption visible on both source and target pools, but in this case the data is only silently changed during the transfer on the new computer with ECC RAM, and the source pool data seems to be fine. This corruption of data during send/receive is repeatable.
To better explain what I have done: First I have created a 6x1TB raidz2 pool on my old computer ("nas1FS"). After filling this pool with data I have moved the array from old computer to a new one and I have tried to backup the data on “nas1FS” pool to a different pool (“backup_raidz“).
“nas1FS” pool contained following snapshots that are of interest in this issue:
I have created a “backup_raidz” pool for backup (with compression turned on):
# zpool create -o ashift=12 -O compression=gzip-9 backup_raidz raidz1 /dev/disk/by-id/ata-SAMSUNG_HD103UJ_SerialNo1 /dev/disk/by-id/ata-SAMSUNG_HD103UJ_SerialNo2 /dev/disk/by-id/ata-HGST_HTS721010A9E630_SerialNo3 -f
Afterwards I have tried to replicate “nas1FS” pool.
# zfs send -R -vvvv "nas1FS@20160618" |zfs receive -vvvv -F "backup_raidz/nas1FS_bak"
This command finished successfully without any error. I have executed the following commands to get a list of file checksums on both source and target:
and compared the resulting files
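(The exact commands used are not preserved in this excerpt. For reference, a checksum listing of this kind can be produced with a small script; this is a hedged sketch of the general technique, not the reporter's actual commands:)

```python
# Sketch: produce a sorted "md5  relative-path" listing for every file
# under a mount point, so listings from the source and target pools can
# be diffed. Not the original commands from this report.
import hashlib
import os

def md5_listing(root):
    lines = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            h = hashlib.md5()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            lines.append("%s  %s" % (h.hexdigest(), os.path.relpath(path, root)))
    return sorted(lines)
```

Running this against the source mount and the received copy, then diffing the two outputs, pinpoints any mismatching files.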
I have found a checksum mismatch on a single file:
Correct checksum is “3178c03d3205ac148372a71d75a835ec”, it was verified on the source used to populate the “nas1FS” filesystem.
This checksum mismatch was propagated through all snapshots in which the file was present on target pool:
The source pool showed the correct checksum on all snapshots in which the offending file was accessible.
Trying to access this file on a snapshot when it did not exist (“backup@20151121_1”) results in a “No such file or directory” on target pool (“backup_raidz” or “backup_raidz_test”).
When I tried to access the offending file on “nas1FS” with the command:
# md5sum /nas1FS/backup/.zfs/snapshot/20151121_1/samba_share/a/home/bak/aa/wx/wxWidgets-2.8.12/additions/lib/vc_lib/wxmsw28ud_propgrid.pdb
it resulted in a very hard system lockup: I could not get any reaction to “Ctrl-Alt-SysRq-h” and similar key combinations, all IO to the disks stopped completely and immediately, the system stopped responding to ping, and only a hard reset achieved any reaction. After the hard reset everything was working, and the file checksum results mentioned above were unchanged. I have also tried a send/receive to a different target pool (a single 1TB HGST disk):
# zfs send -R -vvvv "nas1FS/backup@20160618" |zfs receive -vvvv -F "backup_raidz_test/nas1FS_bak"
This resulted in the same md5sum mismatches. When sending only the latest snapshot with:
# zfs send -vvvv "nas1FS/backup@20160618" |zfs receive -vvvv -F "backup_raidz_test/backup"
I get a correct md5sum on the target filesystem. When trying to do an incremental send/receive from the first available snapshot on the source pool:
# zfs send -vvvv "nas1FS/backup@20151121_1" |zfs receive -vvvv -F "backup_raidz_test/backup"
the offending file is not present on the target or source pool, and trying to access it on the target pool does not cause any issues. After:
# zfs send -vvvv -I "nas1FS/backup@20151121_1" "nas1FS/backup@20151124" |zfs receive -vvvv -F "backup_raidz_test/backup"
I again get a checksum mismatch. When trying to do an incremental send/receive from the second available snapshot on the source pool, I get correct checksums on both snapshots on the target pool ....
It is interesting to note that only a single 4096-byte block of data is corrupted, at the end of the file (which has a size of 321 x 4096 bytes), and only when the transfer starts from the first source snapshot ("nas1FS/backup@20151121_1").
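The "single corrupted block" observation can be verified with a small comparison script (my own sketch, assuming two locally readable copies of the file):

```python
# Sketch: report which fixed-size blocks differ between two files,
# e.g. a file from the source pool and its received copy.
import itertools

def differing_blocks(path_a, path_b, blocksize=4096):
    """Return the indices of blocks whose contents differ."""
    diffs = []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        for i in itertools.count():
            a, b = fa.read(blocksize), fb.read(blocksize)
            if not a and not b:
                break
            if a != b:  # also catches a length mismatch at the tail
                diffs.append(i)
    return diffs
```

For a 321-block file corrupted only at the end, this would return just [320].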
Binary comparison of the offending file:
I have also run zdb on the source pool; the commands I checked did not report any errors:
To summarize used system configurations:
System used for “nas1FS” data fillup (old computer): Motherboard MSI X58 Pro (MSI MS-7522/MSI X58 Gold) with Intel Quad Core i7 965 3.2GHz and 14GB non-ECC RAM (MB, CPU and RAM and PS are about 6 years old). “/boot” : INTEL SSDMCEAW080A4 80GB SSD “nas1FS” pool: 6x1TB HGST Travelstar 7K1000 in a RAIDZ2 array (HDDs are ~6months old).
New system to which “nas1FS” was moved (all disks): Motherboard Supermicro A1SA7-2750F (8 core Intel Atom) with 32GB ECC RAM (MB and RAM and PS are new). “/boot” : INTEL SSDMCEAW080A4 80GB SSD “nas1FS” pool: 6x1TB HGST Travelstar 7K1000 in a RAIDZ2 array (moved from the old computer). “backup_raidz” pool: 2x1TB Samsung HD103UJ + 1x1TB HGST Travelstar 7K1000 (pool used for backup) “backup_raidz_test” pool: 1TB Samsung HD103UJ (pool with no parity, for additional tests)
Both systems were tested with memtest, cpuburn etc. without errors. I am using Debian Jessie booted from zfs pool (with a separate boot partition), same operating system on both machines used with “nas1FS” pool.
Kernel Command line:
BOOT_IMAGE=/vmlinuz-3.16.0-4-amd64 root=ZFS=/rootFS/1_jessie ro rpool=nas1FS bootfs=nas1FS/rootFS/1_jessie root=ZFS=nas1FS/rootFS/1_jessie rootfstype=zfs boot=zfs quiet
SPL and ZFS was built from source.
Some excerpts from dmesg (blocked messages are not connected to the hard lockup of the system):
I hope this information will be helpful, but feel free to let me know what other tests I can perform to diagnose this issue. I will be happy to provide any other info.