tarka / xcp

An extended `cp`
GNU General Public License v3.0

Test failures on ZFS (Linux but not FreeBSD) #37

Closed: hg closed this issue 11 months ago

hg commented 1 year ago

Hello,

I am experiencing test failures on ZFS filesystems on Linux. This is all running on amd64. All datasets have lots of free space.

Linux on ZFS

$ uname -sr
Linux 6.3.7-arch1-1

$ rustc --version
rustc 1.70.0 (90c541806 2023-05-31) (Arch Linux rust 1:1.70.0-1)
Test output

```
$ cargo test --release
Finished release [optimized] target(s) in 0.05s
Running unittests src/main.rs (target/release/deps/xcp-448f490b9f67ea08)

running 35 tests
test os::common::tests::test_extent_merge ... ok
test os::linux::tests::test_allocate_file_is_sparse ... ok
test os::common::tests::test_copy_range_uspace_large ... ok
test os::common::tests::test_copy_bytes_uspace_large ... ok
test os::linux::tests::test_copy_bytes_sparse ... ok
test os::linux::tests::test_empty_extent ... FAILED
test os::linux::tests::test_extent_fetch ... FAILED
test os::linux::tests::test_copy_range_middle ... ok
test os::linux::tests::test_extent_fetch_many ... FAILED
test os::linux::tests::test_sparse_detection ... ok
test os::linux::tests::test_sparse_copy_middle ... ok
test vendor::threadpool::test::test_active_count ... ignored
test vendor::threadpool::test::test_clone ... ignored
test vendor::threadpool::test::test_cloned_eq ... ignored
test vendor::threadpool::test::test_debug ... ignored
test vendor::threadpool::test::test_empty_pool ... ignored
test vendor::threadpool::test::test_join_wavesurfer ... ignored
test vendor::threadpool::test::test_massive_task_creation ... ignored
test vendor::threadpool::test::test_multi_join ... ignored
test vendor::threadpool::test::test_name ... ignored
test vendor::threadpool::test::test_no_fun_or_joy ... ignored
test vendor::threadpool::test::test_recovery_from_subtask_panic ... ignored
test vendor::threadpool::test::test_repeate_join ... ignored
test vendor::threadpool::test::test_send ... ignored
test vendor::threadpool::test::test_send_shared_data ... ignored
test vendor::threadpool::test::test_set_num_threads_decreasing ... ignored
test vendor::threadpool::test::test_set_num_threads_increasing ... ignored
test vendor::threadpool::test::test_should_not_panic_on_drop_if_subtasks_panic_after_drop ... ignored
test vendor::threadpool::test::test_shrink ... ignored
test vendor::threadpool::test::test_sync_shared_data ... ignored
test vendor::threadpool::test::test_works ... ignored
test vendor::threadpool::test::test_zero_tasks_panic ... ignored
test os::linux::tests::test_sparse_rust_seek ... ok
test os::linux::tests::test_lseek_data ... ok
test os::linux::tests::test_lseek_no_data ... ok

failures:

---- os::linux::tests::test_empty_extent stdout ----
Error: Operation not supported (os error 95)

---- os::linux::tests::test_extent_fetch stdout ----
Error: Operation not supported (os error 95)

---- os::linux::tests::test_extent_fetch_many stdout ----
Error: Operation not supported (os error 95)

failures:
    os::linux::tests::test_empty_extent
    os::linux::tests::test_extent_fetch
    os::linux::tests::test_extent_fetch_many

test result: FAILED. 11 passed; 3 failed; 21 ignored; 0 measured; 0 filtered out; finished in 0.01s

error: test failed, to rerun pass `--bin xcp`
```

FreeBSD on ZFS

FreeBSD ignores Linux-specific tests, so the whole test suite runs fine.

$ uname -a
FreeBSD bsd 13.2-RELEASE FreeBSD 13.2-RELEASE releng/13.2-n254617-525ecfdad597 GENERIC amd64

$ rustc --version
rustc 1.68.2 (9eb3afe9e 2023-03-27) (built from a source tarball)
Test output

```
$ cargo test --release
warning: unused variable: `file`
   --> tests/util.rs:170:24
    |
170 | pub fn probably_sparse(file: &Path) -> Result {
    |                        ^^^^ help: if this is intentional, prefix it with an underscore: `_file`
    |
    = note: `#[warn(unused_variables)]` on by default

warning: `xcp` (test "util") generated 1 warning
warning: `xcp` (test "common") generated 1 warning (1 duplicate)
Finished release [optimized] target(s) in 0.06s
Running unittests src/main.rs (target/release/deps/xcp-d14a3602a95775bb)

running 24 tests
test os::common::tests::test_copy_range_uspace_large ... ok
test os::common::tests::test_copy_bytes_uspace_large ... ok
test vendor::threadpool::test::test_active_count ... ignored
test vendor::threadpool::test::test_clone ... ignored
test vendor::threadpool::test::test_cloned_eq ... ignored
test vendor::threadpool::test::test_debug ... ignored
test vendor::threadpool::test::test_empty_pool ... ignored
test vendor::threadpool::test::test_join_wavesurfer ... ignored
test vendor::threadpool::test::test_massive_task_creation ... ignored
test vendor::threadpool::test::test_multi_join ... ignored
test vendor::threadpool::test::test_name ... ignored
test vendor::threadpool::test::test_no_fun_or_joy ... ignored
test vendor::threadpool::test::test_recovery_from_subtask_panic ... ignored
test vendor::threadpool::test::test_repeate_join ... ignored
test vendor::threadpool::test::test_send ... ignored
test vendor::threadpool::test::test_send_shared_data ... ignored
test vendor::threadpool::test::test_set_num_threads_decreasing ... ignored
test vendor::threadpool::test::test_set_num_threads_increasing ... ignored
test vendor::threadpool::test::test_should_not_panic_on_drop_if_subtasks_panic_after_drop ... ignored
test vendor::threadpool::test::test_shrink ... ignored
test vendor::threadpool::test::test_sync_shared_data ... ignored
test vendor::threadpool::test::test_works ... ignored
test vendor::threadpool::test::test_zero_tasks_panic ... ignored
test os::common::tests::test_extent_merge ... ok

test result: ok. 3 passed; 0 failed; 21 ignored; 0 measured; 0 filtered out; finished in 0.00s

Running tests/common.rs (target/release/deps/common-5ed6983ca092d56e)

running 58 tests
test basic_help ... ok
test copy_all_dirs::test_with_parallel_block_driver ... ok
test copy_all_dirs::test_with_parallel_file_driver ... ok
test copy_all_dirs_rel::test_with_parallel_block_driver ... ok
test copy_all_dirs_rel::test_with_parallel_file_driver ... ok
test copy_dirs_files::test_with_parallel_block_driver ... ok
test copy_dirs_files::test_with_parallel_file_driver ... ok
test copy_dirs_overwrites::test_with_parallel_block_driver ... ok
test copy_dirs_overwrites::test_with_parallel_file_driver ... ok
test copy_dirs_overwrites_no_target_dir ... ok
test copy_empty_dir::test_with_parallel_block_driver ... ok
test copy_generated_tree::test_with_parallel_block_driver ... ignored
test copy_generated_tree::test_with_parallel_file_driver ... ignored
test copy_pattern_no_glob::test_with_parallel_block_driver ... ok
test copy_empty_dir::test_with_parallel_file_driver ... ok
test copy_pattern_no_glob::test_with_parallel_file_driver ... ok
test copy_with_glob::test_with_parallel_block_driver ... ok
test dest_file_exists::test_with_parallel_block_driver ... ok
test copy_with_glob::test_with_parallel_file_driver ... ok
test dest_file_exists::test_with_parallel_file_driver ... ok
test dest_file_exists_noclobber::test_with_parallel_block_driver ... ok
test dest_file_exists_noclobber::test_with_parallel_file_driver ... ok
test dest_file_exists_overwrites::test_with_parallel_block_driver ... ok
test dest_file_exists_overwrites::test_with_parallel_file_driver ... ok
test dest_file_in_dir_exists::test_with_parallel_block_driver ... ok
test dir_copy_containing_symlinks::test_with_parallel_block_driver ... ok
test dest_file_in_dir_exists::test_with_parallel_file_driver ... ok
test dir_copy_to_nonexistent_is_rename::test_with_parallel_block_driver ... ok
test dir_copy_containing_symlinks::test_with_parallel_file_driver ... ok
test dir_copy_with_hidden_dir::test_with_parallel_block_driver ... ok
test dir_copy_to_nonexistent_is_rename::test_with_parallel_file_driver ... ok
test dir_copy_with_hidden_dir::test_with_parallel_file_driver ... ok
test dir_overwrite_with_noclobber::test_with_parallel_block_driver ... ok
test dir_overwrite_with_noclobber::test_with_parallel_file_driver ... ok
test dir_with_gitignore::test_with_parallel_block_driver ... ok
test dir_with_gitignore::test_with_parallel_file_driver ... ok
test file_copy::test_with_parallel_block_driver ... ok
test file_copy::test_with_parallel_file_driver ... ok
test file_copy_multiple::test_with_parallel_block_driver ... ok
test file_copy_no_perms::test_with_parallel_block_driver ... ok
test file_copy_multiple::test_with_parallel_file_driver ... ok
test file_copy_no_perms::test_with_parallel_file_driver ... ok
test file_copy_perms::test_with_parallel_block_driver ... ok
test file_copy_perms::test_with_parallel_file_driver ... ok
test file_copy_rel::test_with_parallel_block_driver ... ok
test glob_pattern_error::test_with_parallel_block_driver ... ok
test file_copy_rel::test_with_parallel_file_driver ... ok
test no_args ... ok
test glob_pattern_error::test_with_parallel_file_driver ... ok
test same_file_no_overwrite::test_with_parallel_block_driver ... ok
test same_file_no_overwrite::test_with_parallel_file_driver ... ok
test source_missing::test_with_parallel_block_driver ... ok
test source_missing::test_with_parallel_file_driver ... ok
test source_missing_globbed::test_with_parallel_block_driver ... ok
test source_missing_globbed::test_with_parallel_file_driver ... ok
test source_same_as_dest::test_with_parallel_file_driver ... ok
test source_same_as_dest::test_with_parallel_block_driver ... ok
test util::test_hasher ... ok

test result: ok. 56 passed; 0 failed; 2 ignored; 0 measured; 0 filtered out; finished in 0.05s

Running tests/linux.rs (target/release/deps/linux-f36e08c5ed8f0175)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

Running tests/util.rs (target/release/deps/util-b0c79de23c2e1bbb)

running 1 test
test test_hasher ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
```

FreeBSD on UFS

This is using a "classic" filesystem just for comparison's sake.

Test output

```
$ cargo test --release

running 24 tests
test os::common::tests::test_copy_range_uspace_large ... ok
test os::common::tests::test_copy_bytes_uspace_large ... ok
test vendor::threadpool::test::test_active_count ... ignored
test vendor::threadpool::test::test_clone ... ignored
test vendor::threadpool::test::test_cloned_eq ... ignored
test vendor::threadpool::test::test_debug ... ignored
test vendor::threadpool::test::test_empty_pool ... ignored
test vendor::threadpool::test::test_join_wavesurfer ... ignored
test os::common::tests::test_extent_merge ... ok
test vendor::threadpool::test::test_massive_task_creation ... ignored
test vendor::threadpool::test::test_multi_join ... ignored
test vendor::threadpool::test::test_name ... ignored
test vendor::threadpool::test::test_no_fun_or_joy ... ignored
test vendor::threadpool::test::test_recovery_from_subtask_panic ... ignored
test vendor::threadpool::test::test_repeate_join ... ignored
test vendor::threadpool::test::test_send ... ignored
test vendor::threadpool::test::test_send_shared_data ... ignored
test vendor::threadpool::test::test_set_num_threads_decreasing ... ignored
test vendor::threadpool::test::test_set_num_threads_increasing ... ignored
test vendor::threadpool::test::test_should_not_panic_on_drop_if_subtasks_panic_after_drop ... ignored
test vendor::threadpool::test::test_shrink ... ignored
test vendor::threadpool::test::test_sync_shared_data ... ignored
test vendor::threadpool::test::test_works ... ignored
test vendor::threadpool::test::test_zero_tasks_panic ... ignored

test result: ok. 3 passed; 0 failed; 21 ignored; 0 measured; 0 filtered out; finished in 0.00s

running 58 tests
test basic_help ... ok
test copy_all_dirs::test_with_parallel_block_driver ... ok
test copy_all_dirs::test_with_parallel_file_driver ... ok
test copy_all_dirs_rel::test_with_parallel_block_driver ... ok
test copy_all_dirs_rel::test_with_parallel_file_driver ... ok
test copy_dirs_files::test_with_parallel_block_driver ... ok
test copy_dirs_files::test_with_parallel_file_driver ... ok
test copy_dirs_overwrites::test_with_parallel_block_driver ... ok
test copy_dirs_overwrites::test_with_parallel_file_driver ... ok
test copy_empty_dir::test_with_parallel_block_driver ... ok
test copy_dirs_overwrites_no_target_dir ... ok
test copy_generated_tree::test_with_parallel_block_driver ... ignored
test copy_generated_tree::test_with_parallel_file_driver ... ignored
test copy_pattern_no_glob::test_with_parallel_block_driver ... ok
test copy_empty_dir::test_with_parallel_file_driver ... ok
test copy_pattern_no_glob::test_with_parallel_file_driver ... ok
test copy_with_glob::test_with_parallel_block_driver ... ok
test dest_file_exists::test_with_parallel_block_driver ... ok
test copy_with_glob::test_with_parallel_file_driver ... ok
test dest_file_exists::test_with_parallel_file_driver ... ok
test dest_file_exists_noclobber::test_with_parallel_block_driver ... ok
test dest_file_exists_noclobber::test_with_parallel_file_driver ... ok
test dest_file_exists_overwrites::test_with_parallel_block_driver ... ok
test dest_file_in_dir_exists::test_with_parallel_block_driver ... ok
test dest_file_exists_overwrites::test_with_parallel_file_driver ... ok
test dest_file_in_dir_exists::test_with_parallel_file_driver ... ok
test dir_copy_containing_symlinks::test_with_parallel_block_driver ... ok
test dir_copy_containing_symlinks::test_with_parallel_file_driver ... ok
test dir_copy_to_nonexistent_is_rename::test_with_parallel_block_driver ... ok
test dir_copy_to_nonexistent_is_rename::test_with_parallel_file_driver ... ok
test dir_copy_with_hidden_dir::test_with_parallel_block_driver ... ok
test dir_copy_with_hidden_dir::test_with_parallel_file_driver ... ok
test dir_overwrite_with_noclobber::test_with_parallel_block_driver ... ok
test dir_overwrite_with_noclobber::test_with_parallel_file_driver ... ok
test dir_with_gitignore::test_with_parallel_block_driver ... ok
test file_copy::test_with_parallel_block_driver ... ok
test dir_with_gitignore::test_with_parallel_file_driver ... ok
test file_copy::test_with_parallel_file_driver ... ok
test file_copy_multiple::test_with_parallel_block_driver ... ok
test file_copy_no_perms::test_with_parallel_block_driver ... ok
test file_copy_multiple::test_with_parallel_file_driver ... ok
test file_copy_no_perms::test_with_parallel_file_driver ... ok
test file_copy_perms::test_with_parallel_block_driver ... ok
test file_copy_rel::test_with_parallel_block_driver ... ok
test file_copy_perms::test_with_parallel_file_driver ... ok
test glob_pattern_error::test_with_parallel_block_driver ... ok
test file_copy_rel::test_with_parallel_file_driver ... ok
test glob_pattern_error::test_with_parallel_file_driver ... ok
test no_args ... ok
test same_file_no_overwrite::test_with_parallel_block_driver ... ok
test same_file_no_overwrite::test_with_parallel_file_driver ... ok
test source_missing::test_with_parallel_block_driver ... ok
test source_missing::test_with_parallel_file_driver ... ok
test source_missing_globbed::test_with_parallel_block_driver ... ok
test source_missing_globbed::test_with_parallel_file_driver ... ok
test source_same_as_dest::test_with_parallel_block_driver ... ok
test source_same_as_dest::test_with_parallel_file_driver ... ok
test util::test_hasher ... ok

test result: ok. 56 passed; 0 failed; 2 ignored; 0 measured; 0 filtered out; finished in 0.09s

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

running 1 test
test test_hasher ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
```

If you don't have a ZFS-capable system at hand, I'll take a closer look at it later (although my Rust knowledge is quite rusty). Let's file a bug for now anyway.

tarka commented 1 year ago

Thanks for the report. I've just released v0.11.0, which should fix this. Can you test please?

hg commented 1 year ago

Unfortunately, now it fails with a different set of errors:

https://github.com/hg/xcp/actions/runs/5856336662/job/15875829283#step:11:137

I also tested it on two bare-metal hosts running Ubuntu 22.04 and Arch, with similar results.

Here is the commit if you're interested in testing more filesystems than just ext4:

https://github.com/hg/xcp/commit/00ec909a2cc573841b002e4df97ae43333f82de7

tarka commented 1 year ago

Thanks for that test config; I was going to look at implementing that today, so that's a huge time-saver.

Those tests assume that they're running on a standard Linux FS, not an unsupported one, so they should really fail at this point. What the tests need is a way to know which FS they are running on so they can adjust accordingly. Unfortunately, there isn't really a reliable way to detect this at runtime (see #24), but it could be added to the test config. I'll cherry-pick your testing into the tree and see what I can come up with.
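To illustrate the idea of telling the tests which FS they run on through configuration, here is a minimal sketch. It is not how xcp does it: the environment variable name `XCP_TEST_FS` and the helper `fs_supports_extent_tests` are hypothetical, and the only filesystem treated as lacking extent support is ZFS, because that is the one reported in this issue.

```rust
use std::env;

/// Hypothetical variable that a CI job could set to name the filesystem
/// under test (for example "ext4" or "zfs"). Not an actual xcp setting.
const FS_ENV_VAR: &str = "XCP_TEST_FS";

/// Decide whether extent-based tests should run on the named filesystem.
/// An unset value is treated as "standard Linux FS", so the tests run.
fn fs_supports_extent_tests(fs: Option<&str>) -> bool {
    match fs.map(|s| s.trim().to_ascii_lowercase()) {
        None => true,
        // Assumption: only ZFS is known (from this issue) to reject the call.
        Some(name) => name != "zfs",
    }
}

fn main() {
    let fs = env::var(FS_ENV_VAR).ok();
    let run = fs_supports_extent_tests(fs.as_deref());
    if run {
        println!("running extent tests");
    } else {
        println!("skipping extent tests on {}", fs.unwrap_or_default());
    }
}
```

A test could call such a helper at its start and return early when it reports `false`, which keeps the decision in the test configuration rather than in runtime detection.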

dsully commented 1 year ago

Does this also imply that xcp doesn't "support" ZFS?

tarka commented 1 year ago

Not currently. ZFS support is likely to be tricky, as ZFS frequently lacks support for otherwise standard system calls (e.g. fiemap in this case). The purpose of xcp (as far as there is one) is to see what we can do with the latest kernels and hardware, so there are likely to be occasional issues. I try to fall back to better-supported system calls, but clearly that needs some work at the moment.
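A minimal sketch of the fall-back pattern described above, using only the standard library. It is an illustration, not xcp's implementation: `with_fallback`, `extents_via_fiemap` and `extents_via_lseek` are hypothetical names, the two extent functions are stubs that return canned values, and the choice of an `lseek`-based fallback is an assumption. The constant 95 is the Linux value of `EOPNOTSUPP`, matching the `os error 95` in the failing test output.

```rust
use std::io;

/// Linux errno for "Operation not supported" (the `os error 95` reported above).
/// This value is Linux-specific; other platforms use different numbers.
const EOPNOTSUPP: i32 = 95;

/// True if the error indicates that the filesystem does not implement the call.
fn is_unsupported(err: &io::Error) -> bool {
    err.raw_os_error() == Some(EOPNOTSUPP) || err.kind() == io::ErrorKind::Unsupported
}

/// Run `primary`; if it fails as unsupported, run `fallback` instead.
/// Any other error from `primary` is returned unchanged.
fn with_fallback<T>(
    primary: impl FnOnce() -> io::Result<T>,
    fallback: impl FnOnce() -> io::Result<T>,
) -> io::Result<T> {
    match primary() {
        Err(e) if is_unsupported(&e) => fallback(),
        other => other,
    }
}

/// Hypothetical stub: pretends the fiemap call is rejected, as on ZFS.
fn extents_via_fiemap() -> io::Result<Vec<(u64, u64)>> {
    Err(io::Error::from_raw_os_error(EOPNOTSUPP))
}

/// Hypothetical stub: pretends a seek-based scan found one data range.
fn extents_via_lseek() -> io::Result<Vec<(u64, u64)>> {
    Ok(vec![(0, 4096)])
}

fn main() -> io::Result<()> {
    let extents = with_fallback(extents_via_fiemap, extents_via_lseek)?;
    println!("{:?}", extents);
    Ok(())
}
```

Only the "unsupported" case triggers the fallback, so genuine failures such as a missing file still surface as errors.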

(It's worth noting here that xcp itself is unsupported, and is stated as being not-for-production in the readme.)

tarka commented 11 months ago

@dsully xcp is now tested against ZFS, so it should work going forward. Thanks to @hg for the test configuration.

dsully commented 11 months ago

Great, thank you!