TraceMachina / nativelink

NativeLink is an open source high-performance build cache and remote execution server, compatible with Bazel, Buck2, Reclient, and other RBE-compatible build systems. It offers drastically faster builds, reduced test flakiness, and support for specialized hardware.
https://nativelink.com
Apache License 2.0

file_continues_to_stream_on_content_replace_test fails #1318

Open aaronmondal opened 2 months ago

aaronmondal commented 2 months ago

Observed in https://github.com/TraceMachina/nativelink/actions/runs/10699751146/job/29662010987?pr=1317

test file_continues_to_stream_on_content_replace_test ... FAILED
test file_gets_cleans_up_on_cache_eviction ... ok
test get_file_size_uses_block_size ... ok
test get_part_is_zero_digest ... ok
test get_part_timeout_test ... ok
test has_with_results_on_zero_digests ... ok
test oldest_entry_evicted_with_access_times_loaded_from_disk ... ok
  2024-09-04T10:35:22.539662Z ERROR nativelink_store::filesystem_store: Failed to rename file, err: Error { code: NotFound, messages: ["The system cannot find the path specified. (os error 3)", "Failed to rename temp file to final path \"C:\\\\Users\\\\RUNNER~1\\\\AppData\\\\Local\\\\Temp\\\\/5713743219120593976/content_path/0123456789abcdef000000000000000000010000000000000123456789abcdef-10\""] }, from_path: "C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\/3946788439344171467/temp_path/0123456789abcdef000000000000000000010000000000001600000000000000-10", final_path: "C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\/5713743219120593976/content_path/0123456789abcdef000000000000000000010000000000000123456789abcdef-10"
    at nativelink-store\src\filesystem_store.rs:702
    in nativelink_store::filesystem_store::filesystem_store_emplace_file

test rename_on_insert_fails_due_to_filesystem_error_proper_cleanup_happens ... ok
test temp_files_get_deleted_on_replace_test ... ok
  2024-09-04T10:35:22.617991Z ERROR nativelink_store::filesystem_store: Failed to delete file, file_path: "C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\/14126721169268000047/temp_path/0123456789abcdef000000000000000000010000000000001c00000000000000-10", err: Error { code: Internal, messages: ["background task failed: JoinError::Cancelled(Id(1343))", "Failed to remove file \"C:\\\\Users\\\\RUNNER~1\\\\AppData\\\\Local\\\\Temp\\\\/14126721169268000047/temp_path/0123456789abcdef000000000000000000010000000000001c00000000000000-10\""] }
    at nativelink-store\src\filesystem_store.rs:124
    in nativelink_store::filesystem_store::filesystem_delete_file

test update_file_future_drops_before_rename ... ok
test update_with_whole_file_closes_file ... ok
test update_with_whole_file_slow_path_when_low_file_descriptors ... ok
test valid_results_after_shutdown_test ... ok

failures:

---- file_continues_to_stream_on_content_replace_test stdout ----
thread 'file_continues_to_stream_on_content_replace_test' panicked at nativelink-store\tests\filesystem_store_test.rs:474:5:
assertion failed: `(left == right)`: Expected file content to match

Diff < left / right > :
<b"123443210"
>[
>    49,
>    50,
>    51,
>    52,
>    53,
>    54,
>    55,
>    56,
>    57,
>]

aaronmondal commented 2 months ago

Observed on Linux as well: https://github.com/TraceMachina/nativelink/actions/runs/10648367662/job/29517335787?pr=1276

FAIL: //nativelink-store:integration_tests/filesystem_store_test (see /home/runner/.bazel/execroot/_main/bazel-out/k8-fastbuild/testlogs/nativelink-store/integration_tests/filesystem_store_test/test.log)
INFO: From Testing //nativelink-store:integration_tests/filesystem_store_test:
==================== Test output for //nativelink-store:integration_tests/filesystem_store_test:

running 19 tests
  2024-08-31T21:28:22.932260Z ERROR nativelink_store::filesystem_store: Failed to touch file, err: Error { code: NotFound, messages: ["No such file or directory (os error 2)", "Failed to touch file in filesystem store \"/home/runner/.bazel/sandbox/linux-sandbox/689/execroot/_main/_tmp/4570b71ef2441e1548bbbb5c539c5a14/14056172379962841113/content_path/0123456789abcdef000000000000000000010000000000000123456789abcdef-10\""] }
    at nativelink-store/src/filesystem_store.rs:342

  2024-08-31T21:28:22.933086Z  WARN nativelink_store::filesystem_store: Failed to rename file, digest: DigestInfo { size_bytes: 10, hash: "0123456789abcdef000000000000000000010000000000000123456789abcdef" }, from_path: "/home/runner/.bazel/sandbox/linux-sandbox/689/execroot/_main/_tmp/4570b71ef2441e1548bbbb5c539c5a14/14056172379962841113/content_path/0123456789abcdef000000000000000000010000000000000123456789abcdef-10", to_path: "/home/runner/.bazel/sandbox/linux-sandbox/689/execroot/_main/_tmp/4570b71ef2441e1548bbbb5c539c5a14/10083340587202927996/temp_path/0123456789abcdef000000000000000000010000000000000100000000000000-10", err: Error { code: NotFound, messages: ["No such file or directory (os error 2)"] }
    at nativelink-store/src/filesystem_store.rs:369

test deleted_file_removed_from_store ... ok
test eviction_drops_file_test ... ok
test digest_contents_replaced_continues_using_old_data ... ok
test atime_updates_on_get_part_test ... ok
test eviction_on_insert_calls_unref_once ... ok
test file_continues_to_stream_on_content_replace_test ... FAILED
test file_gets_cleans_up_on_cache_eviction ... FAILED
test get_part_timeout_test ... ok
test oldest_entry_evicted_with_access_times_loaded_from_disk ... ok
test get_part_is_zero_digest ... ok
test has_with_results_on_zero_digests ... ok
test get_file_size_uses_block_size ... ok
  2024-08-31T21:28:25.397575Z ERROR nativelink_store::filesystem_store: Failed to rename file, err: Error { code: NotFound, messages: ["No such file or directory (os error 2)", "Failed to rename temp file to final path \"/home/runner/.bazel/sandbox/linux-sandbox/689/execroot/_main/_tmp/4570b71ef2441e1548bbbb5c539c5a14/5840616205448497430/content_path/0123456789abcdef000000000000000000010000000000000123456789abcdef-10\""] }, from_path: "/home/runner/.bazel/sandbox/linux-sandbox/689/execroot/_main/_tmp/4570b71ef2441e1548bbbb5c539c5a14/10570750018958606178/temp_path/0123456789abcdef000000000000000000010000000000001600000000000000-10", final_path: "/home/runner/.bazel/sandbox/linux-sandbox/689/execroot/_main/_tmp/4570b71ef2441e1548bbbb5c539c5a14/5840616205448497430/content_path/0123456789abcdef000000000000000000010000000000000123456789abcdef-10"
    at nativelink-store/src/filesystem_store.rs:702
    in nativelink_store::filesystem_store::filesystem_store_emplace_file

test rename_on_insert_fails_due_to_filesystem_error_proper_cleanup_happens ... ok
test temp_files_get_deleted_on_replace_test ... ok
test update_file_future_drops_before_rename ... ok
test update_with_whole_file_closes_file ... ok
test update_with_whole_file_slow_path_when_low_file_descriptors ... ok
test update_with_whole_file_uses_same_inode ... ok
test valid_results_after_shutdown_test ... ok

failures:

---- file_continues_to_stream_on_content_replace_test stdout ----
thread 'file_continues_to_stream_on_content_replace_test' panicked at nativelink-store/tests/filesystem_store_test.rs:474:5:
assertion failed: `(left == right)`: Expected file content to match

Diff < left / right > :
<b"123443210"
>[
>    49,
>    50,
>    51,
>    52,
>    53,
>    54,
>    55,
>    56,
>    57,
>]

note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- file_gets_cleans_up_on_cache_eviction stdout ----
Error: Error { code: Internal, messages: ["Sender dropped before sending EOF", "During first read of buf_channel::take()", "Error reading remaining bytes"] }

failures:
    file_continues_to_stream_on_content_replace_test
    file_gets_cleans_up_on_cache_eviction

test result: FAILED. 17 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.88s
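
For readers comparing the two sides of the assertion diff: the left side is printed as a byte-string literal and the right side as a list of decimal byte values, which makes them hard to compare by eye. The sketch below decodes both and finds the first position where they differ. The helper names `decode` and `first_mismatch` are hypothetical and are not part of nativelink; the byte values are copied from the logs above. The log does not say which side is the expected value, so the sketch makes no assumption about that.

```rust
/// Hypothetical helper: render raw bytes as text (lossy for non-UTF-8 input).
fn decode(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// Hypothetical helper: index of the first byte that differs, if any.
/// Only the overlapping prefix of the two slices is compared.
fn first_mismatch(left: &[u8], right: &[u8]) -> Option<usize> {
    left.iter().zip(right.iter()).position(|(l, r)| l != r)
}

fn main() {
    // Left side of the diff, as printed in the panic message.
    let left: &[u8] = b"123443210";
    // Right side of the diff, as printed in the panic message.
    let right: [u8; 9] = [49, 50, 51, 52, 53, 54, 55, 56, 57];

    println!("left  = {}", decode(left));
    println!("right = {}", decode(&right));
    println!("first mismatch at index {:?}", first_mismatch(left, &right));
}
```

Decoded, the right side is `123456789` and the left side is `123443210`; the two agree on the first four bytes and diverge at index 4.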