oxidecomputer / omicron


test failure: test_expunge_timeseries_by_name_replicated #7051

Open davepacheco opened 1 week ago

davepacheco commented 1 week ago

On branch dap/drafts/dropshot-update commit fd95b560a8d991208798939800abedfb291bcc62, I saw this test failure:

-------
        FAIL [  19.261s] oximeter-db client::tests::test_expunge_timeseries_by_name_replicated

--- STDOUT:              oximeter-db client::tests::test_expunge_timeseries_by_name_replicated ---

running 1 test
test client::tests::test_expunge_timeseries_by_name_replicated ... FAILED

failures:

failures:
    client::tests::test_expunge_timeseries_by_name_replicated

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 227 filtered out; finished in 19.24s

--- STDERR:              oximeter-db client::tests::test_expunge_timeseries_by_name_replicated ---
log file: /dangerzone/omicron_tmp/oximeter_db-171441753ad1a41d-test_expunge_timeseries_by_name_replicated.23516.0.log
note: configured to log to "/dangerzone/omicron_tmp/oximeter_db-171441753ad1a41d-test_expunge_timeseries_by_name_replicated.23516.0.log"
thread 'client::tests::test_expunge_timeseries_by_name_replicated' panicked at oximeter/db/src/client/mod.rs:5013:18:
failed to get count of timeseries: DatabaseUnavailable("error sending request for url (http://[::1]:8123/?output_format_json_quote_64bit_integers=0&wait_end_of_query=1)")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
WARN: dropped ClickHouse process without cleaning it up first (there may still be a child process running (PID 23520) and a temporary directory leaked, /dangerzone/omicron_tmp/oximeter_db-171441753ad1a41d-test_expunge_timeseries_by_name_replicated.23516.4-clickhouse-3snymO)
failed to clean up ClickHouse data dir:
- /dangerzone/omicron_tmp/oximeter_db-171441753ad1a41d-test_expunge_timeseries_by_name_replicated.23516.4-clickhouse-3snymO: File exists (os error 17)
WARN: dropped ClickHouse process without cleaning it up first (there may still be a child process running (PID 23521) and a temporary directory leaked, /dangerzone/omicron_tmp/oximeter_db-171441753ad1a41d-test_expunge_timeseries_by_name_replicated.23516.5-clickhouse-HzQ6Gz)
failed to clean up ClickHouse data dir:
- /dangerzone/omicron_tmp/oximeter_db-171441753ad1a41d-test_expunge_timeseries_by_name_replicated.23516.5-clickhouse-HzQ6Gz: File exists (os error 17)
WARN: dropped ClickHouse process without cleaning it up first (there may still be a child process running (PID 23517) and a temporary directory leaked, /dangerzone/omicron_tmp/oximeter_db-171441753ad1a41d-test_expunge_timeseries_by_name_replicated.23516.1-clickhouse-TbhHPx)
failed to clean up ClickHouse data dir:
- /dangerzone/omicron_tmp/oximeter_db-171441753ad1a41d-test_expunge_timeseries_by_name_replicated.23516.1-clickhouse-TbhHPx: File exists (os error 17)
WARN: dropped ClickHouse process without cleaning it up first (there may still be a child process running (PID 23518) and a temporary directory leaked, /dangerzone/omicron_tmp/oximeter_db-171441753ad1a41d-test_expunge_timeseries_by_name_replicated.23516.2-clickhouse-rIpciV)
failed to clean up ClickHouse data dir:
- /dangerzone/omicron_tmp/oximeter_db-171441753ad1a41d-test_expunge_timeseries_by_name_replicated.23516.2-clickhouse-rIpciV: File exists (os error 17)
WARN: dropped ClickHouse process without cleaning it up first (there may still be a child process running (PID 23519) and a temporary directory leaked, /dangerzone/omicron_tmp/oximeter_db-171441753ad1a41d-test_expunge_timeseries_by_name_replicated.23516.3-clickhouse-G6GNZO)
failed to clean up ClickHouse data dir:
- /dangerzone/omicron_tmp/oximeter_db-171441753ad1a41d-test_expunge_timeseries_by_name_replicated.23516.3-clickhouse-G6GNZO: File exists (os error 17)

This is probably annoying to reproduce directly from that commit because it contains a local patch to dropshot, but I'd be surprised if this change has anything to do with the failure.

davepacheco commented 1 week ago

Here's the log file: oximeter_db-171441753ad1a41d-test_expunge_timeseries_by_name_replicated.23516.0.log.gz

davepacheco commented 1 week ago

I ran the full test suite a few times and saw this failure only once.