I noticed some CI test failures, and I was able to reproduce one locally. (The CI logs were unhelpful because we don't set RUST_LOG=info on cargo test, and the failure happened to be during an in-process integration test rather than a containerized one, which logs by default)
Logs
---- janus::janus_in_process_histogram_4_buckets stdout ----
2023-11-16T17:19:39.746918Z INFO testcontainers::core::logs: Found message after comparing 22 lines
at /home/david/.cargo/registry/src/index.crates.io-6f17d22bba15001f/testcontainers-0.15.0/src/core/logs.rs:72 on ThreadId(11)
2023-11-16T17:19:39.914748Z INFO janus_aggregator::binaries::aggregator: Running aggregator, aggregator_bound_address: 127.0.0.1:36885
at aggregator/src/binaries/aggregator.rs:141 on ThreadId(9)
2023-11-16T17:19:40.059033Z INFO janus_aggregator::aggregator::aggregation_job_creator: Starting job creation worker, task_id: YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA
at aggregator/src/aggregator/aggregation_job_creator.rs:193 on ThreadId(4)
in janus_aggregator::aggregator::aggregation_job_creator::update_tasks
2023-11-16T17:19:40.066571Z INFO janus_aggregator::binaries::aggregator: Running aggregator, aggregator_bound_address: 127.0.0.1:37021
at aggregator/src/binaries/aggregator.rs:141 on ThreadId(3)
2023-11-16T17:19:40.069259Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/hpke_config", method: GET
2023-11-16T17:19:40.071421Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/hpke_config", method: GET
2023-11-16T17:19:40.200284Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:40.326959Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:40.452924Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:40.566016Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:40.692635Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:40.811502Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:40.929997Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(3)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:41.049753Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:41.101860Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/aggregation_jobs/:aggregation_job_id", method: PUT
2023-11-16T17:19:41.102156Z ERROR janus_aggregator_core::datastore: error: DB error: db error: ERROR: could not serialize access due to concurrent update
at aggregator_core/src/datastore.rs:3904 on ThreadId(9)
in janus_aggregator_core::datastore::put_batchwithbatch: Batch { task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA), batch_identifier: Interval { start: Time(1700150400), duration: Duration(28800) }, state: Open, outstanding_aggregation_jobs: 0, client_timestamp_interval: Interval { start: Time(1700150400), duration: Duration(1) } }
in janus_aggregator::aggregator::handle_aggregate_initwithbatch_aggregation_shard_count: 32, aggregation_job_id: AggregationJobId(y6WjitlZWzgNoIx8hRPIUg), task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/aggregation_jobs/:aggregation_job_id", method: PUT
2023-11-16T17:19:41.105126Z ERROR janus_aggregator_core::datastore: error: not found in datastore
at aggregator_core/src/datastore.rs:1763 on ThreadId(7)
in janus_aggregator_core::datastore::release_aggregation_jobwithlease: Lease { leased: AcquiredAggregationJob { task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA), aggregation_job_id: AggregationJobId(-NwfB10EcZ_xnqEwowRnFw), query_type: TimeInterval, vdaf: Prio3Histogram { length: 4, chunk_length: 2 } }, lease_expiry_time: 2023-11-16T17:19:51, lease_token: LeaseToken(7ewx9oDxk6znUbJ1YsEkng), lease_attempts: 1 }
in janus_aggregator::binary_utils::job_driver::Job stepperwithacquired_job: AcquiredAggregationJob { task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA), aggregation_job_id: AggregationJobId(-NwfB10EcZ_xnqEwowRnFw), query_type: TimeInterval, vdaf: Prio3Histogram { length: 4, chunk_length: 2 } }
2023-11-16T17:19:41.105897Z ERROR janus_aggregator::binary_utils::job_driver: Couldn't step job, error: not found in datastore
at aggregator/src/binary_utils/job_driver.rs:199 on ThreadId(7)
in janus_aggregator::binary_utils::job_driver::Job stepperwithacquired_job: AcquiredAggregationJob { task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA), aggregation_job_id: AggregationJobId(-NwfB10EcZ_xnqEwowRnFw), query_type: TimeInterval, vdaf: Prio3Histogram { length: 4, chunk_length: 2 } }
2023-11-16T17:19:41.115472Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/aggregation_jobs/:aggregation_job_id", method: PUT
2023-11-16T17:19:41.165331Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:41.284147Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:41.403399Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:41.521914Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:41.640781Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:41.760185Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:41.879415Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:41.998731Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:42.121150Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:42.130343Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/aggregation_jobs/:aggregation_job_id", method: PUT
2023-11-16T17:19:42.232152Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:42.351672Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:42.471498Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:42.590611Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:42.708876Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:42.828307Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:42.946067Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:43.063733Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:43.151129Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/aggregation_jobs/:aggregation_job_id", method: PUT
2023-11-16T17:19:43.182798Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:43.300721Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:43.420302Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(9)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:43.536479Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:43.654051Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:43.772044Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:43.891549Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:44.007273Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:44.123091Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:44.169749Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/aggregation_jobs/:aggregation_job_id", method: PUT
2023-11-16T17:19:44.240771Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:44.359518Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:44.476760Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:44.594862Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:44.713263Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:44.831002Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:44.949526Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(5)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:45.067587Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:45.171550Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/aggregation_jobs/:aggregation_job_id", method: PUT
2023-11-16T17:19:45.182077Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:45.292334Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(5)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:45.407990Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:45.526158Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(5)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/reports", method: PUT
2023-11-16T17:19:45.543657Z INFO janus_aggregator_core: Finished handling request, status: "Created"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: PUT
2023-11-16T17:19:45.554439Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:46.180187Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(5)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:46.188353Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(5)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/aggregation_jobs/:aggregation_job_id", method: PUT
2023-11-16T17:19:46.924042Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:47.541983Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:47.991607Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:48.267411Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:48.717190Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(4)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:49.140564Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(5)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:49.539732Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:50.190662Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(5)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:50.853313Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(5)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:51.184941Z ERROR janus_aggregator_core::datastore: error: already in datastore
at aggregator_core/src/datastore.rs:1804 on ThreadId(5)
in janus_aggregator_core::datastore::put_aggregation_jobwithaggregation_job: AggregationJob { task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA), aggregation_job_id: AggregationJobId(-NwfB10EcZ_xnqEwowRnFw), batch_id: (), client_timestamp_interval: Interval { start: Time(1700150400), duration: Duration(1) }, state: Finished, step: AggregationJobStep(0), last_request_hash: Some([172, 44, 166, 244, 129, 119, 143, 99, 59, 86, 83, 168, 237, 112, 135, 58, 162, 178, 124, 241, 193, 11, 8, 25, 181, 165, 147, 26, 63, 148, 148, 85]) }
in janus_aggregator::aggregator::handle_aggregate_initwithbatch_aggregation_shard_count: 32, aggregation_job_id: AggregationJobId(-NwfB10EcZ_xnqEwowRnFw), task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/aggregation_jobs/:aggregation_job_id", method: PUT
2023-11-16T17:19:51.186896Z INFO janus_aggregator_core: Finished handling request, status: "OK"
at aggregator_core/src/lib.rs:82 on ThreadId(5)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/aggregation_jobs/:aggregation_job_id", method: PUT
2023-11-16T17:19:51.539102Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(7)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:52.119156Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(5)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:52.850883Z INFO janus_aggregator_core: Finished handling request, status: "Accepted"
at aggregator_core/src/lib.rs:82 on ThreadId(5)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/collection_jobs/:collection_job_id", method: POST
2023-11-16T17:19:53.176597Z ERROR janus_aggregator::aggregator: error: task YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA: batch misalignment (own checksum = ReportIdChecksum([55, 52, 200, 68, 44, 251, 124, 146, 5, 146, 139, 52, 201, 200, 185, 207, 142, 196, 153, 202, 219, 239, 182, 103, 51, 196, 47, 203, 6, 115, 13, 36]), own report count = 47, peer checksum = ReportIdChecksum([73, 58, 169, 242, 154, 216, 202, 102, 118, 229, 97, 37, 93, 200, 97, 188, 42, 175, 175, 27, 162, 200, 117, 173, 29, 116, 22, 87, 82, 220, 255, 131]), peer report count = 46)
at aggregator/src/aggregator.rs:2786 on ThreadId(5)
in janus_aggregator::aggregator::handle_aggregate_sharewithbatch_aggregation_shard_count: 32, collector_hpke_config: HpkeConfig { id: HpkeConfigId(233), kem_id: X25519HkdfSha256, kdf_id: HkdfSha256, aead_id: Aes128Gcm, public_key: HpkePublicKey(HVD0Fd2w0eroNalQzjFqU83ST26o2VFcuAXWgaGpzUg) }, task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/aggregate_shares", method: POST
2023-11-16T17:19:53.176689Z WARN janus_aggregator::aggregator::http_handlers: Error handling endpoint, error_code: "batch_mismatch", error: BatchMismatch(BatchMismatch { task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA), own_checksum: ReportIdChecksum([55, 52, 200, 68, 44, 251, 124, 146, 5, 146, 139, 52, 201, 200, 185, 207, 142, 196, 153, 202, 219, 239, 182, 103, 51, 196, 47, 203, 6, 115, 13, 36]), own_report_count: 47, peer_checksum: ReportIdChecksum([73, 58, 169, 242, 154, 216, 202, 102, 118, 229, 97, 37, 93, 200, 97, 188, 42, 175, 175, 27, 162, 200, 117, 173, 29, 116, 22, 87, 82, 220, 255, 131]), peer_report_count: 46 })
at aggregator/src/aggregator/http_handlers.rs:43 on ThreadId(5)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/aggregate_shares", method: POST
2023-11-16T17:19:53.176810Z INFO janus_aggregator_core: Finished handling request, status: "Bad Request"
at aggregator_core/src/lib.rs:82 on ThreadId(5)
in janus_aggregator_core::endpointwithroute: "/tasks/:task_id/aggregate_shares", method: POST
2023-11-16T17:19:53.177243Z ERROR janus_aggregator::aggregator: error: HTTP response status 400 Bad Request - Leader and helper disagree on reports aggregated in a batch.
at aggregator/src/aggregator.rs:3078 on ThreadId(5)
in janus_aggregator::aggregator::send_request_to_helperwithmethod: POST, route_label: "tasks/:task_id/aggregate_shares", content_type: "application/dap-aggregate-share-req", url: http://127.0.0.1:36885/tasks/YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA/aggregate_shares
in janus_aggregator::aggregator::collection_job_driver::step_collection_jobwithlease: Lease { leased: AcquiredCollectionJob { task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA), collection_job_id: CollectionId(U7OO9tbwhcYwxYy7mqcHdQ), query_type: TimeInterval, vdaf: Prio3Histogram { length: 4, chunk_length: 2 } }, lease_expiry_time: 2023-11-16T17:20:03, lease_token: LeaseToken(makpgu7jnBl4LcZcQY0AIg), lease_attempts: 1 }
in janus_aggregator::binary_utils::job_driver::Job stepperwithacquired_job: AcquiredCollectionJob { task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA), collection_job_id: CollectionId(U7OO9tbwhcYwxYy7mqcHdQ), query_type: TimeInterval, vdaf: Prio3Histogram { length: 4, chunk_length: 2 } }
2023-11-16T17:19:53.177294Z ERROR janus_aggregator::aggregator::collection_job_driver: error: HTTP response status 400 Bad Request - Leader and helper disagree on reports aggregated in a batch.
at aggregator/src/aggregator/collection_job_driver.rs:77 on ThreadId(5)
in janus_aggregator::aggregator::collection_job_driver::step_collection_jobwithlease: Lease { leased: AcquiredCollectionJob { task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA), collection_job_id: CollectionId(U7OO9tbwhcYwxYy7mqcHdQ), query_type: TimeInterval, vdaf: Prio3Histogram { length: 4, chunk_length: 2 } }, lease_expiry_time: 2023-11-16T17:20:03, lease_token: LeaseToken(makpgu7jnBl4LcZcQY0AIg), lease_attempts: 1 }
in janus_aggregator::binary_utils::job_driver::Job stepperwithacquired_job: AcquiredCollectionJob { task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA), collection_job_id: CollectionId(U7OO9tbwhcYwxYy7mqcHdQ), query_type: TimeInterval, vdaf: Prio3Histogram { length: 4, chunk_length: 2 } }
2023-11-16T17:19:53.177318Z ERROR janus_aggregator::binary_utils::job_driver: Couldn't step job, error: Http(HttpErrorResponse { problem_details: HttpApiProblem { type_url: Some("urn:ietf:params:ppm:dap:error:batchMismatch"), status: Some(400), title: Some("Leader and helper disagree on reports aggregated in a batch."), detail: None, instance: None, additional_fields: {"taskid": String("YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA")} }, dap_problem_type: Some(BatchMismatch) })
at aggregator/src/binary_utils/job_driver.rs:199 on ThreadId(5)
in janus_aggregator::binary_utils::job_driver::Job stepperwithacquired_job: AcquiredCollectionJob { task_id: TaskId(YwgbExQR41vvA1vY8mGTDPJW8jhSES0tLWWiC9yPZdA), collection_job_id: CollectionId(U7OO9tbwhcYwxYy7mqcHdQ), query_type: TimeInterval, vdaf: Prio3Histogram { length: 4, chunk_length: 2 } }
I noticed some CI test failures, and I was able to reproduce one locally. (The CI logs were unhelpful because we don't set
RUST_LOG=info
oncargo test
, and the failure happened to be during an in-process integration test rather than a containerized one, which logs by default)Logs