kubukoz opened this issue 1 year ago (status: Open)
Forgot to post the actual question :D
Do you have any hints as to why this could be happening or what I can do to alleviate the issue?
That's interesting. I just tried an 8 GB NAR and it uploaded fine. How busy was the server at the time?
Just handling this one client. It was 10 jobs, though...
It seems like after some retries we got it to pass.
I got the same errors with sqlite:
2023-02-12T20:10:00.892876Z ERROR attic_server::error: Database error: Failed to acquire connection from pool
0: tokio::task::runtime.spawn
with kind=task task.name= task.id=2362343 loc.file="server/src/api/v1/upload_path.rs" loc.line=406 loc.col=13 at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tokio-1.23.0/src/util/trace.rs:16
1: attic_server::api::v1::upload_path::upload_path
at server/src/api/v1/upload_path.rs:115
2: tower_http::trace::make_span::request
with method=PUT uri=/_api/v1/upload-path version=HTTP/1.0
at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tower-http-0.3.5/src/trace/make_span.rs:116
3: tokio::task::runtime.spawn with kind=task task.name= task.id=2348982 loc.file="/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/hyper-0.14.23/src/common/exec.rs" loc.line=49 loc.col=21
at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tokio-1.23.0/src/util/trace.rs:16
2023-02-12T20:10:00.892951Z ERROR tower_http::trace::on_failure: response failed classification=Status code: 500 Internal Server Error latency=451119 ms
I see 100% usage of my SSD (the one the sqlite database lives on, not the one the data is stored on) when uploading. Uploads take a long time, but as far as I can see the network part is short: network usage spikes and then idles, while the SSD stays under load. Some of the (bigger) uploads fail with the above error.
I switched to postgres and the error disappeared.
I guess the sqlite database is not suitable for more than just a few tests.
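For what it's worth, "Failed to acquire connection from pool" is exactly what you get when every connection in a fixed-size pool stays checked out (here, because SQLite writes are disk-bound) until the acquire timeout fires. Below is a minimal standalone sqlx sketch, not taken from attic's code; the pool size, timeout, database URL, and Cargo dependencies are made-up values chosen only to reproduce the same error class:

```rust
// Minimal sketch, not attic code: a tiny pool whose single connection is held
// by a long-running task, so the next acquire times out the same way the
// server logs "Failed to acquire connection from pool".
//
// Assumed Cargo.toml deps (illustrative):
//   tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
//   sqlx  = { version = "0.7", features = ["runtime-tokio", "sqlite"] }
use std::time::Duration;

use sqlx::sqlite::SqlitePoolOptions;

#[tokio::main]
async fn main() -> Result<(), sqlx::Error> {
    let pool = SqlitePoolOptions::new()
        .max_connections(1)                      // deliberately tiny pool
        .acquire_timeout(Duration::from_secs(1)) // give up quickly for the demo
        .connect("sqlite::memory:")
        .await?;

    // Simulate a slow, disk-bound upload holding the only connection.
    let _held = pool.acquire().await?;

    // A second request now cannot get a connection before the timeout and
    // fails with PoolTimedOut -- the same failure class as in the logs above.
    match pool.acquire().await {
        Err(sqlx::Error::PoolTimedOut) => eprintln!("pool acquire timed out"),
        other => eprintln!("unexpected result: {:?}", other.map(|_| ())),
    }

    Ok(())
}
```

Since SQLite serializes writes on a single file, long uploads keep connections busy much longer than Postgres would, which presumably is why switching databases makes the error go away in the reports above.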
I encountered the same errors; similarly, switching to postgres seems to fix it:
Jun 05 14:10:35 attic atticd[2788]: 2023-06-05T12:10:35.369673Z ERROR attic_server::error: Database error: Failed to acquire connection from pool
Jun 05 14:10:35 attic atticd[2788]: 0: tokio::task::runtime.spawn
Jun 05 14:10:35 attic atticd[2788]: with kind=task task.name= task.id=35784 loc.file="server/src/api/v1/upload_path.rs" loc.line=408 loc.col=13
Jun 05 14:10:35 attic atticd[2788]: at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tokio-1.28.2/src/util/trace.rs:16
Jun 05 14:10:35 attic atticd[2788]: 1: attic_server::api::v1::upload_path::upload_path
Jun 05 14:10:35 attic atticd[2788]: at server/src/api/v1/upload_path.rs:117
Jun 05 14:10:35 attic atticd[2788]: 2: tower_http::trace::make_span::request
Jun 05 14:10:35 attic atticd[2788]: with method=PUT uri=/_api/v1/upload-path version=HTTP/1.1
Jun 05 14:10:35 attic atticd[2788]: at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tower-http-0.4.0/src/trace/make_span.rs:116
Jun 05 14:10:35 attic atticd[2788]: 3: tokio::task::runtime.spawn
Jun 05 14:10:35 attic atticd[2788]: with kind=task task.name= task.id=14446 loc.file="/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/hyper-0.14.26>
Jun 05 14:10:35 attic atticd[2788]: at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tokio-1.28.2/src/util/trace.rs:16
Jun 05 14:10:35 attic atticd[2788]: 2023-06-05T12:10:35.370482Z ERROR tower_http::trace::on_failure: response failed classification=Status code: 500 Internal Server Error latency=413940 ms
Jun 05 14:13:44 attic atticd[2788]: 2023-06-05T12:13:44.420429Z ERROR attic_server::error: Database error: Failed to acquire connection from pool
Jun 05 14:13:44 attic atticd[2788]: 0: tokio::task::runtime.spawn
Jun 05 14:13:44 attic atticd[2788]: with kind=task task.name= task.id=35771 loc.file="server/src/api/v1/upload_path.rs" loc.line=408 loc.col=13
Jun 05 14:13:44 attic atticd[2788]: at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tokio-1.28.2/src/util/trace.rs:16
Jun 05 14:13:44 attic atticd[2788]: 1: attic_server::api::v1::upload_path::upload_path
Jun 05 14:13:44 attic atticd[2788]: at server/src/api/v1/upload_path.rs:117
Jun 05 14:13:44 attic atticd[2788]: 2: tower_http::trace::make_span::request
Jun 05 14:13:44 attic atticd[2788]: with method=PUT uri=/_api/v1/upload-path version=HTTP/1.1
Jun 05 14:13:44 attic atticd[2788]: at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tower-http-0.4.0/src/trace/make_span.rs:116
Jun 05 14:13:44 attic atticd[2788]: 3: tokio::task::runtime.spawn
Jun 05 14:13:44 attic atticd[2788]: with kind=task task.name= task.id=30749 loc.file="/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/hyper-0.14.26>
Jun 05 14:13:44 attic atticd[2788]: at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tokio-1.28.2/src/util/trace.rs:16
Jun 05 14:13:44 attic atticd[2788]: 2023-06-05T12:13:44.420936Z ERROR tower_http::trace::on_failure: response failed classification=Status code: 500 Internal Server Error latency=339846 ms
Jun 05 14:30:40 attic atticd[2788]: 2023-06-05T12:30:40.187559Z ERROR attic_server::error: Database error: Failed to acquire connection from pool
Jun 05 14:30:40 attic atticd[2788]: 0: tokio::task::runtime.spawn
Jun 05 14:30:40 attic atticd[2788]: with kind=task task.name= task.id=76255 loc.file="server/src/api/v1/upload_path.rs" loc.line=408 loc.col=13
Jun 05 14:30:40 attic atticd[2788]: at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tokio-1.28.2/src/util/trace.rs:16
Jun 05 14:30:40 attic atticd[2788]: 1: attic_server::api::v1::upload_path::upload_path
Jun 05 14:30:40 attic atticd[2788]: at server/src/api/v1/upload_path.rs:117
Jun 05 14:30:40 attic atticd[2788]: 2: tower_http::trace::make_span::request
Jun 05 14:30:40 attic atticd[2788]: with method=PUT uri=/_api/v1/upload-path version=HTTP/1.1
Jun 05 14:30:40 attic atticd[2788]: at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tower-http-0.4.0/src/trace/make_span.rs:116
Jun 05 14:30:40 attic atticd[2788]: 3: tokio::task::runtime.spawn
Jun 05 14:30:40 attic atticd[2788]: with kind=task task.name= task.id=61183 loc.file="/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/hyper-0.14.26>
Jun 05 14:30:40 attic atticd[2788]: at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tokio-1.28.2/src/util/trace.rs:16
Jun 05 14:30:40 attic atticd[2788]: 2023-06-05T12:30:40.187967Z ERROR tower_http::trace::on_failure: response failed classification=Status code: 500 Internal Server Error latency=304195 ms
I had the same error with the db in an sshfs-mounted folder; moving it out of there fixed it for me.
I have the same problem
I'm seeing the same problem, mostly when pushing the Nvidia drivers, which are about 500 MB in size. The server is hosted on a Raspberry Pi 4 with Postgres for the DB, and the actual files are stored on an NFS share. Sure, the Pi has some work to do, but it's not pinned at 100% CPU the way it is when compiling something.
nixos via ❄️ impure (nixosbuildshell-env) on master ❯ attic push -j 1 prod /run/current-system
⚙️ Pushing 3 paths to "prod" on "2li" (505 already cached, 2424 in upstream)...
❌ dnzqdd04mk1i4c3lfq1g2dx20mp5cb84-nvidia-x11-545.29.02-6.1.90: error sending request for url (https://cache.zweili.org/_api/v1/upload-path): connection error: unexpected end of file
✅ p3ac2311nbr988jh054vg9xl6i9adjww-SideQuest-usr-target (2.88 MiB/s, 100.0% deduplicated)
✅ 041mqxjdqcv9iflbc94gk12xmp2g7zq6-nvidia-x11-545.29.02-6.1.90-bin (13.65 MiB/s, 100.0% deduplicated)
Error: error sending request for url (https://cache.zweili.org/_api/v1/upload-path): connection error: unexpected end of file
2024-05-09T18:19:53.515913Z ERROR attic_server::error: Database error: Failed to acquire connection from pool: Connection pool timed out
May 09 20:19:53 management atticd[1081181]: 0: tokio::task::runtime.spawn
May 09 20:19:53 management atticd[1081181]: with kind=task task.name= task.id=947551 loc.file="server/src/api/v1/upload_path.rs" loc.line=408 loc.col=13
May 09 20:19:53 management atticd[1081181]: at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tokio-1.34.0/src/util/trace.rs:17
May 09 20:19:53 management atticd[1081181]: 1: attic_server::api::v1::upload_path::upload_path
May 09 20:19:53 management atticd[1081181]: at server/src/api/v1/upload_path.rs:117
May 09 20:19:53 management atticd[1081181]: 2: tower_http::trace::make_span::request
May 09 20:19:53 management atticd[1081181]: with method=PUT uri=/_api/v1/upload-path version=HTTP/1.1
May 09 20:19:53 management atticd[1081181]: at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tower-http-0.4.4/src/trace/make_span.rs:109
May 09 20:19:53 management atticd[1081181]: 3: tokio::task::runtime.spawn
May 09 20:19:53 management atticd[1081181]: with kind=task task.name= task.id=943769 loc.file="/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/hyper-0.14.27/src/common/exec.rs" loc.line=49 loc.col=21
May 09 20:19:53 management atticd[1081181]: at /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-vendor-cargo-deps/c19b7c6f923b580ac259164a89f2577984ad5ab09ee9d583b888f934adbbe8d0/tokio-1.34.0/src/util/trace.rs:17
May 09 20:19:53 management atticd[1081181]: 2024-05-09T18:19:53.516222Z ERROR tower_http::trace::on_failure: response failed classification=Status code: 500 Internal Server Error latency=45110 ms
The related logs for postgres I'm seeing:
May 15 21:46:45 management postgres[2518741]: [2518741] ERROR: insert or update on table "chunkref" violates foreign key constraint "fk_chunkref_nar"
May 15 21:46:45 management postgres[2518741]: [2518741] DETAIL: Key (nar_id)=(7524) is not present in table "nar".
May 15 21:46:45 management postgres[2518741]: [2518741] STATEMENT: INSERT INTO "chunkref" ("nar_id", "seq", "chunk_id", "chunk_hash", "compression") VALUES ($1, $2, $3, $4, $5) RETURNING "id"
May 15 21:46:45 management postgres[2518782]: [2518782] ERROR: insert or update on table "chunkref" violates foreign key constraint "fk_chunkref_nar"
May 15 21:46:45 management postgres[2518782]: [2518782] DETAIL: Key (nar_id)=(7524) is not present in table "nar".
May 15 21:46:45 management postgres[2518782]: [2518782] STATEMENT: INSERT INTO "chunkref" ("nar_id", "seq", "chunk_id", "chunk_hash", "compression") VALUES ($1, $2, $3, $4, $5) RETURNING "id"
May 15 21:46:45 management postgres[2518746]: [2518746] ERROR: insert or update on table "chunkref" violates foreign key constraint "fk_chunkref_nar"
May 15 21:46:45 management postgres[2518746]: [2518746] DETAIL: Key (nar_id)=(7524) is not present in table "nar".
May 15 21:46:45 management postgres[2518746]: [2518746] STATEMENT: INSERT INTO "chunkref" ("nar_id", "seq", "chunk_id", "chunk_hash", "compression") VALUES ($1, $2, $3, $4, $5) RETURNING "id"
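To make the constraint failure above concrete: chunkref.nar_id has to point at an existing row in nar, so these inserts can only fail like this if the referenced NAR row is missing (for example rolled back or cleaned up) by the time the chunk rows are written. Here is a self-contained illustration of that error class, using an in-memory SQLite database and a heavily simplified made-up schema, not attic's real migrations:

```rust
// Generic illustration only: the same class of foreign-key failure as in the
// Postgres log above, reproduced against an in-memory SQLite database so the
// snippet is self-contained. Table and column names mirror the log, but the
// schema is an assumption for demonstration purposes.
use sqlx::sqlite::SqlitePoolOptions;

#[tokio::main]
async fn main() -> Result<(), sqlx::Error> {
    let pool = SqlitePoolOptions::new()
        .max_connections(1) // one shared in-memory database
        .connect("sqlite::memory:")
        .await?;

    // Make sure foreign-key enforcement is on (the SQLite engine defaults to off).
    sqlx::query("PRAGMA foreign_keys = ON").execute(&pool).await?;
    sqlx::query("CREATE TABLE nar (id INTEGER PRIMARY KEY)")
        .execute(&pool)
        .await?;
    sqlx::query(
        "CREATE TABLE chunkref (
             id INTEGER PRIMARY KEY,
             nar_id INTEGER NOT NULL REFERENCES nar(id)
         )",
    )
    .execute(&pool)
    .await?;

    // Referencing a nar row that does not (or no longer) exist fails exactly
    // like the "violates foreign key constraint" lines in the Postgres log.
    let err = sqlx::query("INSERT INTO chunkref (nar_id) VALUES (7524)")
        .execute(&pool)
        .await
        .unwrap_err();
    eprintln!("insert rejected as expected: {err}");
    Ok(())
}
```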
Hi! Thank you for working on this. I was giving Attic a try, but some uploads (notably this one, which is several hundred MBs) fail to push.
I'm getting this error:
I'm using the default db, which I assume is a local sqlite... so it seems like a weird error to see. It's pretty reproducible.
My config (mostly the defaults from the docs, really):
Client error:
Full logs of the run: