rust-lang / docs.rs

crates.io documentation generator
https://docs.rs
MIT License
956 stars 193 forks source link

builder: don't drop async sqlx connection in sync context #2476

Closed syphar closed 3 months ago

syphar commented 3 months ago

We were seeing a new error in sentry, starting yesterday evening:

panic
this functionality requires a Tokio context

interesting parts of the stack trace:

grafik

I imagine every place in the build_package method where we use ? to return early could lead to this error. But strange is:

  1. it didn't happen until now
  2. every early return should actually lead to another sentry error, which we don't see.

So for now this PR is a guess at what the temporary solution could be.

Long-term, when we have separate runtimes ( aka: separate webserver & builder processes) I imagine that the builder also would have a global runtime through tokio::main.

Nemo157 commented 3 months ago

One alternative would be to let guard = context.runtime().enter(); early in the thread so that the runtime is accessible when the connection is dropped in sync context. It only wants to spawn a task, and should never be doing anything other than that from a Drop, so that should work.

Nemo157 commented 3 months ago

Inspecting the server logs it seems to be https://crates.io/crates/aymr somehow causing the issue when uploading its source, we don't increment the attempt counter or last_attempt when a panic occurs so I have manually set it to 5 for this crate to stop it from rebuilding continuously.

Nemo157 commented 3 months ago

At a guess, looking in the crate there's a 0 byte file, which probably gets stored without compression and hits https://github.com/rust-lang/docs.rs/blob/fe3197616f3fa0497520a2d5eb89d09937a510c4/src/storage/archive_index.rs#L66

EDIT: Wrong deduction, creating the source archive works, the last log line before the panic is docs_rs::db::add_package: Adding package into database

syphar commented 3 months ago

One alternative would be to let guard = context.runtime().enter(); early in the thread so that the runtime is accessible when the connection is dropped in sync context. It only wants to spawn a task, and should never be doing anything other than that from a Drop, so that should work.

You're right, this is the better solution.

syphar commented 3 months ago

Inspecting the server logs it seems to be https://crates.io/crates/aymr somehow causing the issue when uploading its source, we don't increment the attempt counter or last_attempt when a panic occurs so I have manually set it to 5 for this crate to stop it from rebuilding continuously.

~At a guess, looking in the crate there's a 0 byte file, which probably gets stored without compression and hits ~

https://github.com/rust-lang/docs.rs/blob/fe3197616f3fa0497520a2d5eb89d09937a510c4/src/storage/archive_index.rs#L66

EDIT: Wrong deduction, creating the source archive works, the last log line before the panic is docs_rs::db::add_package: Adding package into database

Ah, so there is some error happening in add_package_into_database, and we don't see it because the panic happens before we return or report the error.

Also we could ask ourselves if we should increase attempts on panics, or stop building?

syphar commented 3 months ago

I just had another idea for an generic solution for handling entering the runtime.

2476

syphar commented 3 months ago

closing in favor of #2479