quickwit-oss / tantivy

Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust
MIT License
11.41k stars 627 forks source link

Panicking in spawned Rayon tasks will abort the process by default. #2409

Closed adamreichold closed 1 month ago

adamreichold commented 1 month ago

C.f. https://docs.rs/rayon/latest/rayon/struct.ThreadPoolBuilder.html#method.panic_handler, i.e. spawn is fire-and-forget and hence cannot propagate the panics anywhere sensible. This leads to test flakiness like https://github.com/quickwit-oss/tantivy/actions/runs/9203311046/job/25314621091.

fulmicoton commented 1 month ago

@adamreichold thank you for the information! Maybe we should catch panics actually.

adamreichold commented 1 month ago

@adamreichold thank you for the information! Maybe we should catch panics actually.

We could just install a panic handler which logs and moves on, but that could leave the system in an inconsistent state. Debugging any issues resulting from that downstream might not be fun.

Alternatively, we could wrap all closure in catch_unwind and forward the panic payloads as errors, either as thread::Result<crate::Result<Vec<R>> or by adding a panic payload variant to tantivy::Result.

fulmicoton commented 1 month ago

I was more thinking of catch_unwind

fulmicoton commented 1 month ago

I am merging this. Let me know if you want to work on the catch_unwind stuff.

PSeitz commented 1 month ago

This wasn't merged