Future-House / paper-qa

High accuracy RAG for answering questions from scientific documents with citations
Apache License 2.0

Feature request: opt-in for `MetadataProvider` failovers #495

Open jamesbraza opened 1 month ago

jamesbraza commented 1 month ago

Tonight, Semantic Scholar was having some downtime, which meant I couldn't build an index.

It would be nice to have some opt-in flag to allow discarding failures from MetadataProviders if there are 2+ providers that play the same role (e.g. Crossref + Semantic Scholar).
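To make the request concrete, here is a minimal sketch of what such an opt-in could look like. Everything in it is hypothetical: the `Provider` signature, the `query_with_failover` helper, and the `ignore_provider_failures` flag are illustrative names, not paper-qa's actual API.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

# Hypothetical provider signature for illustration; not paper-qa's actual API.
Provider = Callable[[str], Awaitable[Optional[dict[str, Any]]]]


async def query_with_failover(
    title: str,
    providers: list[Provider],
    ignore_provider_failures: bool = False,  # hypothetical opt-in flag
) -> dict[str, Any]:
    """Merge metadata from all providers, optionally tolerating failures."""
    # Collect exceptions as values instead of letting the first one propagate.
    results = await asyncio.gather(
        *(provider(title) for provider in providers),
        return_exceptions=True,
    )
    errors = [r for r in results if isinstance(r, BaseException)]
    if errors and not ignore_provider_failures:
        # Without the opt-in, any provider failure is raised to the caller.
        raise errors[0]
    merged: dict[str, Any] = {}
    for result in results:
        if isinstance(result, dict):
            # Earlier providers win on conflicting fields.
            merged = {**result, **merged}
    if not merged and errors:
        # Nothing usable came back from any provider, so surface the failure.
        raise errors[0]
    return merged
```

With the flag set, one provider being down only matters if no other provider returns anything for the paper.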

mskarlin commented 1 month ago

This should be the current behavior, unless you've requested a field that comes specifically from S2 (which shouldn't be the case for the defaults). Were you seeing an uncaught exception bubble up from S2, and was that what broke the index build? Otherwise, there would only be an issue if a paper was missing from both S2 and Crossref.

jamesbraza commented 1 month ago

Yeah, I didn't record the stack trace, but somewhere within `SemanticScholarProvider._query` it was getting 500s or 504s, eventually blew through its retries, and then crashed `get_directory_index`.
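For illustration, the failure mode described here (retries exhausted on 5xx responses, then a crash) could be softened along these lines. This is a rough sketch under stated assumptions: `TransientServerError` and `retry_or_none` are hypothetical names, and this is not how `SemanticScholarProvider._query` is actually implemented.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


class TransientServerError(Exception):
    """Hypothetical stand-in for an HTTP 500/504 response."""


async def retry_or_none(
    call: Callable[[], Awaitable[T]],
    max_retries: int = 3,
    base_delay: float = 0.0,
) -> Optional[T]:
    """Retry a flaky call; return None instead of raising once retries run out."""
    for attempt in range(max_retries + 1):
        try:
            return await call()
        except TransientServerError:
            if attempt < max_retries:
                # Exponential backoff between attempts.
                await asyncio.sleep(base_delay * 2**attempt)
    # Retries exhausted: report "no result" so the caller can use other providers.
    return None
```

Returning `None` treats an unavailable provider the same as a paper the provider doesn't have, which is why it would make sense only as an opt-in.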