Future-House / paper-qa

High accuracy RAG for answering questions from scientific documents with citations
Apache License 2.0

manually adding metadata #569

Open Litreviewing opened 1 month ago

Litreviewing commented 1 month ago

How can I manually add the metadata that's missing for documents, in Python or via the CLI? I don't have a Semantic Scholar or Crossref provider API key.

dosubot[bot] commented 1 month ago

Hey @Litreviewing! I'm here to help you with any bugs, questions, or contributions you have. Let's tackle this together!

You can manually add metadata to documents using the Docs object in Python. Here's an example:

from paperqa import Docs

# Paths to your documents
doc_paths = ("myfile.pdf", "myotherfile.pdf")

docs = Docs()

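# The values below are placeholders. As written, every file receives the
# same metadata, so replace them with each document's real details.
for doc in doc_paths:
    docs.add(
        path=doc,
        citation="Author, Title, Year",
        title="Document Title",
        doi="10.1234/example.doi",
        authors=["Author One", "Author Two"],
    )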

# Now you can query your documents
answer = docs.query("What manufacturing challenges are unique to bispecific antibodies?")
print(answer.formatted_answer)

This allows you to add documents with metadata like citation, title, DOI, and authors without needing an API key for external providers [1].
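If each file needs its own metadata rather than one shared set of values, one option is to keep the details in a dictionary keyed by file name and build the keyword arguments for each `docs.add` call from it. The sketch below is only an illustration: `MANUAL_METADATA`, `build_add_kwargs`, and `add_all` are hypothetical helper names, all metadata values are placeholders, and the keyword names (`path`, `citation`, `title`, `doi`, `authors`) are assumed to match the `docs.add` call shown above.

```python
from pathlib import Path

# Hypothetical per-file metadata; every value here is a placeholder.
MANUAL_METADATA = {
    "myfile.pdf": {
        "citation": "Author One, Title A, 2020",
        "title": "Title A",
        "doi": "10.1234/example.a",
        "authors": ["Author One"],
    },
    "myotherfile.pdf": {
        "citation": "Author Two, Title B, 2021",
        "title": "Title B",
        "doi": "10.1234/example.b",
        "authors": ["Author Two", "Author Three"],
    },
}

# Keyword names taken from the docs.add(...) example above (assumed).
ALLOWED_KEYS = ("citation", "title", "doi", "authors")


def build_add_kwargs(path, metadata=MANUAL_METADATA):
    """Return the keyword arguments for one docs.add(...) call."""
    name = Path(path).name
    if name not in metadata:
        # Fail loudly instead of adding a document with no metadata.
        raise KeyError(f"No manual metadata recorded for {name}")
    entry = metadata[name]
    kwargs = {"path": str(path)}
    kwargs.update({key: entry[key] for key in ALLOWED_KEYS if key in entry})
    return kwargs


def add_all(docs, paths, metadata=MANUAL_METADATA):
    """Add each path to `docs` with its own metadata."""
    for path in paths:
        docs.add(**build_add_kwargs(path, metadata))
```

With a `Docs` instance created as in the example above, `add_all(docs, doc_paths)` would replace the loop. Files without an entry in the dictionary raise a `KeyError`, so a missing record is noticed before the document is indexed.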
