Open artofbeinghuman opened 11 months ago
Hi @artofbeinghuman, the PDF full text is currently no longer fed to GPT-4 automatically, because it consumes a lot of tokens ($), significantly increases processing time, and doesn't always work as expected, which upset some earlier users.
I would still very much like to support this important feature; I just haven't figured out what an acceptable user experience would look like. Your suggestions are welcome.
Right now, if you can locate the relevant section within a PDF to summarize, you can feed it to GPT-4 vision: https://github.com/lifan0127/ai-research-assistant#visual-analysis-gpt-4-vision
Could inclusion of PDFs be a user option? There are certainly situations where it is desirable.
Hi @richardkaplan, now that certain GPT models can handle much longer context windows (32K or 128K tokens) than a year ago, we are going to enable a PDF full-text option in the next major release (March or April).
Thanks - much appreciated
Hello, any updates on this? I would like to include the whole PDF text in some cases. It would be awesome if this could be a user setting in the chat.
Hi @martonszep, yes, still working on it.
Dear @lifan0127, first of all I want to thank you for your great work! For my use case, I would really love to have the option to pass the full PDF text to the LLM, because I wanted ARIA to generate a summary of every chapter of the PDF. Do you think that is possible? Are you still working on a way to pass the whole file to the GPT model?
Hi @cfessler, yes the full text feature is still being developed. Thanks for your support!
Hi,
So far I have used ARIA to get insight into the papers in my library. Now I have a paper whose PDF is attached in Zotero but whose abstract field happens to be empty, and ARIA returns this error/answer: "The paper with item ID 802 does not provide an abstract or additional information that would allow for a summary of its key arguments."
I thought ARIA sends the attached PDF to GPT-4 for analysis, so a missing abstract shouldn't be a dealbreaker.
Could you please explain what is happening under the hood? Are the PDFs taken into account, and if so, why this answer?
Thank you! Marvin