cvpaperchallenge / Crux

Crux is a suite of LLM-empowered summarization and retrieval services for academic activity. Crux is developed by the XCCV group of cvpaper.challenge.
MIT License

[Backend] Implement a pdf reader #7

Closed YoshikiKubotani closed 1 year ago

YoshikiKubotani commented 1 year ago

Why

The Crux project reads PDFs and summarizes them using the OpenAI API. As a first step, we need to implement a PDF reader.

Definition of Done

How

gatheluck commented 1 year ago

I realized that LlamaIndex has native support for loading PDFs and converting them into document objects. Maybe we should use this feature, since LlamaIndex also integrates natively with both GPT and LangChain.
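For illustration, a minimal sketch of what this could look like, not necessarily how Crux ends up doing it. It assumes a recent `llama-index` release (where `SimpleDirectoryReader` lives in `llama_index.core`) with a PDF parser such as `pypdf` installed; the helper names `check_pdf_path` and `load_pdf_documents` are hypothetical.

```python
from pathlib import Path


def check_pdf_path(path: str) -> Path:
    """Return the path as a Path if it has a .pdf extension, else raise ValueError."""
    pdf_path = Path(path)
    if pdf_path.suffix.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file, got: {pdf_path.name}")
    return pdf_path


def load_pdf_documents(path: str):
    """Load one PDF into a list of LlamaIndex Document objects (hypothetical helper)."""
    # Imported lazily so this module can be imported without llama-index installed.
    # Assumption: llama-index >= 0.10; older releases expose the same class
    # via `from llama_index import SimpleDirectoryReader`.
    from llama_index.core import SimpleDirectoryReader

    pdf_path = check_pdf_path(path)
    # SimpleDirectoryReader selects a file parser based on the file extension.
    reader = SimpleDirectoryReader(input_files=[str(pdf_path)])
    return reader.load_data()
```

Calling `load_pdf_documents("paper.pdf")` (with a placeholder file name) should return a list of document objects whose text can then be passed on to the summarization step.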

YoshikiKubotani commented 1 year ago

This GitHub repo is also helpful for understanding how to use LangChain to read PDFs.