Closed GautamR-Samagra closed 10 months ago
Sample PDF here
We need to extract the text from it and chunk it by heading, so that each heading is paired with its related content.
We have tried two different approaches:
Colab notebook for extracting the structure using approach 2: link
The PyMuPDF approach works well. Moved to PDF Parser for now.
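For anyone landing here later, a minimal sketch of what heading-based chunking with PyMuPDF could look like. This is not the code from the notebook. It assumes headings can be recognised by a font size noticeably larger than the body text, which may not hold for every PDF. The names `iter_spans`, `chunk_by_heading`, and the `ratio` threshold are hypothetical and only for illustration.

```python
from collections import Counter


def iter_spans(pdf_path):
    """Yield (text, font_size) for every non-empty text span, page by page."""
    import fitz  # PyMuPDF; imported lazily so the chunker below works without it

    with fitz.open(pdf_path) as doc:
        for page in doc:
            for block in page.get_text("dict")["blocks"]:
                # Image blocks carry no "lines" key, so default to an empty list.
                for line in block.get("lines", []):
                    for span in line["spans"]:
                        text = span["text"].strip()
                        if text:
                            yield text, round(span["size"], 1)


def chunk_by_heading(spans, ratio=1.15):
    """Group spans into {"heading", "text"} chunks.

    Assumption: a span is a heading when its font size is at least
    `ratio` times the body size, where the body size is the font size
    that carries the most characters in the document.
    """
    spans = list(spans)
    if not spans:
        return []

    chars_per_size = Counter()
    for text, size in spans:
        chars_per_size[size] += len(text)
    body_size = chars_per_size.most_common(1)[0][0]

    chunks = []
    current = {"heading": None, "text": []}
    prev_was_heading = False
    for text, size in spans:
        if size >= body_size * ratio:
            if prev_was_heading:
                # A heading split over several spans: keep it as one heading.
                current["heading"] += " " + text
            else:
                if current["heading"] is not None or current["text"]:
                    chunks.append(current)
                current = {"heading": text, "text": []}
            prev_was_heading = True
        else:
            current["text"].append(text)
            prev_was_heading = False
    if current["heading"] is not None or current["text"]:
        chunks.append(current)

    return [{"heading": c["heading"], "text": " ".join(c["text"])} for c in chunks]
```

Usage would be along the lines of `chunk_by_heading(iter_spans("sample.pdf"))`, where `sample.pdf` is a placeholder path. Text that appears before the first heading ends up in a chunk whose heading is `None`. If the PDF uses bold text of the same size for headings, the size check would need to be replaced or combined with a check on the span's font name or flags.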