soumik12345 / medrag-multi-modal


feat(loader): Implement `MarkerLoader` #5

Closed soumik12345 closed 4 days ago

soumik12345 commented 1 week ago

Implement a document loader named `MarkerLoader` that uses marker to transcribe the pages of a PDF document into Markdown-formatted text and figures.

Note: We don't really need to index the figures from each page separately, since we already do that with ColPali. However, you could make figure transcription optional in `MarkerLoader` using a flag.
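A rough sketch of how the optional-figures flag could look. This is only an illustration, not the final design: the `include_figures` flag name, the `PageRecord` container, the `load` signature, and the injected `transcribe_page` callable are all assumptions. The actual marker call is deliberately left out and replaced by a stand-in function, so nothing here reflects marker's real API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical backend signature (an assumption, not marker's API):
# takes a PDF path and a page index, returns (markdown_text, figures),
# where figures maps a figure name to raw image bytes.
PageTranscriber = Callable[[str, int], Tuple[str, Dict[str, bytes]]]


@dataclass
class PageRecord:
    """One transcribed page: markdown text plus optional figures."""
    page_idx: int
    text: str
    figures: Dict[str, bytes] = field(default_factory=dict)


class MarkerLoader:
    """Sketch of a loader that can skip figures via a flag."""

    def __init__(self, transcribe_page: PageTranscriber, include_figures: bool = False):
        # The transcription backend is injected so that the real marker
        # integration can be plugged in later.
        self.transcribe_page = transcribe_page
        self.include_figures = include_figures

    def load(self, pdf_path: str, page_indices: List[int]) -> List[PageRecord]:
        records = []
        for page_idx in page_indices:
            text, figures = self.transcribe_page(pdf_path, page_idx)
            # Drop figures unless they were explicitly requested.
            kept = dict(figures) if self.include_figures else {}
            records.append(PageRecord(page_idx=page_idx, text=text, figures=kept))
        return records


def fake_transcribe_page(pdf_path: str, page_idx: int) -> Tuple[str, Dict[str, bytes]]:
    """Stand-in backend for demonstration only; it does not read any PDF."""
    text = f"# Page {page_idx}\n\nTranscribed from {pdf_path}."
    figures = {f"page{page_idx}_fig0.png": b"\x89PNG"}
    return text, figures
```

With the default `include_figures=False`, `MarkerLoader(fake_transcribe_page).load("doc.pdf", [0, 1])` returns two records whose `figures` dicts are empty. Passing `include_figures=True` keeps whatever figures the backend returns.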