cat-lemonade / PDFDataExtractor

A toolkit for automatically extracting semantic information from PDF files of scientific articles
https://pdfdataextractor.readthedocs.io/en/latest/
MIT License
64 stars, 11 forks

Adding to this repo #5

Closed by Plikt 1 year ago

Plikt commented 1 year ago

Hey!

I'm working on a project with DeSci Labs (desci.com) that focuses on automatically generating metadata, and we were hoping to use your repo as a basis for extracting metadata from the text itself (including combining its functionality with LLMs and some other tooling).

Right now, we've just forked your repository and are keeping any changes we make separate. But we wanted to check in to see if you'd be interested in us contributing any of our changes back to your repo to make it more robust.

Thanks!

cat-lemonade commented 1 year ago

Hi,

Thanks for reaching out, and glad that this package works for you. For now, though, I have to keep this repo unchanged.

Many thanks!