allenai / papermage

library supporting NLP and CV research on scientific papers
https://papermage.org
Apache License 2.0

How to extract Authorname, Institution, Country from "authors" box #59

Open jasobro opened 10 months ago

jasobro commented 10 months ago

Hi,

At the moment, author information is available via doc.authors. Is it also possible to break this down further and retrieve each author's name, institution, and country? doc.authors returns all author information in a single string, and I don't know how to extract the individual entities.

juhoinkinen commented 10 months ago

@jasobro you could check whether METEOR or AutoMETA works for your use case. Also, we have experimented with using a fine-tuned GPT-3.5 model for extracting bibliographic metadata.

That said, if Papermage could provide more detailed metadata, that would be quite helpful for many!
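
Until then, a rule-based split of the author string may be enough for simple layouts. The following is only a rough sketch, not part of Papermage: it assumes you already have one author's text as a plain string (for example, taken from doc.authors), and the names parse_author_block, AuthorInfo, INSTITUTION_HINTS and COUNTRIES are hypothetical helpers made up for illustration. The keyword and country lists are deliberately tiny and would need extending for real papers.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical, deliberately tiny lookup tables; extend them for real data.
INSTITUTION_HINTS = ("university", "institute", "college", "laboratory", "school")
COUNTRIES = {"finland", "germany", "usa", "united states", "china", "france"}


@dataclass
class AuthorInfo:
    name: Optional[str] = None
    institution: Optional[str] = None
    country: Optional[str] = None


def parse_author_block(text: str) -> AuthorInfo:
    """Heuristically split one author's text into name, institution, country."""
    info = AuthorInfo()
    # Treat newlines, commas and semicolons as field separators.
    parts = [p.strip() for p in re.split(r"[\n,;]+", text) if p.strip()]
    for part in parts:
        lowered = part.lower()
        if "@" in part:
            continue  # skip e-mail addresses
        if lowered in COUNTRIES:
            info.country = part
        elif any(hint in lowered for hint in INSTITUTION_HINTS):
            # Keep only the first institution-like fragment.
            info.institution = info.institution or part
        elif info.name is None:
            # First fragment that is neither country nor institution:
            # assume it is the author's name.
            info.name = part
    return info


# Made-up sample input, not taken from a real paper.
sample = "Jane Doe\nUniversity of Helsinki, Finland\njane.doe@example.org"
result = parse_author_block(sample)
print(result)
```

Heuristics like this break quickly on multi-author blocks, footnote markers, or affiliations without an obvious keyword, which is where the trained models mentioned above are likely to do better.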

xsank commented 7 months ago

@juhoinkinen your model link returns a 404 now...