ESBigeard / paper_graph

Dev/tools repo for a project about scientific papers mining to construct graphs

Dead Link. Cannot download the corpus. #1

Open KorigamiK opened 1 week ago

KorigamiK commented 1 week ago

https://www.dropbox.com/s/0bc6c2fmhz526mo/fulltext_tei.tar.gz?dl=0%5D

The link doesn't work anymore. I would appreciate an update.

ESBigeard commented 1 week ago

Hi! If you don't mind me asking, what do you want to use this repo for? This is work I did for a specific job and didn't intend to reuse for another purpose, so knowing your use case would help me understand what I should re-upload to make it useful.

KorigamiK commented 1 week ago

> Hi! if you don't mind me asking, what do you want to use this repo for?

I'm working on research that also involves analyzing research datasets. Generating a corpus for building knowledge graphs (KGs) from papers is ideal for that. I am interested in the approach you took to generate the corpora so I can test it out as well.

ESBigeard commented 2 days ago

Hi again,

The corpus was generated using this tool: https://github.com/kermitt2/grobid. It converts the PDF of an article into a TEI XML file. It should be all you need to make this repo work with your own articles. I'll edit the readme to include this link; thank you for bringing to my attention that it was missing. Tell me if you still have questions or run into any other issue.
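
For anyone landing here later, a rough sketch of the PDF-to-TEI step described above might look like the following. It is not taken from this repo. It assumes a GROBID server is already running locally on its default port (8070) and uses GROBID's `processFulltextDocument` REST endpoint. The helper names `build_multipart`, `pdf_to_tei`, and `title_and_abstract` are made up for illustration, and the file name in the usage comment is a placeholder.

```python
import uuid
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path

# TEI documents use this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Assumption: GROBID is running locally on its default port.
GROBID_URL = "http://localhost:8070/api/processFulltextDocument"


def build_multipart(field: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Encode a single file as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def pdf_to_tei(pdf_path: str, url: str = GROBID_URL) -> str:
    """Send one PDF to the GROBID service and return the TEI XML it produces."""
    pdf = Path(pdf_path)
    # GROBID expects the PDF in a form field named "input".
    body, content_type = build_multipart("input", pdf.name, pdf.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def title_and_abstract(tei_xml: str) -> tuple[str, str]:
    """Pull the title and abstract text out of a TEI document.

    Missing elements come back as empty strings.
    """
    root = ET.fromstring(tei_xml)

    def text_of(node) -> str:
        if node is None:
            return ""
        # Join nested text and collapse runs of whitespace.
        return " ".join("".join(node.itertext()).split())

    title = root.find(".//tei:titleStmt/tei:title", TEI_NS)
    abstract = root.find(".//tei:profileDesc/tei:abstract", TEI_NS)
    return text_of(title), text_of(abstract)


# Usage (needs a running GROBID server; "paper.pdf" is a placeholder):
#   tei = pdf_to_tei("paper.pdf")
#   Path("paper.tei.xml").write_text(tei, encoding="utf-8")
#   print(title_and_abstract(tei))
```

The sketch sticks to the standard library so it has no extra dependencies. GROBID also ships batch tooling and client libraries, which are probably a better fit for converting a whole corpus.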