Open thomasopsomer opened 7 years ago
Hi Thomas!
They should already be extracted. In the latest version of GROBID, all link annotations, both external web links and internal GOTO links within the document, are extracted as PDFAnnotation objects. However, they are not output in the TEI yet. It will come!
Ah great! I looked at the TEI but not the API before asking!
By "the latest version of GROBID", do you mean the stable 0.4.1 or the master branch of the repo?
To use it directly with the library: given a Document object, can I retrieve the external links with getPDFAnnotations?
The master branch, 0.4.2-SNAPSHOT.
+1 it's working :)
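For anyone landing here later, below is a rough sketch of the filtering step once you have the annotations in hand. It does not call GROBID: `LinkAnnotation`, `LinkType`, `externalLinks` and `linksOnHost` are hypothetical stand-ins made up for illustration, and the sample URLs are invented. With the real library you would presumably build the input from whatever getPDFAnnotations returns on the Document, so check the actual PDFAnnotation class for its field and accessor names.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class ExternalLinks {

    // Hypothetical stand-in types; these are NOT GROBID's own classes.
    enum LinkType { URI, GOTO }

    static final class LinkAnnotation {
        final LinkType type;
        final String destination;

        LinkAnnotation(LinkType type, String destination) {
            this.type = type;
            this.destination = destination;
        }
    }

    // Keep only external web links; internal GOTO links are dropped.
    static List<String> externalLinks(List<LinkAnnotation> annotations) {
        List<String> out = new ArrayList<>();
        for (LinkAnnotation a : annotations) {
            if (a.type == LinkType.URI && a.destination != null) {
                out.add(a.destination);
            }
        }
        return out;
    }

    // Narrow a list of links to one host (or its subdomains), e.g. "github.com".
    static List<String> linksOnHost(List<String> links, String host) {
        String wanted = host.toLowerCase(Locale.ROOT);
        List<String> out = new ArrayList<>();
        for (String link : links) {
            try {
                String h = URI.create(link).getHost();
                if (h == null) {
                    continue; // relative or opaque URI, no host to compare
                }
                h = h.toLowerCase(Locale.ROOT);
                if (h.equals(wanted) || h.endsWith("." + wanted)) {
                    out.add(link);
                }
            } catch (IllegalArgumentException e) {
                // Malformed link text: skip it rather than fail the whole document.
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Invented sample data standing in for annotations read from a PDF.
        List<LinkAnnotation> annotations = Arrays.asList(
                new LinkAnnotation(LinkType.URI, "https://github.com/kermitt2/grobid"),
                new LinkAnnotation(LinkType.GOTO, "section-2"),
                new LinkAnnotation(LinkType.URI, "https://example.org/dataset.csv"));

        List<String> external = externalLinks(annotations);
        System.out.println("External links: " + external);
        System.out.println("GitHub links: " + linksOnHost(external, "github.com"));
    }
}
```

Comparing the parsed host instead of doing a substring match on the raw URL avoids false hits such as a link that merely contains "github.com" in its path.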
Hey,
It would be nice to extract all external links in the PDF (in the text or in footnotes), for instance links to GitHub repositories or to online datasets... Just an idea :)