openaire / iis

Information Inference Service of the OpenAIRE system
Apache License 2.0

Run experiments with Grobid as a potential Cermine replacement #1462

Open marekhorst opened 1 month ago

marekhorst commented 1 month ago

This task is mostly about running the first experiments with Grobid.

We could start by implementing a Grobid-based equivalent of pl.edu.icm.cermine.ContentExtractor for metadata and plaintext extraction. It should be runnable from the command line so that its output can be compared directly with the Cermine-based version.
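As a starting point, here is a minimal sketch of what such a command-line entry point might look like. It assumes a Grobid service is already running locally on its default port (8070) and talks to it over the REST API (`/api/processHeaderDocument` for metadata, `/api/processFulltextDocument` for the full document) instead of embedding Grobid as a library. The class name `GrobidContentExtractorCli`, the `header`/`fulltext` mode names and the argument layout are hypothetical and not part of IIS, Cermine or Grobid. It also assumes Java 11+ for `java.net.http`. Grobid returns TEI XML, so plaintext would still have to be derived from the TEI body in a later step.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

/** Hypothetical CLI wrapper around a running Grobid service (illustration only). */
class GrobidContentExtractorCli {

    // Assumption: Grobid service started locally with its default port.
    static final String DEFAULT_BASE_URL = "http://localhost:8070";

    /** Maps a (hypothetical) mode name to the Grobid REST endpoint. */
    static URI endpointFor(String baseUrl, String mode) {
        switch (mode) {
            case "header":
                return URI.create(baseUrl + "/api/processHeaderDocument");
            case "fulltext":
                return URI.create(baseUrl + "/api/processFulltextDocument");
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    /** Builds a multipart/form-data body with the PDF in the "input" field. */
    static byte[] buildMultipartBody(String boundary, String fileName, byte[] pdf)
            throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"input\"; filename=\""
                + fileName + "\"\r\n"
                + "Content-Type: application/pdf\r\n\r\n";
        out.write(head.getBytes(StandardCharsets.UTF_8));
        out.write(pdf);
        out.write(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));
        return out.toByteArray();
    }

    /** Sends the PDF to Grobid and returns the TEI XML response as a string. */
    static String extract(String baseUrl, String mode, Path pdfPath)
            throws IOException, InterruptedException {
        String boundary = "grobid-" + UUID.randomUUID();
        byte[] body = buildMultipartBody(
                boundary, pdfPath.getFileName().toString(), Files.readAllBytes(pdfPath));
        HttpRequest request = HttpRequest.newBuilder(endpointFor(baseUrl, mode))
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .header("Accept", "application/xml")
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Grobid returned HTTP " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("Usage: GrobidContentExtractorCli <header|fulltext> <pdf> [baseUrl]");
            System.exit(1);
        }
        String baseUrl = args.length > 2 ? args[2] : DEFAULT_BASE_URL;
        System.out.println(extract(baseUrl, args[0], Paths.get(args[1])));
    }
}
```

With a Grobid service running, this could be invoked as `java GrobidContentExtractorCli.java header paper.pdf` and the resulting TEI compared against what the Cermine-based extractor produces for the same PDF. Going through the REST service keeps the sketch independent of Grobid's Java API and model setup. An embedded variant may be preferable for the actual IIS integration.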