kermitt2 / grobid

A machine learning software for extracting information from scholarly documents
https://grobid.readthedocs.io
Apache License 2.0

no added DOIs in TEI from consolidation (docker or local) and (crossref or glutton) #1139

Open bozo32 opened 3 months ago

bozo32 commented 3 months ago

This might be a really basic one... I've installed GROBID on a Linux box and configured it to consolidate using biblio-glutton (on the same computer). I installed biblio-glutton because when I ran consolidation with Crossref I didn't see DOIs added to the references section of the TEI file. Same with biblio-glutton.

Am I missing something? I thought that consolidation would enrich the TEI...and it does not seem to do so

What I have for logs is

WARN [2024-07-07 12:41:47,702] com.scienceminer.glutton.web.resource.LookupController: DOI did not matched or did not pass post validation

from glutton... there is only 1 DOI in the article as ingested, and I see a bucketload of these:

INFO [2024-07-07 12:41:48,430] org.grobid.core.utilities.Consolidation: Consolidation service returns error (500) : Server Error

but biblio-glutton seems to be properly installed

curl localhost:8075/service/data

{"ISTEX size":"{istex_doi2ids=0, istex_istex2ids=0, istex_pii2ids=0}","Crossref Metadata stored size":"{crossref_Jsondoc=0}","Total metadata indexed size":"{crossref_Jsondoc=0}","HAL Metadata stored size":"{hal_Jsondoc=0}","PMID size":"{pmid_doi2ids=0, pmid_pmc2ids=0, pmid_pmid2ids=0}","DOI OA size":"{unpayWall_doiOAUrl=

and the config.yaml for grobid does say that it is on localhost:8075

I'm not running anything in Docker, as the Docker install (full one) didn't consolidate using Crossref either... well, I didn't see the DOIs in the references section of the TEI.

lfoppiano commented 3 months ago

Hi @bozo32, if you look at the counter from /service/data, you will notice that every counter is 0.

Biblio glutton needs to be "loaded" after installing it. Please check here: https://biblio-glutton.readthedocs.io/en/latest/Build-Databases/
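
For anyone who wants to script this check, here is a minimal Python sketch that reads the /service/data response and reports whether any counter is non-zero. It assumes the response has the shape pasted above (a JSON object whose values are strings such as "{crossref_Jsondoc=0}") and that the service is reachable at http://localhost:8075. The function names (parse_counters, is_loaded, fetch_payload) are made up for this illustration and are not part of biblio-glutton.

```python
import json
import re
import urllib.request

# Matches "name=123" pairs inside values such as "{pmid_doi2ids=0, pmid_pmc2ids=0}".
COUNTER_RE = re.compile(r"(\w+)=(\d+)")


def parse_counters(payload):
    """Flatten the /service/data payload into a {counter_name: int} dict."""
    counters = {}
    for value in payload.values():
        for name, count in COUNTER_RE.findall(str(value)):
            counters[name] = int(count)
    return counters


def is_loaded(payload):
    """Return True if at least one counter is greater than zero."""
    return any(count > 0 for count in parse_counters(payload).values())


def fetch_payload(base_url="http://localhost:8075"):
    """Fetch /service/data; base_url is an assumption taken from the thread."""
    with urllib.request.urlopen(f"{base_url}/service/data", timeout=10) as resp:
        return json.load(resp)
```

Calling is_loaded(fetch_payload()) against the instance described in this issue would return False, since every counter is 0, which is consistent with the databases not having been loaded yet.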