cern-sis / issues-inspire


GROBID author extraction is broken #462

Closed michamos closed 2 months ago

michamos commented 2 months ago

Due to https://github.com/kermitt2/grobid/issues/1093, author extraction through GROBID is broken. This affects both author extraction through the editor on hep and author extraction in the workflows on next.

As a workaround, we should explicitly request the XML format by passing an Accept: application/xml header in https://github.com/inspirehep/inspire-next/blob/fab1c19d33d7a5c6ba543aedcc9f9ddef814787a/inspirehep/modules/workflows/tasks/actions.py#L1094 and https://github.com/inspirehep/inspirehep/blob/da50059baa6b7470681e8587c7934f9cc9d324f9/backend/inspirehep/matcher/api.py#L236.
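
For illustration, here is a minimal sketch of the workaround using only the Python standard library. It is not the code at the two linked locations: the base URL, the function name build_header_request, the file name paper.pdf, and the choice of the /api/processHeaderDocument endpoint are assumptions made for the example. The only part taken from this issue is the explicit Accept: application/xml header.

```python
import urllib.request
import uuid

# Hypothetical GROBID base URL; adjust to the actual deployment.
GROBID_URL = "http://localhost:8070"


def build_header_request(pdf_bytes, base_url=GROBID_URL):
    """Build a POST request to GROBID that explicitly asks for XML output."""
    boundary = uuid.uuid4().hex
    # Minimal multipart/form-data body with the PDF in the "input" field.
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="input"; filename="paper.pdf"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode() + pdf_bytes + f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        f"{base_url}/api/processHeaderDocument",  # assumed endpoint
        data=body,
        headers={
            # The workaround: do not rely on the server's default format.
            "Accept": "application/xml",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request. If the existing code uses a different HTTP client, the equivalent change is to add the same header to the call that is already there.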

michamos commented 2 months ago

@drjova just to make sure: this is an issue on next too.