kermitt2 / biblio-glutton

A high performance bibliographic information service: https://biblio-glutton.readthedocs.io

Revisited result format for aggregated sources #72

Open kermitt2 opened 2 years ago

kermitt2 commented 2 years ago

As we move to more heterogeneous sources, Crossref becomes one bibliographical record among others. To keep everything well separated and to avoid destructive merging, the headache of unified representations, and the mixture of automated, rule-based and original mapping/merging, we can define the following result format for an aggregated record:

{
  "doi": "10.1028/ijijij",
  "pmid": 52627,
  "pmcid": "PMC7828282",
  "crossref": {},
  "pubmed": {},
  "hal": {},
  "dblp": {},
  "unpaywall": {}
}

All strong identifiers would be present at the root of the JSON response. Each full record from an original source is then added under its source name, converted into the Crossref format (which is like the unixref format).
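
To make the layout concrete, here is a minimal Python sketch of how such an aggregated record could be assembled. The function name `build_aggregated_record` and the constants are hypothetical and not part of biblio-glutton; the sketch also assumes that a source with no record is simply left out, which the proposal does not specify.

```python
from typing import Any, Dict

# Hypothetical constants: the proposal only fixes the overall shape of the response.
STRONG_IDENTIFIERS = ("doi", "pmid", "pmcid")
SOURCES = ("crossref", "pubmed", "hal", "dblp", "unpaywall")


def build_aggregated_record(
    identifiers: Dict[str, Any],
    source_records: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Put strong identifiers at the root, then one untouched record per source."""
    record: Dict[str, Any] = {}

    # Strong identifiers go first, at the root of the response.
    for key in STRONG_IDENTIFIERS:
        value = identifiers.get(key)
        if value is not None:
            record[key] = value

    # Each source record is kept separate, with no merging across sources.
    # Assumption: sources without a record are omitted rather than set to {}.
    for source in SOURCES:
        if source in source_records:
            record[source] = source_records[source]

    return record
```

Keeping each source under its own key means no field from one source can overwrite a field from another.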

The API would be extended to select a subset of source-specific records (e.g. source=['crossref','hal']), with the default covering all available sources for the bibliographical object.
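
The source selection could behave roughly as in the sketch below. The helper `select_sources` and the list of known sources are hypothetical, and rejecting unknown source names is an assumption; the actual parameter handling in the service may differ.

```python
from typing import Any, Dict, Iterable, Optional

# Hypothetical list of source keys, taken from the example record above.
KNOWN_SOURCES = ("crossref", "pubmed", "hal", "dblp", "unpaywall")


def select_sources(
    record: Dict[str, Any],
    sources: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Keep root-level fields and only the requested source-specific records."""
    # Default: no filter, every available source is returned.
    if sources is None:
        return dict(record)

    wanted = set(sources)
    unknown = wanted - set(KNOWN_SOURCES)
    if unknown:
        # Assumption: unknown source names are treated as a client error.
        raise ValueError(f"unknown source(s): {sorted(unknown)}")

    # Root-level fields such as identifiers are always kept.
    return {
        key: value
        for key, value in record.items()
        if key not in KNOWN_SOURCES or key in wanted
    }
```

For example, `select_sources(record, ["crossref", "hal"])` would keep the identifiers plus the `crossref` and `hal` entries and drop the other sources.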

Finally, in the case of a matching response, where a disambiguation decision is taken, we can add a matching score at the root of the response.
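
As an illustration only, the score could be attached as in the sketch below. The key name `matching_score` and the helper `with_matching_score` are hypothetical; the proposal does not fix the field name or the score range.

```python
from typing import Any, Dict


def with_matching_score(record: Dict[str, Any], score: float) -> Dict[str, Any]:
    """Return a copy of the aggregated record with a root-level matching score."""
    # Hypothetical key name; the source records themselves are left untouched.
    return {**record, "matching_score": float(score)}
```

Because the score sits at the root, it describes the disambiguation decision for the whole aggregated record and not any single source.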