inveniosoftware / invenio

Invenio digital library framework
https://invenio.readthedocs.io
MIT License

Also use CFG_OAI_ID_FIELD for deduping #2812

Closed aw-bib closed 7 years ago

aw-bib commented 9 years ago

It seems advisable that find_records_from_extoaiid() also consider CFG_OAI_ID_FIELD when matching records.

Use case:

  1. join2 repositories harvest each other to get copies of records. We use this process e.g. for authority records.
  2. join2 repositories act as institutional bibliographies, which implies collecting all publications from a given institution (even if a publication is also catalogued at another partner).

In the case of authorities, the authority ID in 035 ensures deduping. Upcoming use cases, however, also involve harvesting bibliographic records: when two institutions from the join2 collaboration work together, we want to ensure that our scientists have to submit their publications only once. E.g. a scientist from Jülich should only need to submit to JuSER even if he would have to submit to iMPULSE as well. Harvesting ensures that publications from JuSER are exposed to iMPULSE. It doesn't make too much sense to move the OAI ID to another field in this case, as we're really producing a second instance of the record in another service. At the same time, from a bibliographic point of view, using 035a for OAI IDs will not work in our environment, as we need a field for the ID(tm) (i.e. the one that gets stored with the linked records) in the case of authority records. This is currently done via 035.

An alternative, more complex approach might be to allow a list of MARC tags for CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG, probably consolidating it with CFG_BIBUPLOAD_EXTERNAL_SYSNO_TAG.

join2 might come up with an implementation of the simple case (CFG_OAI_ID_FIELD).
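
To make the matching idea concrete, here is a minimal sketch of looking up an incoming OAI ID under more than one MARC tag. It is not Invenio code: the in-memory RECORDS store, the function name find_records_by_oai_id, and the tag values assigned to the two constants are all hypothetical placeholders chosen for illustration, and a real installation would take the tags from its own configuration.

```python
# Hypothetical in-memory stand-in for the record store; NOT Invenio's API.
# Each record maps a MARC tag string to the list of values stored under it.
RECORDS = {
    1: {"035__a": ["oai:example.org:100"]},
    2: {"909COo": ["oai:example.org:200"]},
    3: {"035__a": ["AUTH-42"]},
}

# Illustrative tag values only (assumptions); real sites configure their own.
EXTERNAL_OAIID_TAG = "035__a"   # stands in for CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG
OAI_ID_FIELD = "909COo"         # stands in for CFG_OAI_ID_FIELD


def find_records_by_oai_id(oai_id, tags, records=RECORDS):
    """Return the sorted IDs of records that hold oai_id under any of tags."""
    matches = set()
    for recid, fields in records.items():
        for tag in tags:
            if oai_id in fields.get(tag, []):
                matches.add(recid)
                break  # one hit per record is enough
    return sorted(matches)


# Looking only at the external OAI ID tag misses record 2 ...
only_external = find_records_by_oai_id("oai:example.org:200", [EXTERNAL_OAIID_TAG])
# ... while also consulting the OAI ID field finds it.
both_tags = find_records_by_oai_id(
    "oai:example.org:200", [EXTERNAL_OAIID_TAG, OAI_ID_FIELD]
)
```

Passing a list of tags covers both variants discussed here: the simple case adds one extra tag to the lookup, and the more complex case would read the whole list from configuration.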

tiborsimko commented 7 years ago

Closed by #3721.