inspirehep / inspire-next

The INSPIRE repo.
https://inspirehep.net
GNU General Public License v3.0

HoldingPen: M&M testing: now split in separate issues #3468

Open ksachs opened 6 years ago

ksachs commented 6 years ago

Matching and merging is at the beginning of the journal workflow. If we do it on labs we cut off the journal workflow at DESY and have to move everything.

Priority legend: bold = high priority, italic = medium priority, normal = nice to have.

WARNING: when testing on labs, DO NOT ACCEPT/CORE a journal record! It will be ingested into INSPIRE and might cause DOI conflicts with the still-active DESY workflow.

Things that seem to be working - not fully tested

Workflow / Processes

GUI / Handling

Bug / Feature

Matches not found (examples)

Don't match

michamos commented 6 years ago

Adding subjects: currently possible only via the editor. Need an input slot in the detailed view, with input via single letter; final solution: on the brief listing too. Records without a subject should stay halted. A subject guesser would be nice; at DESY, subject guessing via authors is in place.

That is already possible on the detailed view, with single letter input.

ksachs commented 6 years ago

How? Only for user submissions. [screenshot from 2018-06-14 16 10 26]

michamos commented 6 years ago

Dealing with online-first articles (DOI but no full pubnote): We don't want to curate / manually merge stuff twice. E.g. Elsevier sends an update for every step. 2 possibilities:
-- auto-reject (without blocking the following full article). This is what we are currently doing. It is possible to filter these records at DESY before sending XML to labs.
-- normal selection + ingest or auto-merge. I.e. do everything that can be done automatically and forget about information that would cause conflicts. Curation should be triggered only for records with a full pubnote. The following versions would be matched automatically via DOI, so we don't have to do that again.

If you can filter them out easily, that would be easier. As this kind of thing is probably publisher/journal dependent, it's something that we probably want to handle in hepcrawl (when we have non-DESY crawlers), not in the workflow.
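
To make the filtering option concrete, here is a minimal sketch of what "DOI but no full pubnote" could look like as a pre-filter. It is not the DESY or hepcrawl implementation. The function names are hypothetical, the records are plain dicts with made-up placeholder values, and the definition of a "full pubnote" (journal title, volume, and a page or article ID) is an assumption that would need to be agreed on per publisher.

```python
def has_full_pubnote(record):
    """Assumed rule: a pubnote is full if it has a journal title, a volume,
    and either a start page or an article ID."""
    for pub in record.get("publication_info", []):
        located = pub.get("page_start") or pub.get("artid")
        if pub.get("journal_title") and pub.get("journal_volume") and located:
            return True
    return False


def is_online_first(record):
    """Online first: the record carries a DOI but no full pubnote yet."""
    return bool(record.get("dois")) and not has_full_pubnote(record)


def split_online_first(records):
    """Return (keep, rejected); rejected records are the online-first ones."""
    keep, rejected = [], []
    for record in records:
        (rejected if is_online_first(record) else keep).append(record)
    return keep, rejected


# Placeholder records for illustration only.
online_first = {
    "dois": [{"value": "10.1000/example.1"}],
    "publication_info": [{"journal_title": "Example J."}],
}
full = {
    "dois": [{"value": "10.1000/example.1"}],
    "publication_info": [
        {"journal_title": "Example J.", "journal_volume": "12", "page_start": "1"}
    ],
}
keep, rejected = split_online_first([online_first, full])
```

Because the later full article arrives with the same DOI, rejecting the online-first version here only works if the rejection does not block that later record, as noted in the first option above.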

michamos commented 6 years ago

I think it's a bug when the list is empty: [screenshot-2018-6-14 inspire labs]

michamos commented 6 years ago

599 not in new data-model

Is that a typo? Neither @annetteholtkamp nor I know about that field, and we couldn't find any record that has it (actually we found one, but it was a typo for 595). 595 maps to _private_notes, 595_D maps to _desy_bookkeeping.
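
As a rough illustration of the mapping just described, the lookup could be sketched as below. This is not the actual conversion code: the table and function names are hypothetical, and only the two mappings named in this comment are included.

```python
# Hypothetical lookup table; only the two mappings mentioned above are listed.
FIELD_MAP = {
    "595": "_private_notes",
    "595_D": "_desy_bookkeeping",
}


def target_key(marc_field):
    """Return the data-model key for a MARC field, or None if it is unmapped."""
    return FIELD_MAP.get(marc_field)


def unmapped_fields(marc_fields):
    """List the distinct fields of a record that have no entry in FIELD_MAP."""
    return sorted(field for field in set(marc_fields) if field not in FIELD_MAP)


# A record carrying 599 would surface it as unmapped.
missing = unmapped_fields(["595", "595_D", "599", "599"])
```

A check like `unmapped_fields` would flag fields such as 599 before they are silently dropped.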

ksachs commented 6 years ago

599 is not part of the INSPIRE data-model, but it contains comments from the publisher. E.g. from IOP, info about the conference (it's the only way to identify the conference) or stuff like "This paper includes data gathered with the 6.5 m Magellan Telescopes located at Las Campanas Observatory, Chile." It was a comment from Florian. I don't know if he discussed it with someone.

ksachs commented 6 years ago

This is now split into separate issues.