inspirehep / inspire

Official repo of the legacy INSPIRE-HEP overlay
http://projecthepinspire.net

HEP: remove non-numerical values from 999C5y #181

Open jacquerie opened 8 years ago

jacquerie commented 8 years ago

The records returned by this query: https://inspirehep.net/search?ln=en&ln=en&p=999C5y%3A%2F%5B%5E0-9%5D%2F&of=hb&action_search=Search&sf=earliestdate&so=d&rm=&rg=25&sc=0 have non-numerical values in the 999C5y field. Those values should be cleaned, to avoid over-complicating the corresponding DoJSON rules.

This is probably better handled by automated curation (@kaplun).

kaplun commented 6 years ago

If you check the actual values: https://inspirehep.net/search?ln=en&ln=en&p=999C5y%3A%2F%5B%5E0-9%5D%2F&of=t&action_search=Search&sf=earliestdate&so=d&rm=&rg=250&sc=0&ot=999C5y many do actually contain a year, although it is prefixed by some letter (maybe the volume letter ended up there) or accompanied by a month or something similar.

Are you sure it's not something that can be filtered out in dojson? Otherwise @michamos what about a bibcheck rule?
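
To make the filtering option concrete, here is a minimal sketch of how a year could be pulled out of a noisy 999C5y value before it reaches the output. It is not the actual DoJSON rule: the names `YEAR_RE` and `extract_year` are hypothetical, and it assumes that a valid year is a standalone four-digit number starting with 1 or 2, and that a value containing more than one distinct year is too ambiguous to keep.

```python
import re

# Assumption: a plausible year is exactly four digits starting with 1 or 2,
# and is not part of a longer run of digits.
YEAR_RE = re.compile(r"(?<!\d)([12]\d{3})(?!\d)")


def extract_year(value):
    """Return the year found in a raw 999C5y value, or None.

    Hypothetical helper: returns an int when the value contains exactly
    one distinct year-like number, and None otherwise.
    """
    matches = YEAR_RE.findall(value)
    if len(set(matches)) == 1:
        return int(matches[0])
    # No year, or several different candidates: leave it for curation.
    return None
```

Values for which this returns `None` would be the ones left for manual curation or a bibcheck rule.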