inspirehep / inspire-next

The INSPIRE repo.
https://inspirehep.net
GNU General Public License v3.0

Complete support for Spires and Invenio keywords #683

Closed jmartinm closed 8 years ago

jmartinm commented 8 years ago
In [3]: set(SPIRES_KEYWORDS.values())
Out[3]:
{'037__c',
 '046__a',
 '246__a',
 '595',
 '650__a',
 '695__a',
 '773__y',
 '970__a',
 'address',
 'affiliation',
 'anyfield',
 'arXiv',
 'author',
 'authorcount',
 'caption',
 'cataloguer',
 'cited',
 'citedby',
 'collaboration',
 'collection',
 'confnumber',
 'country',
 'datecreated',
 'datemodified',
 'doc_type',
 'doi',
 'earliest_date',
 'exactauthor',
 'experiment',
 'firstauthor',
 'fulltext',
 'journal',
 'journalpage',
 'keyword',
 'note',
 'postalcode',
 'rank',
 'rawref',
 'recid',
 'reference',
 'refersto',
 'region',
 'reportnumber',
 'subject',
 'texkey',
 'title',
 'year'}

Each of them needs to be configured either at indexing time or at query time (where several fields and boosts can be defined).
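As a rough illustration of the query-time option, the sketch below maps a keyword to several fields with boosts and builds an Elasticsearch `multi_match` query body from it. The field names, boost values and the `build_query` helper are hypothetical and are not taken from the INSPIRE configuration; only the `field^boost` syntax is standard Elasticsearch.

```python
# Hypothetical query-time configuration: field names and boosts are
# illustrative assumptions, not the real INSPIRE mapping.
QUERY_TIME_FIELDS = {
    "title": [("titles.title", 2.0), ("title_translations.title", 1.0)],
    "author": [("authors.full_name", 1.0), ("authors.alternative_names", 0.5)],
}


def build_query(keyword, value):
    """Build a multi_match query body for a configured keyword."""
    if keyword not in QUERY_TIME_FIELDS:
        # Every keyword has to be configured explicitly before it can be used.
        raise ValueError("keyword not configured: {}".format(keyword))
    # Elasticsearch expresses a per-field boost as "field^boost".
    fields = [
        "{}^{}".format(name, boost)
        for name, boost in QUERY_TIME_FIELDS[keyword]
    ]
    return {"multi_match": {"query": value, "fields": fields}}


query = build_query("title", "Solutions to Problems")
print(query)
```

A keyword handled at indexing time would instead need no entry here, because the value would already be copied into a single dedicated field when the record is indexed.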

The following document https://docs.google.com/spreadsheets/d/1Mm5FWjQV4V5pp3pQkRZ3LHxNJTAGSD_122huIaJFN7s/edit?usp=sharing contains, for the current INSPIRE, the list of allowed keywords and the corresponding MARC tags. The equivalent for each MARC tag can be found in the dojson configuration.

ioannistsanaktsidis commented 8 years ago

After checking all the fields and making some searches, I get the following results:

- 037__c:'astro-ph.CO' Labs: 10 results, Legacy: 10 results
- 046__a: Missing field
- 246__a:'Integrated Sachs Wolfe Effect and Rees Sciama Effect' Labs: 1 result, Legacy: 1 result
- 595: not working
- 650__a:'Astrophysics' Labs: 53 results, Legacy: 0 results
- 695__a:'anisotropy' Labs: 7 results, Legacy: 7 results
- 773__y:1998 Labs: 22 results, Legacy: 22 results
- 970__a: Missing
- address:'CMS Collaboration' Labs: 27 results, Legacy: 27 results
- affiliation:'KMI, Nagoya' Labs: 3 results, Legacy: 3 results
- affiliation:'CMS Collaboration' Labs: 27 results, Legacy: 27 results
- arXiv:'arXiv:1404.5102' Labs: 1 result, Legacy: 0 results
- author
- authorcount
- caption:'FERMILABPUB' Labs: 30 results, Legacy: 30 results
- cataloguer: Missing
- cited: Missing
- citedby: Missing
- collaboration:'ATLAS' Labs: 25 results, Legacy: 25 results
- collection:'HEP' Labs: 426 results, Legacy: 426 results
- confnumber:'C15-09-07.5' Labs: 1 result, Legacy: 1 result
- country
- datecreated:'2006-03-22' Labs: 1 result, Legacy: 0 results
- datemodified:'2011-05-04' Labs: 1 result, Legacy: 0 results
- doc_type:'published' Labs: 231 results, Legacy: 0 results
- doi:'10.1093/ptep/ptu062' Labs: 1 result, Legacy: 1 result
- earliest_date:2014-01-01 Labs: 54 results, Legacy: 0 results
- exactauthor
- experiment:'CERN-LHC-ATLAS' Labs: 26 results, Legacy: 26 results
- firstauthor
- fulltext:'http://inspirehep.net/record/1291337/files/arXiv:1404.5102.pdf' Labs: 1 result, Legacy: 0 results
- journal:1086512 Labs: 2 results, Legacy: 2 results
- journal:'06B110' Labs: 1 result, Legacy: 1 result
- journal:'9' Labs: 8 results, Legacy: 186 results
- journal:'PTEP' Labs: 2 results, Legacy: 2 results
- journal:'2014' Labs: 48 results, Legacy: 48 results
- journal:'C15-09-07.5' Labs: 1 result, Legacy: 1 result
- journal:'Class. Quantum Grav. 15 (1998) 2153-2164' Labs: 1 result, Legacy: 1 result
- journal:2014 Labs: 48 results, Legacy: 48 results, !!!!CAUSES EXCEPTION TO OTHER FIELDS
- journal:'Erratum' Labs: 3 results, Legacy: 3 results
- journalpage:'06B110' Labs: 0 results, Legacy: 1 result
- keyword:'anisotropy' Labs: 7 results, Legacy: 7 results
- keyword:'Papers' Labs: 1 result, Legacy: 1 result
- note:'24 pages, 5 figures' Labs: 1 result, Legacy: 1 result
- postalcode: Missing
- rank: Missing
- rawref: Missing
- recid:'1291337' Labs: 1 result, Legacy: 1 result
- reference:'10.1016/j.astropartphys.2009.11.005' Labs: 1 result, Legacy: 1 result
- reference:'arXiv:1403.3985' Labs: 8 results, Legacy: 8 results
- reference:'Phys.Rev.,D69,083524' Labs: 1 result, Legacy: 1 result
- refersto:462477 Labs: 5 results, Legacy: 0 results
- region: Missing
- reportnumber:'CERN-PH-EP-2015-208' Labs: 1 result, Legacy: 1 result
- reportnumber:'arXiv:1404.5102' Labs: 1 result, Legacy: 1 result
- subject:'astro-ph.CO' Labs: 12 results, Legacy: 12 results
- texkey:'Nishizawa:2014vga' Labs: 1 result, Legacy: 1 result
- texkey:'DA15-kp44g' Labs: 1 result, Legacy: 1 result
- title:'Integrated Sachs Wolfe Effect and Rees Sciama Effect' Labs: 1 result, Legacy: 1 result
- title:'five-dimensional Anti-de Sitter space' Labs: 1 result, Legacy: 1 result
- title:'Solutions to Problems' Labs: 1 result, Legacy: 1 result
- year:'2015' Labs: 163 results, Legacy: 163 results
- year:2014-04-20 Labs: Exception, Legacy: 1 result, !!!CAUSES EXCEPTION WHEN PUBINFO IS PRESENT.
- year:'2014' Labs: 76 results, Legacy: 76 results

ioannistsanaktsidis commented 8 years ago

I will keep updating the comment above with the searches I am testing and the results I get from Labs and Legacy. Ideally, in the end both should return the same results. Cheers.
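One way to keep track of the remaining differences is to record each search with its two counts and list only the ones that disagree. The sketch below does that for a handful of the searches reported above; the `find_mismatches` helper and the tuple layout are illustrative assumptions, and the counts are copied by hand rather than fetched from either system.

```python
# (query, labs_count, legacy_count), copied by hand from the results above.
RESULTS = [
    ("datecreated:'2006-03-22'", 1, 0),
    ("doi:'10.1093/ptep/ptu062'", 1, 1),
    ("journal:'9'", 8, 186),
    ("journalpage:'06B110'", 0, 1),
    ("year:'2014'", 76, 76),
]


def find_mismatches(results):
    """Return the searches whose Labs and Legacy counts differ."""
    return [
        (query, labs, legacy)
        for query, labs, legacy in results
        if labs != legacy
    ]


for query, labs, legacy in find_mismatches(RESULTS):
    print("{}  Labs: {}  Legacy: {}".format(query, labs, legacy))
```

Searches that raise an exception or hit a missing field would need a separate status column, since they have no count to compare.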

ioannistsanaktsidis commented 8 years ago

As discussed with @Panos512, to test this correctly I should rebase invenio-search and invenio-parser. Is that right? cc @jmartinm @kaplun

kaplun commented 8 years ago

@ioannistsanaktsidis you should take the OPS version from our inspirehep private repositories. So basically you would fork inspirehep/invenio-query-parser and inspirehep/invenio-search and run `pip install -e .` for each of them in your local environment. This should allow you to edit them directly.

ioannistsanaktsidis commented 8 years ago

@kaplun thanks, although I do not need to edit them, I just need to check what results are returned ;)