adsabs / montysolr

Solr for Astrophysics Data System
https://ui.adsabs.harvard.edu

Tokenize root of arxiv_class #114

Open aaccomazzi opened 5 years ago

aaccomazzi commented 5 years ago

We currently have the field arxiv_class, which contains the classification of a paper provided by arXiv, typically in the form category.SC (where SC is an abbreviation for the subcategory). For instance, astro-ph.SR indicates astrophysics (astro-ph), Solar and Stellar (SR).

Right now we index each of these as a single token, so a simple query such as arxiv_class:"astro-ph" does not find them. I suggest we instead index each value as two tokens, the root category and the full classification:

astro-ph
astro-ph.SR
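
A minimal sketch of that expansion, in Python for illustration only. The helper name expand_arxiv_class is hypothetical and not part of montysolr; the real change would presumably live in the Solr schema or the indexing pipeline. It assumes the root category is everything before the first period.

```python
from typing import List


def expand_arxiv_class(value: str) -> List[str]:
    """Return the tokens to index for one arxiv_class value.

    Hypothetical helper: emits the root category followed by the full
    classification, e.g. "astro-ph.SR" -> ["astro-ph", "astro-ph.SR"].
    """
    value = value.strip()
    if not value:
        return []
    # Assumption: the root category is the text before the first period.
    root, sep, _subcategory = value.partition(".")
    if not sep or not root:
        # No subcategory (or a malformed value): index the value unchanged.
        return [value]
    return [root, value]


if __name__ == "__main__":
    for raw in ("astro-ph.SR", "hep-th"):
        print(raw, "->", expand_arxiv_class(raw))
```

With both tokens indexed, a query on the root alone (arxiv_class:"astro-ph") and a query on the full classification (arxiv_class:"astro-ph.SR") would each match the same paper.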

JCRPaquin commented 9 months ago

We can change this to a hierarchical field, which would require some pipeline changes.