sciencefair-land / sciencefair

The futuristic, fabulous and free desktop app for working with scientific literature :microscope: :book:
https://sciencefair-app.com
MIT License

Search for homo naledi fails (v. 1.0.3) #100

Closed pvanheus closed 7 years ago

pvanheus commented 7 years ago

With the eLife datasource enabled, search for "Homo naledi" fails.

It should return papers such as DOI 10.7554/eLife.10627.

As discussed on IRC, this might be related to the title having <italics>Homo naledi</italics> markup.

blahah commented 7 years ago

Thanks for reporting this. As discussed, I think the tokeniser we run before indexing the documents is failing to strip the XML tags. Should be a simple fix (famous last words!).

blahah commented 7 years ago

Fixed in the search engine by adding a pre-processing step to strip XML tags: https://github.com/blahah/yunodb/commit/15a706d5b8a833774e050a6fddf6e3a8c8b3cd63
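For readers following along, here is a rough sketch of why markup in a title can hide a term from search, and how a tag-stripping step before tokenising helps. It is written in Python purely for illustration and is not the code from the linked commit: the title string, `strip_tags`, and `tokenise` are all hypothetical, and the whitespace tokeniser is only an assumption about how such a failure could arise.

```python
import re

# Matches anything that looks like an XML/HTML tag, e.g. <italics> or </italics>.
# Limitation: a crude pattern like this would also eat text such as "a < b > c".
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(text):
    """Replace tag-like spans with a space so adjacent words stay separated."""
    return TAG_RE.sub(" ", text)


def tokenise(text):
    """Naive tokeniser (assumed): lowercase, split on whitespace, trim punctuation."""
    tokens = (tok.strip(".,;:!?()\"'") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


# Hypothetical title, using the markup mentioned earlier in this thread.
title = "New fossils of <italics>Homo naledi</italics> described"

raw_tokens = tokenise(title)                # tags stay glued to the words
clean_tokens = tokenise(strip_tags(title))  # tags removed before tokenising

print(raw_tokens)
print(clean_tokens)
```

Without the stripping step the index would hold tokens like `<italics>homo` and `naledi</italics>`, so a query for "homo naledi" has nothing to match. Documents indexed before such a fix would still carry the old tokens until they are re-indexed.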

pvanheus commented 7 years ago

Seems not to fix things:

[image attachment]

Is there an index that needs to be rebuilt, or a cache that needs to be emptied?

blahah commented 7 years ago

Ah, yes, you'll need to delete the search index:

rm -rf ~/.sciencefair

Note that this will nuke the whole thing.