FredHutch / wiki

SciWiki: Collective KnowledgeBase for Scientific Data and Use
https://sciwiki.fredhutch.org
Other
36 stars 46 forks source link

lunr search index concatenates words on either side of a newline #121

Closed dtenenba closed 6 years ago

dtenenba commented 6 years ago

URL of page(s) of concern Specify the URLs of the page(s) where there are technical problems.

All pages (problem is with search functionality)

This is really just a placeholder for me to come back to this issue since I don't have time to fix it right now.

Describe problem

If you have some html like this:

<p>
frumious
Bandersnatch
</p>

jekyll will put the string frumiousBandersnatch in the search index.

So then if you try and search for bandersnatch it will not be found.

If you start typing frumiousBandersnatch you will get a result.

For an actual example on our wiki, try searching for atotally. This will bring up the Docker page, but that string does not occur there. You will find a totally and if you view source you'll see those two words are separated by a newline.

This is not our issue to fix (unless we want to contribute a fix upstream), but we need to figure out which Ruby package is generating the index so that we can report the bug or see if anyone else has reported it. Haven't been able to find it yet.

bmcgough commented 6 years ago

The theme is building the lunr_store.js file using Liquid. It uses strip_newlines and replace along with strip_html to render the page excerpt into simple text. It is strip_newlines that removes /r?/n from the string. Two options. Liquid has also newline_to_br that replaces \n with <br />\n (note that it does nothing with \r). So we could modify the Liquid the theme uses to do newline_to_br and then replace <br /> with [space] and then strip_newlines.

I will look at making an issue upstream to see if we can get this fixed. This is clearly unintended and broken.

dtenenba commented 6 years ago

Thanks! If you make an issue upstream please link to it here....

bmcgough commented 6 years ago

https://github.com/mmistakes/minimal-mistakes/issues/1883

bmcgough commented 6 years ago

My patch for this was merged into master upstream (minimal-mistakes). Once we rebuild on the master branch of that, our search should be fixed. Does anyone know if that happens automatically, or if we pin a remote theme version somehow?

vortexing commented 6 years ago

I did not note a remote version, but changed/overwrote the minimum amount to avoid mistakes (partially why I like the theme, b/c of the name alone). So I believe once the site is successfully rebuilt AFTER they merged into the master, then we should be good.

bmcgough commented 6 years ago

The updated file is /assets/js/lunr/lunr-store.js. I did not see this overwritten in our repo.

dtenenba commented 6 years ago

This appears to be fixed now. Searching for atotally no longer works; searching for totally brings up many pages including the Docker page; and atotally no longer appears in the index.

vortexing commented 6 years ago

Who used the word totally????? (thanks BTW)

dtenenba commented 6 years ago

Sam, apparently. https://github.com/FredHutch/wiki/commit/dd74e27e7fd244dc11edf4ddec1d43eb7be94caf

vortexing commented 6 years ago

That's what that "blame" button should be for. Totally. ;)