gee-community / GEE-Dev-Docs

A collaborative platform for accessing and submitting Google Earth Engine tutorials.
https://gee-community.github.io/GEE-Dev-Docs

Search interface #3

Open jdbcode opened 5 years ago

jdbcode commented 5 years ago

The "Just the Docs" theme has an integrated search interface that does not require us to build an index, which is nice, but we have no say over what is included in the search (page title, content, and URL are all indexed). In some ways I think this is fine, since it makes the search very comprehensive; however, it could return too many results to be useful. As @gee-contrib has pointed out, maybe we could have it search only within a given section, or within a YAML field where a user assigns tags to the content. It looks like there might be ways to do this in the lunr.js engine, and "Just the Docs" provides one example of making a more specific search. For adding tags, though, we could use the "labels" UI component: labels are included in search results and look good too, giving an easy indicator of what a tutorial/example covers. I posted an example of using the tags here: https://gee-community.github.io/GEE-Dev-Docs/docs/methods/specific/methods-specific-topic2.html

gee-contrib commented 5 years ago

Search would be a major issue if this repo is to scale up. lunr.js does not support boolean operators, e.g. AND or OR, between search terms, and I think this would be quite problematic at some point. I dug around and found http://elasticlunr.com/, which is based on lunr.js, has nearly identical syntax (so forking just-the-docs and replacing lunr with elasticlunr may not be too hard...), and does have boolean operator functionality.

Ideally, we would like a 'smart search' that goes beyond words/terms and simple indexing, similar to sites such as StackExchange; e.g. try asking a new question at https://stats.stackexchange.com/questions/ask or in another sub-forum. I don't know how those are implemented... but having at least some AND operators would be good.

jdbcode commented 5 years ago

Thanks for thinking of this and looking into alternatives to lunr.js. It's looking like we need to fork Just-the-Docs so we can alter "includes" for adding a footer in a more systematic way as well.

guy1ziv2 commented 5 years ago

I managed to get the default.html layout replaced by simply copying the just-the-docs file and modifying it. To get a better search, we probably need to fork the theme...

guy1ziv2 commented 5 years ago

I am not sure anymore about lunr.js not having AND. I tried some searches and it seems to actually work...

guy1ziv2 commented 5 years ago

I think it supports a notation, e.g. '+test +methods', that requires both terms to be present.
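To make the '+term' idea concrete, here is a minimal, self-contained sketch of what such a notation means. It is not lunr's implementation and does not call lunr; `parseQuery` and `matches` are hypothetical helpers, and the rule "plain terms are optional, '+' terms are required, '-' terms are prohibited" is my reading of how lunr versions with term-presence support treat queries.

```javascript
// Illustrative only: mimics the meaning of "+term" / "-term" in a query.
// parseQuery and matches are hypothetical names, not part of lunr.
function parseQuery(query) {
  const required = [];
  const prohibited = [];
  const optional = [];
  for (const raw of query.trim().split(/\s+/).filter(Boolean)) {
    if (raw.startsWith("+")) required.push(raw.slice(1).toLowerCase());
    else if (raw.startsWith("-")) prohibited.push(raw.slice(1).toLowerCase());
    else optional.push(raw.toLowerCase());
  }
  return { required, prohibited, optional };
}

// docTerms: the words indexed for one page (e.g. from its title and tags).
function matches(docTerms, query) {
  const terms = new Set(docTerms.map((t) => t.toLowerCase()));
  const { required, prohibited, optional } = parseQuery(query);
  if (required.some((t) => !terms.has(t))) return false; // AND behaviour
  if (prohibited.some((t) => terms.has(t))) return false; // NOT behaviour
  // With no required terms, plain terms behave like OR: any one may match.
  if (required.length === 0 && optional.length > 0) {
    return optional.some((t) => terms.has(t));
  }
  return true;
}
```

With this reading, `matches(["test"], "test methods")` is true (OR), while `matches(["test"], "+test +methods")` is false because both terms are required.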

guy1ziv2 commented 5 years ago

I found that we can easily modify the search behaviour by overriding the .js file. To test this, the site now does NOT index the content of the pages. However, if the 'front matter' lists 'tags', the code now indexes those (along with the title, which I kept from the original behaviour). As an example, searching for 'cloud masking' finds the topic, but 'landsat' does not, because it is not among the tags (check the raw file to see the tags).

EDIT: the search by tags is not working properly, but after I spent too much time on it I gave up for now...
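For anyone following along, the idea of indexing only the title and tags (and not the page content) can be sketched roughly as below. This is not the actual override in the repo: the page object, its field names, and the `buildIndex` / `lookup` helpers are hypothetical, and a plain `Map` stands in for the real search index.

```javascript
// Hypothetical page records, as they might be assembled from front matter.
const pages = [
  {
    title: "Cloud masking",
    tags: ["example-tag"],
    content: "Body text that mentions landsat but is deliberately not indexed.",
    url: "/docs/example-page.html",
  },
];

// Build a term -> Set(url) lookup from the title and tags only.
function buildIndex(pageList) {
  const index = new Map();
  for (const page of pageList) {
    const text = [page.title, ...(page.tags || [])].join(" ");
    for (const term of text.toLowerCase().split(/\s+/).filter(Boolean)) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(page.url);
    }
  }
  return index;
}

// Return the URLs of pages whose title or tags contain the term.
function lookup(index, term) {
  return [...(index.get(term.toLowerCase()) || [])];
}

const index = buildIndex(pages);
```

Here `lookup(index, "masking")` returns the page URL, while `lookup(index, "landsat")` returns an empty array, because the word appears only in the unindexed content.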

guy1ziv2 commented 5 years ago

EDIT2: search by tags is working now. The search index is now based on title and 'tags' defined in Front Matter either as

tags: [tag1, tag2, tag3]

or

tags:

Note that tags / keywords that contain spaces are indexed as separate words; I could not find a workaround for this. In the example page https://raw.githubusercontent.com/gee-community/GEE-Dev-Docs/master/docs/methods/specific/methods-specific-topic2.md I used the tag 'cloud-masking' instead of 'cloud masking'.
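A small sketch of why the hyphen helps, assuming the indexer splits text on whitespace only. That is an assumption for illustration: some tokenizers also split on hyphens, in which case the hyphenated tag would still end up as two words. `tokenize` and `hyphenate` are hypothetical helpers, not functions from lunr or Just the Docs.

```javascript
// Assumed tokenizer: lower-cases and splits on whitespace only.
function tokenize(text) {
  return text.toLowerCase().split(/\s+/).filter(Boolean);
}

// Turn a multi-word tag into a single hyphenated token.
function hyphenate(tag) {
  return tag.trim().replace(/\s+/g, "-");
}

console.log(tokenize("cloud masking")); // [ 'cloud', 'masking' ]
console.log(tokenize(hyphenate("cloud masking"))); // [ 'cloud-masking' ]
```

The same normalisation could be applied to every tag before indexing, so that authors can write tags with spaces in the front matter.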

jdbcode commented 5 years ago

Well done, Guy!

slattery commented 5 years ago

Nice work with lunr... I have yet to play with it, but flexsearch looks promising too. I was going to use it for autocompletes on a static page elsewhere.