Kitware / UPennContrast

UPenn ?
https://upenn-contrast.netlify.com/
Apache License 2.0

Problem with new file search bar #501

Open arjunrajlab opened 11 months ago

arjunrajlab commented 11 months ago

Prefix search seems to work fine, but text search seems broken @bruyeret. See screenshot below. Somehow it is not finding the MS2 file that is in the directory (also visible in screenshot).

Not sure what happened, I had tested it before and it worked, but after the merge it seems to have stopped working.

image
bruyeret commented 11 months ago

This seems to come from MongoDB. When I open a MongoDB playground and do:

use('girder');
db.getCollection('folder').find({ $text: { $search: "squar" } });

I get a result with "square" in its name.

But when I replace "squar" with "squa", I don't get a result...

arjunrajlab commented 11 months ago

Oh interesting. We can just leave it, not a big deal.

bruyeret commented 11 months ago

I can maybe improve the UI?

arjunrajlab commented 11 months ago

Sure!

bruyeret commented 10 months ago

After more investigation, it is not because of a minimum number of characters, but because of the way text searches work. Mongo uses text indexes as explained here. Do you think there is a good way to change the UI?

arjunrajlab commented 10 months ago

I see! I think it should just say "Need more letters to match" if we are under the limit instead of "no results match query", which is confusing. Also, if the "text match" option is selected, it might be good to have some placeholder text in the search field that says "Type at least 4 characters", just to prompt the user.

bruyeret commented 10 months ago

The issue is not the number of letters. For example, if I have a folder named "Many annotations", the search doesn't show anything for "annotat", but it finds the folder when I type "annot", "annotate", "annotation" or "annotations". Another example: to find "square", I can type "squared" or "squares", but to find "new", I can't search "news". That is why the language in the settings (default_language) is important, as explained in the link I sent.

arjunrajlab commented 10 months ago

Oh sorry, I didn't read that carefully! Hmm. I don't know what to make of that. The stems that we would be searching for may not be English at all, and will often be multiple words put together. Perhaps the best thing is to just drop the full text search for now? I don't know of a good way to explain this sort of strange behavior to the user. It seems strange to me that there is no substring search in Mongo, but whatever.

arjunrajlab commented 10 months ago

It seems you can do it with regex, but perhaps it is inefficient for very large collections? I don't think that would matter too much for us here, though.

https://stackoverflow.com/questions/10242501/how-to-find-a-substring-in-a-field-in-mongodb

manthey commented 10 months ago

Girder has two search modes by default: "text" uses the Mongo text search, and "prefix" has to match the beginning of the name of the thing being searched. It is intended to be extensible, so we could always create a new Girder search mode that would be an arbitrary substring match (e.g., in Mongo, a {$regex: } query on the name), though it wouldn't be as fast for huge collections.
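
For reference, here is a minimal sketch of what the filter for such a substring match could look like. It is only an illustration, not Girder code: the helper names `substring_filter` and `matches_locally` are hypothetical, and the case-insensitive default is an assumption. The second helper just emulates the match with Python's `re` module so the behavior can be checked without a database.

```python
import re


def substring_filter(text, field="name", ignore_case=True):
    """Build a MongoDB filter matching documents whose `field` contains `text`.

    Hypothetical helper for illustration; not part of Girder.
    """
    # Escape the user input so characters such as "." or "(" are matched
    # literally instead of being interpreted as regex syntax.
    condition = {"$regex": re.escape(text)}
    if ignore_case:
        # Assumption: file and folder searches should ignore case.
        condition["$options"] = "i"
    return {field: condition}


def matches_locally(filter_doc, document):
    """Emulate the filter on a plain dict with Python's `re` (no database)."""
    ((field, condition),) = filter_doc.items()
    flags = re.IGNORECASE if "i" in condition.get("$options", "") else 0
    return re.search(condition["$regex"], document.get(field, ""), flags) is not None


# The partial words that the text search misses are found as plain substrings.
print(matches_locally(substring_filter("annotat"), {"name": "Many annotations"}))
print(matches_locally(substring_filter("squa"), {"name": "square"}))
```

With pymongo, the resulting dict could be passed directly as the filter argument of `find` on a collection. Because the pattern is not anchored to the start of the name, Mongo generally cannot use an index efficiently for it, which is the performance caveat mentioned above.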

arjunrajlab commented 10 months ago

I think let's just drop the text search for now unless it's a quick implementation. I think a regex would be fine given that we don't have a lot of items, but if it's going to be a fair amount of work to implement (sounds somewhat involved at least), then I would say let's drop it for the time being.

bruyeret commented 10 months ago

I don't think that it is a lot of work. On the frontend, setting the SearchModeOptions is very straightforward (a simple Vue prop). On the backend, adding the new option would look almost exactly the same as the current "prefix" mode (see prefixSearch in model_base.py and addSearchMode in search.py).

arjunrajlab commented 10 months ago

Cool, let's do it then!