Open arjunrajlab opened 1 year ago
This seems to come from MongoDB. When I open a MongoDB playground and do:
use('girder');
db.getCollection('folder').find({ $text: { $search: "squar" } });
I get a result with "square" in its name.
But when I replace "squar" with "squa", I don't get a result...
Oh interesting. We can just leave it, not a big deal.
I can maybe improve the UI?
Sure!
After more investigation, it is not because of a minimum number of characters, but because of the way text searches work. Mongo uses text indexes as explained here. Do you think there is a good way to change the UI?
I see! I think it should just say "Need more letters to match" if we are under the limit instead of "no results match query", which is confusing. Also, it might be good, if the "text match" option is selected, to have some dummy text in the search field that says Type at least 4 characters just to prompt the user.
The issue is not the number of letters.
For example, if I have a folder named "Many annotations", the search doesn't show anything for "annotat" but it finds the folder when I type "annot", "annotate", "annotation" or "annotations"
Another example: to find "square", I can type "squared" or "squares", but to find "new", I can't search "news"
That is why the language in the settings (default_language) is important, as explained in the link I sent.
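To make the stemming behavior concrete, here is a toy sketch in Python. It does not call MongoDB. The STEMS table is hand-written to be consistent with the results reported in this thread, so the exact stems are assumptions rather than output from Mongo's stemmer. The point is only that a text search compares word stems, not substrings.

```python
# Toy model of stem-based matching. STEMS is hand-written (an assumption),
# chosen to be consistent with the search results reported in this thread.
STEMS = {
    "annotations": "annot",
    "annotation": "annot",
    "annotate": "annot",
    "annot": "annot",
    "annotat": "annotat",  # not reduced any further, so it matches nothing
    "square": "squar",
    "squared": "squar",
    "squares": "squar",
    "squar": "squar",
    "squa": "squa",
    "new": "new",
    "news": "news",
}


def stem(word: str) -> str:
    """Look up the assumed stem; words missing from the table stay unchanged."""
    word = word.lower()
    return STEMS.get(word, word)


def text_match(query: str, name: str) -> bool:
    """True if the query's stem equals the stem of any word in the name."""
    name_stems = {stem(w) for w in name.split()}
    return stem(query) in name_stems
```

With this table, `text_match("annot", "Many annotations")` is true and `text_match("annotat", "Many annotations")` is false, which mirrors what the search box does.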
Oh sorry, I didn't read that carefully! Hmm. I don't know what to make of that. The stems that we would be searching for may not be English at all, and will often be multiple words put together. Perhaps the best thing is to just drop the full text search for now? I don't know of a good way to explain this sort of strange behavior to the user. It seems strange to me that there is no substring search in Mongo, but whatever.
It seems you can do it with regex, but perhaps it is inefficient for very large collections? I don't think that would matter too much for us here, though.
https://stackoverflow.com/questions/10242501/how-to-find-a-substring-in-a-field-in-mongodb
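A minimal sketch of what the regex approach could look like from Python, assuming we only need a case-insensitive substring match on a single field. The function name `substring_filter` is hypothetical. It only builds the filter document, so it runs without a database.

```python
import re


def substring_filter(term: str, field: str = "name") -> dict:
    """Build a MongoDB filter matching `term` anywhere in `field`, ignoring case."""
    # re.escape keeps characters such as "." or "(" from being read as regex
    # syntax; this assumes Python's escaping is acceptable to Mongo's regex engine.
    return {field: {"$regex": re.escape(term), "$options": "i"}}
```

The resulting dict could then be passed to a `find` call on the folder collection. An unanchored regex like this generally cannot use an index efficiently, which is the cost mentioned above.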
Girder has two search modes by default; "text" uses the mongo text search, "prefix" has to match the beginning of the name of the thing being searched. It is intended to be extensible, so we could always create a new girder search mode that would be an arbitrary substring match (e.g., in Mongo a {$regex:
I think let's just drop the text search for now unless it's a quick implementation. I think a regex would be fine given that we don't have a lot of items, but if it's going to be a fair amount of work to implement (sounds somewhat involved at least), then I would say let's drop it for the time being.
I don't think that it is a lot of work.
On the frontend, setting the SearchModeOptions is very straightforward (a simple Vue prop).
On the backend, adding the new option would look almost exactly the same as the current "prefix" mode (see prefixSearch in model_base.py and addSearchMode in search.py).
Cool, let's do it then!
Someone has requested this feature again :). I guess looking at the above PR we would need to make a plugin for this?
We can create a new plugin for this endpoint, but we can also add an endpoint to the existing plugin. This could be pretty resource-intensive, as Zach said in the closed PR:
it would require a full table scan
Can we first subset by the datasets in the folder that is being displayed to lower the computational cost? Or current folder and all subfolders? I think for any individual user, they are not expecting to search the entire database.
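One way this subsetting could look, as a sketch only: restrict the substring match to documents whose parent is one of a known set of folders (the current folder, optionally plus its subfolders). The function name is hypothetical, and `parentId` is assumed to be the field that links a folder to its parent. Collecting the subfolder ids is left out.

```python
import re


def scoped_substring_filter(term: str, folder_ids: list) -> dict:
    """Substring match on `name`, limited to children of the given folders."""
    return {
        # Assumption: `parentId` links a folder document to its parent folder.
        "parentId": {"$in": list(folder_ids)},
        "name": {"$regex": re.escape(term), "$options": "i"},
    }
```

The regex would still be evaluated against every candidate, but the candidates would be limited to the chosen folders rather than the whole collection.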
This is possible, but we should make it clear in the UI. If we only want to search the current folder, this is pretty easy, as we can filter the rows in the browser instead of querying the server and opening a pop-up for the results.
I think that would be great. Yes, I think in the UI, we could make it clear it's just a filter rather than a comprehensive search. I think mostly people are looking for a local filter. For the more comprehensive search, then probably the current functionality is just fine.
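The local filter logic is small. Here is a sketch in Python for illustration only, since the real version would live in the Vue frontend. It assumes each row is a dict with a "name" key, which is a guess at the row shape.

```python
def filter_rows(rows: list, term: str) -> list:
    """Keep rows whose "name" contains `term`, ignoring case.

    An empty or whitespace-only term keeps every row.
    """
    needle = term.strip().casefold()
    if not needle:
        return list(rows)
    return [row for row in rows if needle in row.get("name", "").casefold()]
```

Because this is a plain substring test, typing "annotat" would keep a row named "Many annotations", unlike the stem-based text search.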
Prefix search seems to work fine, but text search seems broken @bruyeret. See screenshot below. Somehow it is not finding the MS2 file that is in the directory (also visible in screenshot).
Not sure what happened, I had tested it before and it worked, but after the merge it seems to have stopped working.