Open shanaqui opened 6 years ago
Thanks for raising that. Is this a problem caused by the new "Load More" button? I've started having the same problem. I have a private challenge in my solo party which I use to add a bunch of weekly To-Dos so I don't have to recreate them from scratch each time. It's called "Dewines's weekly household tasks". I used to just go to Discover Challenges and put "Dewines" in the search box and it would come up. Now I get pages of "Daily focus finder" challenges and have to click Load More a couple of times before the one I want appears.
The current search is no longer useful. For example, you can search for an exact phrase from the title of a challenge and it can take numerous clicks of "Load More" for the challenge to appear. Many users won't realise they need to do this and so will never find the challenges they're looking for.
I think we need an API route that searches Challenge titles and returns all of the results at once, or at least more than 10 - it's agonising to step through them 10 at a time, especially with the expanded format that makes it hard to scan the titles rapidly.
I'll leave this as suggestion-discussion for a couple of days and then if there's no objection, I'll add an edit to the top post to describe that as the desired fix.
I don't think we need a new API route. The solution should be to modify the current query so that it focuses more on titles. We can try removing the description from it, for example. But whoever works on this should look into MongoDB's query and search options.
Note: I just searched for a challenge (Read the World April) with an exact phrase from the title ("read the world"). It took 11-12 "load more"s to find it, and returned things where literally the only matching text in the challenge was the word "the" and, possibly, "already", if it was matching "read" because it's in "already".
I'd be interested to see what removing "the" (and other common words) and partial matches (e.g. already/read) would do to help.
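To make the already/read point concrete, here is a minimal sketch of the difference between substring matching and whole-word matching with stop words removed. It assumes the current search behaves roughly like a substring match, which is only a guess from the results above. The function names and the tiny stop-word list are made up for illustration and are not taken from the Habitica code.

```javascript
// Illustration only: hypothetical helpers, not Habitica's actual search code.
// A tiny sample stop-word list (an assumption, not MongoDB's real list).
const STOP_WORDS = new Set(["the", "a", "an", "and", "of", "to", "in"]);

// Split text into lowercase word tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Loose matching: any search word found anywhere as a substring counts.
function substringMatch(search, text) {
  const haystack = text.toLowerCase();
  return tokenize(search).some((word) => haystack.includes(word));
}

// Stricter matching: drop stop words, then require a whole-word match.
function wholeWordMatch(search, text) {
  const words = new Set(tokenize(text));
  return tokenize(search)
    .filter((word) => !STOP_WORDS.has(word))
    .some((word) => words.has(word));
}

const search = "read the world";
const unrelated = "I already finished the chores";
const wanted = "Read the World April";

console.log(substringMatch(search, unrelated)); // true ("the", and "read" inside "already")
console.log(wholeWordMatch(search, unrelated)); // false
console.log(wholeWordMatch(search, wanted)); // true
```

With the stricter version, the unrelated text no longer matches on "the" or on "read" inside "already", while the challenge title still does.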
Removing stop words could help too, yea.
But if we assume they are searching on titles and remove the description from the query, I think that would solve the problem too.
I've been experimenting on my local install with mongodb's built-in text indexes, which automatically use stop words (ignore "the", etc) and stemming (search for "dogs" and find "dog"), and which allows sorting the results by score. Weights can be added to the index if more than one field is indexed. Based on my tests on a limited set of challenges, it seems promising and easy to implement.
Documentation here: https://docs.mongodb.com/manual/core/index-text/ (note that's for 3.6; I couldn't easily find docs for 3.4 but that page doesn't indicate any differences between 3.4 and 3.6). A simple unofficial tutorial is here: https://code.tutsplus.com/tutorials/full-text-search-in-mongodb--cms-24835
To allow searching on challenges' names and summaries, we'd create an index like this:
db.challenges.createIndex({name:"text",summary:"text"},{"weights":{name:3,summary:1}})
which would produce this:
{
  "v" : 2,
  "key" : {
    "_fts" : "text",
    "_ftsx" : 1
  },
  "name" : "name_text_summary_text",
  "ns" : "habitrpg.challenges",
  "weights" : {
    "name" : 3,
    "summary" : 1
  },
  "default_language" : "english",
  "language_override" : "language",
  "textIndexVersion" : 3
}
That weights the challenge names three times more than the summaries. We could play around with that number. My feeling in general is that searching on both with names highly weighted will produce the best results.
Search queries are like this:
db.challenges.find({$text: {$search: "keywords here"}}, {score: {$meta: "textScore"}}).sort({score:{$meta:"textScore"}})
If we wanted to implement a phrase search, we'd do that like this:
db.challenges.find({$text: {$search: "\"my phrase goes here\""}}, {score: {$meta: "textScore"}}).sort({score:{$meta:"textScore"}})
but I don't think a phrase search would be necessary because the scoring of keywords would allow exact-phrase matches to be near the top of the results anyway (confirmed in my local install tests), and my opinion (not based on any investigation) is that most searches are more likely to be for keywords than for phrases.
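As a rough sketch of how the two query shapes above could be produced from one place, the helper below builds the filter, projection and sort for either a keyword search or a phrase search. `buildTextSearch` and its `phrase` option are hypothetical names for illustration; only the `$text`, `$search` and `$meta: "textScore"` parts come from MongoDB itself.

```javascript
// Hypothetical helper: builds the arguments for a $text search.
// Only $text, $search and $meta: "textScore" are real MongoDB operators;
// the function name and the `phrase` option are assumptions.
function buildTextSearch(terms, { phrase = false } = {}) {
  const trimmed = terms.trim();
  // Wrapping the terms in double quotes asks MongoDB for a phrase match.
  // Quotes already in the input are removed first so they cannot
  // start or end the phrase unexpectedly.
  // In keyword mode the input is passed through unchanged.
  const search = phrase ? `"${trimmed.replace(/"/g, "")}"` : trimmed;
  return {
    filter: { $text: { $search: search } },
    projection: { score: { $meta: "textScore" } },
    sort: { score: { $meta: "textScore" } },
  };
}

const keywordQuery = buildTextSearch("keywords here");
const phraseQuery = buildTextSearch("my phrase goes here", { phrase: true });

console.log(JSON.stringify(keywordQuery.filter));
console.log(JSON.stringify(phraseQuery.filter));

// In the mongo shell the pieces would be used like this:
// db.challenges.find(q.filter, q.projection).sort(q.sort)
```

Keeping keyword mode as the default matches the guess above that most searches are for keywords rather than phrases.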
I think it's worth looking into this further. Perhaps we can set up a text index on the beta site and push a code change there? That would let us do real-life searches on the beta database which is an old-ish version of production and would give us an idea of whether it's working. In fact, we could simply add the index to the beta database and then use direct mongodb commands from a local script to do some initial testing, with no need to push any code changes to the beta website. If there are no objections from the staff in a few days, I'll create the index on the beta database and continue my testing there. @paglias @TheHollidayInn @SabreCat
If this works for challenges, it could also be used for guilds: https://github.com/HabitRPG/habitica/issues/9755
That looks good to me @Alys, the only thing is that I remember we're already using a text index but maybe not the $text query? cc @TheHollidayInn
Ah yes, in the prod database, we have this text index for groups (no text index for challenges):
{
  "v" : 1,
  "key" : {
    "_fts" : "text",
    "_ftsx" : 1
  },
  "name" : "name_text_description_text_summary_text",
  "ns" : "habitica.groups",
  "background" : true,
  "weights" : {
    "description" : 1,
    "name" : 1,
    "summary" : 1
  },
  "default_language" : "english",
  "language_override" : "language",
  "textIndexVersion" : 3
}
That groups index wasn't on beta but I've now done these two commands on the beta database (took just a few seconds each to run btw):
db.challenges.createIndex({name:"text",summary:"text"},{"weights":{name:3,summary:1}})
db.groups.createIndex({name:"text",summary:"text"},{"weights":{name:3,summary:1}})
Doing command-line searches on the beta database and comparing the results to searches in the beta website shows that using these indexes is a great improvement (e.g., for groups, a command-line search for "report a bug" actually puts the Report a Bug guild at the top of the results), so I think we should try to incorporate a $text search into the guild and challenge filter tool, instead of the current keyword search.
That looks good to me, are you going to create a PR with the query changes? cc @TheHollidayInn
I'm hoping to have time to play around with the code in the next day or two, but I'm not certain enough to claim the issue yet. :) It should stay marked as help wanted in case someone beats me to it.
So this would be my proposed change if I did this. I'm still not necessarily claiming this issue, just getting it a step closer. :) Does this seem right?
Replace these lines:
https://github.com/HabitRPG/habitica/blob/d4d668f640afea0b1fb09bf2a3c1cc6425e49e76/website/server/controllers/api-v3/challenges.js#L384-L389
with code based on this:
{$text: {$search: "search terms"}}, {score: {$meta: "textScore"}}
(still using $and to keep the other filter options, of course)
And replace this:
https://github.com/HabitRPG/habitica/blob/d4d668f640afea0b1fb09bf2a3c1cc6425e49e76/website/server/controllers/api-v3/challenges.js#L398
with: .sort({score:{$meta:"textScore"}}) if and only if req.query.search is defined (otherwise leave the sort as it is now).
And replace this code:
https://github.com/HabitRPG/habitica/blob/d4d668f640afea0b1fb09bf2a3c1cc6425e49e76/website/client/components/challenges/myChallenges.vue#L132-L135
with a call to api.getUserChallenges
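A minimal sketch of the server-side part of that proposal, assuming the existing filter options are available as a list of query clauses. `buildChallengeQuery`, `otherFilters` and `defaultSort` are placeholder names, and the example filter and sort values are made up; none of them are taken from the actual controller.

```javascript
// Hypothetical sketch: `otherFilters` and `defaultSort` stand in for whatever
// the controller already uses; they are not taken from the Habitica code.
function buildChallengeQuery(otherFilters, search, defaultSort) {
  const clauses = [...otherFilters];
  const hasSearch = typeof search === "string" && search.trim() !== "";

  if (hasSearch) {
    // Add the text search alongside the existing filter clauses.
    clauses.push({ $text: { $search: search.trim() } });
  }

  return {
    // $and must not be given an empty array, so fall back to {} (match all).
    filter: clauses.length > 0 ? { $and: clauses } : {},
    // Only ask for and sort by the text score when a search was supplied.
    projection: hasSearch ? { score: { $meta: "textScore" } } : {},
    sort: hasSearch ? { score: { $meta: "textScore" } } : defaultSort,
  };
}

// Made-up example values for illustration.
const exampleFilters = [{ official: true }];
const exampleDefaultSort = { createdAt: -1 };

console.log(JSON.stringify(buildChallengeQuery(exampleFilters, "read the world", exampleDefaultSort)));
console.log(JSON.stringify(buildChallengeQuery(exampleFilters, undefined, exampleDefaultSort)));
```

When no search is supplied, the query and sort come out the same as they would without the text search, which is the "leave the sort as it is now" case.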
yeah that seems the right path
Um... I have an admission to make. I have just realised today that the reason the Daily Focus Finder challenges come up when I search on my name is that, unbeknownst to me, @blakejones99 had mentioned my name in the description. That doesn't explain Shanaqui's problem, but I'm sorry if you've been on a wild goose chase over me not being able to bring up my challenge right away when I searched on "Dewines".
@Dewines no worries but thanks for letting us know. :) When I was testing, I was using other challenges so that didn't make any difference. The search definitely can be improved.
Should these changes be applied to groups search as well?
@paglias Yes. This is the issue for it: https://github.com/HabitRPG/habitica/issues/9755
I'm going to take a look at this, per the suggestion from @paglias here. I'll shout if I get stuck. :)
@bigsee Thanks! I've marked it as in progress for you
Quick update: just got back from holidays. I'm still on this. Will likely be spending some more time this weekend.
@bigsee Hi! Still planning on working on this one? No hurry, we just check in to make sure that issues don't go stale. Please let us know within a week if you'd like to keep working on this, or we'll put it back in the queue (but you can always pick it up again in future). :)
Same here: since I didn't get a reply and I can't spot any activity, I'm putting this back as help wanted again, but if you'd like to pick it up again, just let us know, @bigsee!
hey @shanaqui (and team) - thank you for the nudge and really sorry about this. I've been underwater for a few weeks and started a new job. Just getting to clear emails now and this notification was perfectly timed.
Sorry to mess you folks around. Please consider me out of action for the time being but I'll definitely be back to help out once things settle down... 🙏🏽
From Report a Bug:
I get nothing if I enter "pom party" with quotes, and if I do e.g. "books" I get challenges where "books" isn't even included anywhere in the description or even tasks.