Closed gamburgm closed 4 years ago
PR should be ready now, just need to do some manual testing and check that backend search speed doesn't take a massive hit.
Looks great! Great work
Also - this can be the last PR to this repo! 🤯 Use github.com/sandboxnu/searchneu from here on out 🥳
Actually gonna move this guy to the sandboxnu repo. 😄
This change fixes multiple issues:

- The query bug where `cs2501` returns different results from `cs 2501`. The reason this was happening is that the `course_code` analyzer was actually working, but because the `name` field doesn't have the `word_delimiter` filter, it parses the course code as a single token. Turns out the `cs` token in the query actually gives `cs2501` a significant boost because it has the token `CS` in its title, which is the only reason it was showing up as first to begin with. Not good.
- Removes the hackiness of the function score. It was only working by a delicate balance anyway, with a 0.4 factor on non-primary courses and small changes in boosts for different fields to get the numbers just right. This change removes that entirely.
- Makes more intelligent queries altogether. If we know somebody's making a course code query, we should be doing two things:
  a. Only query against `subject` and `classId`.
  b. The following results should all be of the same subject rather than having the same `classId`. For example, if you search for `cs2500`, you should get Fundies 1 as the first result, then CS classes as the rest of your search results. Getting `ENGL2500` would be very weird.

P.S. A few last things are necessary before this goes out, plus some general tests to throw in as well.
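
To make the tokenization issue from the first point concrete, here is a rough sketch of how a field with and without a letter/digit split would tokenize the two queries. This is only an illustration: `splitLetterDigit`, `analyzeWithDelimiter`, and `analyzeWithoutDelimiter` are hypothetical helpers that approximate the behavior described above, not Elasticsearch's actual `word_delimiter` implementation.

```typescript
// Simplified stand-in for a word_delimiter-style split (an assumption, not the real filter):
// break a token into maximal runs of letters and runs of digits.
function splitLetterDigit(token: string): string[] {
  return token.toLowerCase().match(/[a-z]+|[0-9]+/g) || [];
}

// Hypothetical analyzer WITH the split, roughly what the course_code analyzer is described as doing.
function analyzeWithDelimiter(text: string): string[] {
  const tokens: string[] = [];
  for (const word of text.split(/\s+/)) {
    if (word.length === 0) continue;
    for (const part of splitLetterDigit(word)) tokens.push(part);
  }
  return tokens;
}

// Hypothetical analyzer WITHOUT the split, roughly how the name field is described as behaving:
// whitespace-separated words are kept whole, so "cs2501" stays a single token.
function analyzeWithoutDelimiter(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
}

console.log(analyzeWithDelimiter("cs2501"));     // same tokens as "cs 2501"
console.log(analyzeWithoutDelimiter("cs2501"));  // one token
console.log(analyzeWithoutDelimiter("cs 2501")); // two tokens, including a bare "cs"
```

Under these assumptions, only the un-split field treats `cs2501` and `cs 2501` differently, which matches the mismatch described in the first point.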
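
The course code handling from the last point could look something like the sketch below. It is not the code in this PR: `parseCourseCode` and `buildCourseCodeQuery` are hypothetical names, the subject pattern (2 to 5 letters followed by 4 digits) is an assumption, and the flat `subject` and `classId` field paths are assumed to be exact-match fields. The idea is to filter on the subject so every result shares it, and use a `should` clause on `classId` so the exact course scores highest.

```typescript
interface CourseCode {
  subject: string;
  classId: string;
}

// Detect queries like "cs2500" or "cs 2500".
// Assumption: a subject is 2-5 letters and a class ID is exactly 4 digits.
function parseCourseCode(query: string): CourseCode | null {
  const match = /^\s*([a-zA-Z]{2,5})\s*(\d{4})\s*$/.exec(query);
  if (match === null) return null;
  return { subject: match[1].toUpperCase(), classId: match[2] };
}

// Build an Elasticsearch bool query body that touches only subject and classId.
// - filter: restricts results to the same subject (no scoring contribution).
// - should: optional here because a filter clause is present, so it only
//   boosts the course whose classId matches exactly.
function buildCourseCodeQuery(code: CourseCode): Record<string, unknown> {
  return {
    bool: {
      filter: [{ term: { subject: code.subject } }],
      should: [{ term: { classId: code.classId } }],
    },
  };
}

const parsed = parseCourseCode("cs2500");
if (parsed !== null) {
  console.log(JSON.stringify(buildCourseCodeQuery(parsed), null, 2));
}
```

With this shape, a search for `cs2500` can never return `ENGL2500`, since the subject filter excludes it outright rather than relying on boosts to push it down.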