Enhance case name search relevancy

legaltextai commented 3 weeks ago

Searching for "Google v Oracle" shows Oracle v Google first. Ideally, it should show Google v Oracle, with the Supreme Court decision first, then the appeal, then the rest organized by date. There are other examples with famous cases.

Should we assign more weight to the case name field? To the phrase? Should we recognize when a user is searching for a case name (if there is a "v" or "v.")?

legaltextai commented 3 weeks ago

Interesting: even if I filter further by case name, it still does not show Google v Oracle in first place.

mlissner commented 3 weeks ago

We do recognize when there's a v. or an In re in the query and boost based on case name. Maybe we need to crank that boost higher.

I don't know that we can boost based on phrases, but @albertisfu might.

Two other relevancy enhancements we have planned are:

albertisfu commented 3 weeks ago

> We do recognize when there's a v. or an In re in the query and boost based on case name. Maybe we need to crank that boost higher.

Yeah, this is correct. Currently, we boost the caseName field to 50 if there is a "v", "v.", "vs.", or "vs" within the query, or if it starts with "in re ", "matter of ", or "ex parte ".
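To make the rule above concrete, here is a minimal sketch of that kind of detection. It is not the actual CourtListener code: the function names, the regular expression, and the default boost of 1 are assumptions for illustration. Only the trigger tokens, the prefixes, and the value 50 come from this thread.

```python
import re

# Assumed constants: the thread only states the boost value of 50.
CASE_NAME_BOOST = 50.0
DEFAULT_BOOST = 1.0

# Matches "v", "v.", "vs" or "vs." as a standalone whitespace-delimited token.
_VERSUS_RE = re.compile(r"(?<!\S)(?:v|vs)\.?(?!\S)", re.IGNORECASE)

# Prefixes that mark a case-name query when they start the query.
_PREFIXES = ("in re ", "matter of ", "ex parte ")


def looks_like_case_name(query: str) -> bool:
    """Return True if the query resembles a case name (hypothetical helper)."""
    normalized = query.strip().lower()
    if normalized.startswith(_PREFIXES):
        return True
    return _VERSUS_RE.search(normalized) is not None


def case_name_boost(query: str) -> float:
    """Pick the caseName boost for a query (hypothetical helper)."""
    return CASE_NAME_BOOST if looks_like_case_name(query) else DEFAULT_BOOST
```

With this sketch, `case_name_boost("Google v Oracle")` selects the higher boost, while a query such as `"privacy law"` keeps the default, since the "v" inside "privacy" is not a standalone token.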

After discussing this issue with @legaltextai, we think we could try to increase the boost on caseName and/or increase the query_string phrase component as well. This is because currently, matches in fields other than caseName are also influencing the scores.
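One way to picture the two knobs mentioned above is an Elasticsearch `bool` query that combines a boosted `query_string` clause with a separate `match_phrase` clause on `caseName`. This is only a sketch under assumptions: the `text` field, the function name, and the phrase boost of 100 are placeholders, not the project's real query or tuned values.

```python
def build_case_name_query(
    text: str,
    case_name_boost: float = 50.0,  # value mentioned in the thread
    phrase_boost: float = 100.0,    # placeholder, not a tuned value
) -> dict:
    """Build an Elasticsearch query body fragment (hypothetical helper)."""
    return {
        "bool": {
            "should": [
                {
                    # Term-level matching across fields, with caseName boosted.
                    "query_string": {
                        "query": text,
                        # "text" is an assumed name for the non-caseName field.
                        "fields": [f"caseName^{case_name_boost}", "text"],
                    }
                },
                {
                    # Extra score when the query appears in caseName
                    # as a phrase, in the same word order.
                    "match_phrase": {
                        "caseName": {"query": text, "boost": phrase_boost}
                    }
                },
            ],
            "minimum_should_match": 1,
        }
    }
```

Because `match_phrase` is order-sensitive, a clause like this would favor "Google v Oracle" over "Oracle v Google" for the same terms, which is the behavior requested in this issue. The right boost values would still need to be found by testing against real data.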

This will require some testing within the production cluster so we can determine the best tuning for the search parameters. Perhaps this should also wait for https://github.com/freelawproject/infrastructure/issues/144 ?

mlissner commented 3 weeks ago

Yes, I'd suggest waiting. It'll be easier to do once y'all have read-only access.

freelawproject / courtlistener

Enhance case name search relevancy #4366
