apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0
2.73k stars 1.05k forks

MLT queries ignore custom term frequencies [LUCENE-8756] #9801

Open asfimport opened 5 years ago

asfimport commented 5 years ago

The MLT queries ignore any custom term frequencies for the like-texts and use a hard-coded frequency of 1 per occurrence. I have prepared a test case that demonstrates the issue, along with a proposed fix: https://github.com/ollik1/lucene-solr/commit/9dbbce2af26698cec1ac82a526d9cee60a880678
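To make the reported behaviour concrete, here is a minimal sketch in plain Java. It is not the MoreLikeThis code or the linked patch: `WeightedToken`, `TermFreqSketch`, and both method names are hypothetical stand-ins. In Lucene itself a custom frequency would normally be carried by `TermFrequencyAttribute`, which this sketch deliberately avoids so that it runs without any dependency.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one analyzed token; not a Lucene class.
final class WeightedToken {
    final String term;
    final int termFrequency; // custom frequency attached to this occurrence

    WeightedToken(String term, int termFrequency) {
        this.term = term;
        this.termFrequency = termFrequency;
    }
}

final class TermFreqSketch {
    // Behaviour described in the report: every occurrence counts as 1,
    // so the custom frequency on the token is ignored.
    static Map<String, Integer> countPerOccurrence(List<WeightedToken> tokens) {
        Map<String, Integer> freqs = new LinkedHashMap<>();
        for (WeightedToken t : tokens) {
            freqs.merge(t.term, 1, Integer::sum);
        }
        return freqs;
    }

    // Assumed shape of a fix: add the token's own frequency instead of 1.
    static Map<String, Integer> countWithCustomFrequency(List<WeightedToken> tokens) {
        Map<String, Integer> freqs = new LinkedHashMap<>();
        for (WeightedToken t : tokens) {
            freqs.merge(t.term, t.termFrequency, Integer::sum);
        }
        return freqs;
    }

    public static void main(String[] args) {
        List<WeightedToken> tokens = Arrays.asList(
            new WeightedToken("lucene", 5),
            new WeightedToken("search", 1),
            new WeightedToken("lucene", 2));
        System.out.println(countPerOccurrence(tokens));       // {lucene=2, search=1}
        System.out.println(countWithCustomFrequency(tokens)); // {lucene=7, search=1}
    }
}
```

With the hard-coded increment, "lucene" is weighted only by how many times it appears, so a like-text that encodes term importance through custom frequencies loses that signal when the interesting terms are selected.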


Migrated from LUCENE-8756 by Olli Kuonanoja, updated May 03 2019

asfimport commented 5 years ago

Olli Kuonanoja (migrated from JIRA)

Here is a PR for the issue https://github.com/apache/lucene-solr/pull/638

asfimport commented 5 years ago

Olli Kuonanoja (migrated from JIRA)

Related to #8905. @mikemccand, what is your take on this?

asfimport commented 5 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Ahh, thanks for the ping, Olli Kuonanoja. I agree we need to fix this; I'll have a look at the PR, thanks!

asfimport commented 5 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

The change looks good – I left a couple minor comments – kinda freaky how Jira now tracks and posts how long I spend looking at a GitHub PR ;)  Thanks Olli Kuonanoja.

asfimport commented 5 years ago

Olli Kuonanoja (migrated from JIRA)

Thank you @mikemccand, applied review fixes to the PR

asfimport commented 5 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Great, thanks Olli Kuonanoja – I'll push soon.

asfimport commented 5 years ago

ASF subversion and git services (migrated from JIRA)

Commit 4a76ad7263d8a112919fe007f19b71baafb169be in lucene-solr's branch refs/heads/master from Michael McCandless https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4a76ad7

LUCENE-8756: add CHANGES entry

asfimport commented 5 years ago

David Smiley (@dsmiley) (migrated from JIRA)

Some commits were not reported because the commit message didn't mention the issue: https://github.com/apache/lucene-solr/commit/351e21f6203e8f3aece0cd5adf4049974bd2d636

BTW that commit now fails "ant precommit", although it might not have at the time. Can you please fix this, @mikemccand?

[forbidden-apis] Forbidden method invocation: java.lang.String#format(java.lang.String,java.lang.Object[]) [Uses default locale]
[forbidden-apis]   in org.apache.lucene.queries.mlt.TestMoreLikeThis (TestMoreLikeThis.java:497)

As an aside, I don't know why some devs like to call String.format in cases when simple/obvious string concatenation is equivalent.
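For readers unfamiliar with the check, the two usual ways to satisfy it can be sketched as follows. The class and method names are hypothetical, and this is not the code from `TestMoreLikeThis`; it only shows the general pattern of either passing an explicit `Locale` to `String.format` or using plain concatenation.

```java
import java.util.Locale;

// Hypothetical helper; the names are for illustration only.
final class LocaleSafeFormatting {
    // Explicit locale: the output does not depend on the JVM's default locale.
    static String viaFormat(String field, int freq) {
        return String.format(Locale.ROOT, "%s:%d", field, freq);
    }

    // Plain concatenation: equivalent here, and no locale is involved at all.
    static String viaConcat(String field, int freq) {
        return field + ":" + freq;
    }

    public static void main(String[] args) {
        System.out.println(viaFormat("text", 3)); // text:3
        System.out.println(viaConcat("text", 3)); // text:3
    }
}
```

The two-argument-free overload `String.format(String, Object...)` is flagged because `%d` and similar conversions follow the default locale, which can make test output differ between machines.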

asfimport commented 5 years ago

ASF subversion and git services (migrated from JIRA)

Commit 6842676952f15ee98c2ff9ef41b443a7134fa1b9 in lucene-solr's branch refs/heads/master from Christine Poerschke https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6842676

LUCENE-8756: ant precommit (ant check-forbidden-apis) fix

asfimport commented 5 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Ugh, sorry!  Thank you @cpoerschke!

asfimport commented 5 years ago

ASF subversion and git services (migrated from JIRA)

Commit 1b15f6e037a9ba663df0df0abcdca8476def6ea5 in lucene-solr's branch refs/heads/branch_8x from Michael McCandless https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1b15f6e

LUCENE-8756: add CHANGES entry

asfimport commented 5 years ago

ASF subversion and git services (migrated from JIRA)

Commit c05501e5b279fad13f81279c341389ab7bebbff5 in lucene-solr's branch refs/heads/branch_8x from Michael McCandless https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c05501e

LUCENE-8756: fix precommit failure