Closed mortenbouvet closed 1 year ago
Hi @mortenbouvet
Thanks for writing.
Looking at your code I have my suspicions. FuzzyMatch and WildcardMatch could easily give you more results (some of them unexpected or unwanted), especially with multiple terms, e.g. lett melk instead of merely lettmelk.
But for me to understand the reason for these results, it would be great if you could capture the JSON payload of the request made against the _search API endpoint. You should be able to accomplish this with Fiddler by filtering for URLs containing /_search. Alternatively, if you give me the index name and a timestamp, I can pull it from our logs.
Please give me some results you want to see and some that you see but don't expect.
Also, I think a one-directional synonym lett melk -> lettmelk would suffice. I assume it's not two terms in your data. Or do you want to get hits for melk and lett when you search for lettmelk as well?
Hello @dada81
Thanks for the quick reply.
Index: seasservicegrossisteneas_seas01mstr9z6a4prep
I have made multiple search requests, both with and without synonyms. Below is an overview of the timestamps and results.
I have also described the expected result for each search.
Without synonyms:
Here we expect the two different searches to return different results. This works as expected and returns a very accurate result.
• «lett melk»
  o Tue, 26 Jul 2022 07:37:10 GMT
  o 14 hits
• «lettmelk»
  o Tue, 26 Jul 2022 07:39:47 GMT
  o 21 hits
With synonym «lett melk» > «lettmelk» (unidirectional)
Here we expect the search for «lett melk» to return the results for «lett melk» AND «lettmelk», for a total of 35 hits (14 + 21). However, the term «lett melk» returns 271 hits. From what I can see, this is because the result includes products that contain the word «lett» OR «melk». Since we are using a unidirectional synonym, the term «lettmelk» returns the expected result.
• «lett melk»
  o Tue, 26 Jul 2022 07:42:23 GMT
  o 271 hits
  o Expected product: «MELK LETT 0,25L»
  o Unexpected product: «HAVREGRYN LETTK. 750G»
• «lettmelk»
  o Tue, 26 Jul 2022 07:43:06 GMT
  o 21 hits
With synonym «lett melk» <> «lettmelk» (bidirectional)
Here we see exactly the same results as with the unidirectional synonym. We expected both search terms to return the same result, so it seems the bidirectional synonym might not be working.
• «lett melk»
  o Tue, 26 Jul 2022 07:43:55 GMT
  o 271 hits
• «lettmelk»
  o Tue, 26 Jul 2022 07:44:28 GMT
  o 21 hits
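To make the expectation above concrete, here is a minimal sketch of how a unidirectional and a bidirectional synonym would be expected to expand a query. `SynonymRule`, `SynonymSketch` and `Expand` are hypothetical names made up for this illustration; they are not part of EPiServer.Find or the Toolbox, and the sketch does not claim to show how `UsingSynonymsImproved()` works internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of a synonym rule (not an EPiServer.Find type).
public sealed class SynonymRule
{
    public SynonymRule(string from, string to, bool bidirectional)
    {
        From = from;
        To = to;
        Bidirectional = bidirectional;
    }

    public string From { get; }
    public string To { get; }
    public bool Bidirectional { get; }
}

public static class SynonymSketch
{
    // Returns the phrases a query is expected to be searched as:
    // the query itself plus every synonym reachable through a matching rule.
    public static IReadOnlyList<string> Expand(string query, IEnumerable<SynonymRule> rules)
    {
        var phrases = new List<string> { query };
        foreach (var rule in rules)
        {
            if (string.Equals(rule.From, query, StringComparison.OrdinalIgnoreCase))
            {
                // Forward direction applies to both rule kinds.
                phrases.Add(rule.To);
            }
            else if (rule.Bidirectional &&
                     string.Equals(rule.To, query, StringComparison.OrdinalIgnoreCase))
            {
                // Reverse direction only applies to bidirectional rules.
                phrases.Add(rule.From);
            }
        }
        return phrases.Distinct(StringComparer.OrdinalIgnoreCase).ToList();
    }
}
```

Under this model, a unidirectional rule «lett melk» > «lettmelk» expands «lett melk» to both phrases and leaves «lettmelk» unchanged, while a bidirectional rule expands either query to both phrases. That is the behavior we expected: 35 hits for the expanded query, not 271.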
Thank you.
Looks like we're having some issues getting the full JSON for these search requests. They are currently truncated and not very useful. Until that is fixed, it would be great if you could capture these requests locally with Fiddler.
Hi again, is it safe to upload the Fiddler data to this public repository? My client is unsure about the sensitivity as it's production data. Is there any other way for us to share this data with you?
Hi Morten
You can send it directly to me if that works.
Kind regards, Daniel
Hello Daniel
We have provided request examples using fiddler as requested on https://github.com/episerver/EPiServer.Labs.Find.Toolbox/issues/7
While testing we discovered that this issue might not only be related to multi word synonyms, but also search terms that include a synonym. Example:
We added a synonym for "burger" <> "hamburger" (tested both as unidirectional and bidirectional). We then searched for "burger brød". Since we added hamburger as a synonym for burger, we expect the result to contain products matching (burger OR hamburger) AND brød. However, it seems that the result contains products matching burger OR hamburger OR brød. Some products we do not expect to see here are "HAMBURGERKRYDDER 650G" and "POLARBRØD HVETE 16PK 600G".
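The difference between the two interpretations can be sketched in a few lines. This is only a model of the behavior described above, not the query the library actually builds. `QueryGroupingSketch`, `MatchesGrouped` and `MatchesFlat` are hypothetical names, and the case-insensitive substring check is a simplified stand-in (an assumption) for the matching the index really performs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryGroupingSketch
{
    // Simplified stand-in for index matching: case-insensitive substring check.
    private static bool Contains(string name, string term) =>
        name.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0;

    // Expected: every query term must match, where a term counts as matched
    // if the term itself or any of its synonyms matches.
    // Each array in termGroups holds one query term plus its synonyms.
    public static bool MatchesGrouped(string name, IEnumerable<string[]> termGroups) =>
        termGroups.All(group => group.Any(term => Contains(name, term)));

    // Observed: terms and synonyms flattened into a single OR list.
    public static bool MatchesFlat(string name, IEnumerable<string[]> termGroups) =>
        termGroups.SelectMany(group => group).Any(term => Contains(name, term));
}
```

With the groups {burger, hamburger} and {brød}, `MatchesGrouped` rejects both "HAMBURGERKRYDDER 650G" (no brød) and "POLARBRØD HVETE 16PK 600G" (no burger or hamburger), while `MatchesFlat` accepts both, which is what we see in the actual result.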
/Morten
I believe I understand why this is happening, and I think I have a solution for it. I will try to hand you a package to test next week.
Hi,
Sounds great, thanks!
Kind regards, Morten Lensberg
Hi again,
Any news on the updated package?
Hi Morten,
Sorry for the lack of updates. It’s fixed but I need to do some more testing before I release it.
Ok, thanks for the update
Hi Morten
Please try 1.3.2 and let me know how it looks https://github.com/episerver/EPiServer.Labs.Find.Toolbox/blob/master/EPiServer.Labs.Find.Toolbox.1.3.2.nupkg
Thanks, will try!
Hello Daniel, Sorry for the late reply.
Version 1.3.2 works a lot better than before and produces more relevant hits when using multi-word synonyms. We do, however, see some minor differences in the results when we use synonyms compared to not using synonyms. We also see some minor differences when we use a bidirectional synonym and search for each term/phrase separately (these should in theory produce the same result?).
From the small sample of products I have tested it seems to somehow be related to the position of the word in our product names.
Example:
I have created a bidirectional synonym between “lett melk” and “lettmelk”.
If I search for “lett melk” I get the following result
If I search for “lettmelk” I get the following result
It also looks like there is an issue when the search term is part of the product name. Example:
We have a product named “tinemelk lett”
If we search for “lett melk” without adding synonyms, this product is returned. However, if we add the synonyms from the previous example, this product is no longer returned.
In conclusion, this update is a major improvement for us. The issues we have discovered so far are very minor, and we should probably work on updating the product data on our end to follow a more consistent naming convention, as this will most likely solve the issues we are having.
Not sure if this is a bug/feature or us not understanding how UsingSynonymsImproved() is supposed to work.
Describe the bug/Issue
Searches with multi-word synonyms produce an unexpected result when using UsingSynonymsImproved(). Usually the search returns too many hits, including hits that are not relevant, compared to manually searching for the two phrases/terms without adding synonyms.
Our search code

    query = query.For(searchTerm, q => { q.Query = searchTerm; })
        .WithAndAsDefaultOperator()
        .InField(x => x.ItemName)
        .AndInField(x => x.Code)
        .AndInField(x => x.MainCategoryName)
        .AndInField(x => x.SearchWords)
        .AndInField(x => x.PaidSearchWords)
        .AndInField(x => x.ItemPc1VendorsitemId)
        .AndInField(x => x.ItemTradeMarkName)
        .MinimumShouldMatch("2")
        .UsingSynonymsImproved(TimeSpan.Zero)
        .UsingRelevanceImproved(x => x.ItemName)
        .FuzzyMatch(x => x.ItemName)
        .WildcardMatch(x => x.ItemName);
Actual vs Expected Behavior
A typical example would be searches for the phrases «lett melk» and «lettmelk». Searching these two terms individually will return 30 products (13 + 17). If we add a bidirectional synonym using the phrases «lett melk» and «lettmelk», we expect the result to return all 30 products regardless of which term is used. However, searching for the term «lett melk» returns 187 products and the term «lettmelk» returns 17 products.
Using the standard UsingSynonyms() somewhat resolves this issue. However, the standard UsingSynonyms() is unreliable with multi-word phrases/synonyms and will sometimes cause the search to return an empty result.
Additional information:
Episerver.Find 13.4.8.0
Episerver.Labs.Find.Toolbox 1.3.1