Language-Research-Technology / oni-ui

Oni Discovery Portal using Oni REST API
GNU General Public License v3.0

Differing results for Boolean operators #92

Open rosanna-smith opened 1 week ago

rosanna-smith commented 1 week ago

If this should go in auto-ldaca-data-portals let me know and I'll move it.

I'm putting together some training around advanced search in the portal, and noticed some differences in result count for boolean operators. I'm testing three types here:

1) two terms in a single search field without parentheses
2) two terms in a single search field with parentheses
3) two terms in separate search fields with the boolean operator

I'm including the searches and results I got from each below, together with the query output, which seems to be slightly different for each.

AND:

otway AND grampians: 0 results ( name.@value : otway AND grampians OR description.@value : otway AND grampians OR inLanguage.name.@value : otway AND grampians OR _text : otway AND grampians )

(otway AND grampians): 1 result ( name.@value : (otway AND grampians) OR description.@value : (otway AND grampians) OR inLanguage.name.@value : (otway AND grampians) OR _text : (otway AND grampians) )

Text 1: otway /AND/ Text 2: grampians: 1 result ( name.@value : otway OR description.@value : otway OR inLanguage.name.@value : otway OR _text : otway ) AND ( name.@value : grampians OR description.@value : grampians OR inLanguage.name.@value : grampians OR _text : grampians )

OR:

otway OR grampians: 9 results ( name.@value : otway OR grampians OR description.@value : otway OR grampians OR inLanguage.name.@value : otway OR grampians OR _text : otway OR grampians )

(otway OR grampians): 14 results ( name.@value : (otway OR grampians) OR description.@value : (otway OR grampians) OR inLanguage.name.@value : (otway OR grampians) OR _text : (otway OR grampians) )

Text 1: otway /OR/ Text 2: grampians: 14 results ( name.@value : otway OR description.@value : otway OR inLanguage.name.@value : otway OR _text : otway ) OR ( name.@value : grampians OR description.@value : grampians OR inLanguage.name.@value : grampians OR _text : grampians )

NOT:

otway NOT grampians: 9 results ( name.@value : otway NOT grampians OR description.@value : otway NOT grampians OR inLanguage.name.@value : otway NOT grampians OR _text : otway NOT grampians )

(otway NOT grampians): 8 results ( name.@value : (otway NOT grampians) OR description.@value : (otway NOT grampians) OR inLanguage.name.@value : (otway NOT grampians) OR _text : (otway NOT grampians) )

Text 1: otway /NOT/ Text 2: grampians: 8 results ( name.@value : otway OR description.@value : otway OR inLanguage.name.@value : otway OR _text : otway ) NOT ( name.@value : grampians OR description.@value : grampians OR inLanguage.name.@value : grampians OR _text : grampians )

Options 2 and 3 provide identical results, so it seems that it's the booleans without parentheses that are producing different counts. The help says to use parentheses when multiple operators are used, but it seems they're required for a single operator as well.
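
A likely explanation, going by the query output above: in Lucene-style query string syntax (which the OpenSearch `query_string` query uses), a field prefix applies only to the term directly after it. In `name.@value : otway AND grampians`, only `otway` would be scoped to `name.@value`, and `grampians` would fall back to the default field. The sketch below reproduces the string shape shown in the issue and shows how wrapping the input in parentheses changes it. It is a reconstruction for illustration only, not oni-ui's actual code; `build_field_query` and its `wrap` flag are hypothetical names.

```python
# Field list copied from the query output shown in the issue.
FIELDS = ["name.@value", "description.@value", "inLanguage.name.@value", "_text"]


def build_field_query(text, fields=FIELDS, wrap=False):
    """Repeat the search text once per field, joined with OR.

    With wrap=True the text is parenthesised first, so the field prefix
    covers the whole boolean expression instead of only its first term.
    """
    term = f"({text})" if wrap else text
    clauses = [f"{field} : {term}" for field in fields]
    return "( " + " OR ".join(clauses) + " )"


unwrapped = build_field_query("otway AND grampians")
wrapped = build_field_query("otway AND grampians", wrap=True)
print(unwrapped)  # same shape as the option 1 query output
print(wrapped)    # same shape as the option 2 query output
```

If this reading is right, always wrapping single-field input in parentheses before building the query would make option 1 behave like option 2.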

moisbo commented 1 week ago

Hi Rosanna -- Have a look at the query produced with the Show Query button and then try to reproduce it with the API. If the results differ then we should address it; otherwise I'd update the help page. I mean a difference in the results, because there might be a bug in the query generation, or that might just be how OpenSearch works.

  1. With Postman, create a query with the API
  2. Compare it to the query the portal produces
  3. Run the query in both
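
The same comparison could also be scripted rather than done by hand in Postman. The sketch below is only an outline under assumptions: it supposes the backend accepts an OpenSearch `query_string` request body, and `run_count` is a hypothetical callable that sends the body to whichever endpoint is being tested and returns the total hit count. Neither the endpoint nor the exact body shape used by the Oni API is confirmed in this thread.

```python
import json


def query_string_body(query):
    """Build a JSON body for an OpenSearch query_string search (assumed query type)."""
    return json.dumps({"query": {"query_string": {"query": query}}})


def compare_counts(queries, run_count):
    """Run each labelled query through run_count and return {label: hit count}.

    run_count is a hypothetical function: it takes the JSON body, posts it
    to the API or to OpenSearch, and returns the number of hits.
    """
    return {label: run_count(query_string_body(q)) for label, q in queries.items()}
```

Calling `compare_counts` once with a runner for the API and once with a runner for OpenSearch, using the strings from the Show Query button as input, would give two dictionaries of counts to compare side by side.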