Princeton-CDH / geniza

version 4.x of the Princeton Geniza Project
https://geniza.princeton.edu
Apache License 2.0

Full shelfmark search for multiple shelfmarks not working in admin #1498

Closed kseniaryzhova closed 6 months ago

kseniaryzhova commented 9 months ago

Describe the bug
Entering more than one shelfmark into the admin document search returns 0 results, even when both shelfmarks are confirmed to be in the database. This means we can't search by shelfmark to find the documents we need to join. Adding one shelfmark to the other document's description doesn't help either: the shelfmark search still pulls up only one result, not the document with the second shelfmark.

To reproduce
Steps to reproduce the behavior:

  1. Go to document search in admin
  2. Search for any two shelfmarks using AND or OR (e.g., T-S 10J16.13 OR T-S 10J16.14)
  3. See error.
  4. Please note that if you search for any of the shelfmarks alone (T-S 10J16.13 for example) it will pop up correctly.

Expected behavior
I should be able to enter more than one shelfmark at a time into the admin document search bar and receive the correct search results.

Device information
Malfunctioning on both PC and Mac devices.

blms commented 7 months ago

@kseniaryzhova After some testing, it seems this may be because the OR operator only applies to the two words immediately on either side of it. In other words:

T-S 10J16.13 OR T-S 10J16.14

gets parsed as

T-S (10J16.13 OR T-S) 10J16.14

and since the default operator is AND when unspecified, that fully evaluates to:

T-S AND (10J16.13 OR T-S) AND 10J16.14
= T-S AND (T-S) AND 10J16.14
= T-S AND 10J16.14

That's why you're getting unexpected results; all this is really saying is "search for all documents with T-S and 10J16.14".

In that case, the correctly formed query would be:

T-S (10J16.13 OR 10J16.14)

which does return the two expected documents, plus the one that mentions one of the shelfmarks in its description.
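The boolean reasoning above can be checked with a small toy model. This is only a sketch: it treats each document as a set of tokens and is not how the project's search backend works. The document names and tokens (`doc_a`, `8J5.1`, and so on) are made up for illustration.

```python
# Toy model: each document is the set of tokens in its indexed text.
# These documents are hypothetical, not real index data.
docs = {
    "doc_a": {"T-S", "10J16.13"},
    "doc_b": {"T-S", "10J16.14"},
    "doc_c": {"T-S", "8J5.1"},
}


def matching(predicate):
    """Return the names of all documents whose tokens satisfy the predicate."""
    return {name for name, tokens in docs.items() if predicate(tokens)}


def as_typed(t):
    # T-S AND (10J16.13 OR T-S) AND 10J16.14
    return "T-S" in t and ("10J16.13" in t or "T-S" in t) and "10J16.14" in t


def simplified(t):
    # T-S AND 10J16.14
    return "T-S" in t and "10J16.14" in t


def grouped(t):
    # T-S AND (10J16.13 OR 10J16.14)
    return "T-S" in t and ("10J16.13" in t or "10J16.14" in t)


print(sorted(matching(as_typed)))    # ['doc_b']
print(sorted(matching(simplified)))  # ['doc_b']
print(sorted(matching(grouped)))     # ['doc_a', 'doc_b']
```

In this model the query as typed and its simplified form select the same documents, while only the grouped form matches both shelfmarks.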

OK to close since that seems to bring up the expected results?

kseniaryzhova commented 6 months ago

@blms Not closing because this solution only works for two T-S shelfmarks, not for shelfmarks from more than one collection.

kseniaryzhova commented 6 months ago

And if you need both shelfmarks for a join and it's in different collections, then you still can't bring both together to merge them in admin.

blms commented 6 months ago

@kseniaryzhova I would write that query as (T-S 10J16.13) OR (BL OR 10656.16): https://geniza.princeton.edu/admin/corpus/document/?q=%28T-S+10J16.13%29+OR+%28BL+OR+10656.16%29

> And if you need both shelfmarks for a join and it's in different collections, then you still can't bring both together to merge them in admin.

Does the (shelfmark) OR (shelfmark) format cover that case? Or if not, can you give an example?
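For anyone building these queries for several shelfmarks at once, here is a minimal sketch of the (shelfmark) OR (shelfmark) format. The helper names `shelfmark_query` and `admin_search_url` are hypothetical and not part of the geniza codebase; the only thing taken from this thread is the admin URL and its `q` parameter.

```python
from urllib.parse import urlencode

# Admin document search URL as it appears in the link above.
ADMIN_SEARCH_URL = "https://geniza.princeton.edu/admin/corpus/document/"


def shelfmark_query(shelfmarks):
    """Wrap each shelfmark in parentheses and join the groups with OR."""
    return " OR ".join(f"({shelfmark})" for shelfmark in shelfmarks)


def admin_search_url(shelfmarks):
    """Build an admin search link for the combined query (hypothetical helper)."""
    return f"{ADMIN_SEARCH_URL}?{urlencode({'q': shelfmark_query(shelfmarks)})}"


print(shelfmark_query(["T-S 10J16.13", "BL OR 10656.16"]))
print(admin_search_url(["T-S 10J16.13", "BL OR 10656.16"]))
```

Parenthesizing every shelfmark keeps each OR scoped to whole shelfmarks rather than to the single words next to it.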

kseniaryzhova commented 6 months ago

@blms no it does cover it, ok! I'll let Marina know! Closing!