Princeton-CDH / geniza

version 4.x of the Princeton Geniza Project
https://geniza.princeton.edu
Apache License 2.0

As a front-end user, I want to use fields in my keyword searches so I can make my searches more specific and targeted. #494

Closed rlskoeser closed 2 years ago

rlskoeser commented 2 years ago

Enable field searching like we have on PPA based on the fields we already have indexed.

testing notes

In the keyword search box on the public site, you can use fields to search within specific content, using syntax like field:text or field:"exact phrase". These can be combined with other field searches or with booleans like AND and NOT (the default is OR).
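
To make the syntax above concrete, here is a minimal sketch of how a query string breaks down into fielded and unfielded clauses. It is only an illustration: the function name `parse_query` and the sample tag are invented, and the site itself hands the query to Solr rather than parsing it this way.

```python
import re

# Matches field:"exact phrase", field:term, a bare "quoted phrase", or a bare term.
# The field prefix is optional; boolean operators come back as plain terms.
TOKEN_RE = re.compile(
    r'(?:(?P<field>\w+):)?(?:"(?P<phrase>[^"]*)"|(?P<term>\S+))'
)


def parse_query(query):
    """Split a keyword query into (field, text, is_phrase) tuples.

    `field` is None for keywords that are not tied to a field.
    Hypothetical helper for illustration only.
    """
    clauses = []
    for match in TOKEN_RE.finditer(query):
        field = match.group("field")
        if match.group("phrase") is not None:
            clauses.append((field, match.group("phrase"), True))
        else:
            clauses.append((field, match.group("term"), False))
    return clauses


# Sample query; the tag value is invented for the example.
for clause in parse_query('pgpid:1259 NOT tag:"marriage contract" letter'):
    print(clause)
```

Note that there is no space between the colon and the value in either fielded clause.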

these are the fields that are currently available:

Note: the same logic should also work in the admin document search.

revisions after testing:

rlskoeser commented 2 years ago

question: we use plurals on tags and old pgpids because it makes sense to us in the code (since they can be multiple), but would singular make more sense for someone searching?

richmanrachel commented 2 years ago

@rlskoeser - I just searched "pgpid: 1259" and while the first result is indeed that document, there were two other results pulled that had a date of 1259 CE in the description. Is this expected behavior?

rlskoeser commented 2 years ago

@richmanrachel you need to remove the space between the colon and the value you want to search on in that field
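
One possible reading of the result above is that, with the space in place, the value is treated as an ordinary keyword and so matches "1259" anywhere, including descriptions. The thread does not confirm how the parser handles this case. As a rough sketch, a check like the following could flag a field name left dangling before a space. The helper name `dangling_fields` is hypothetical, and it does not try to skip colons inside quoted phrases.

```python
import re

# A field-like word, a colon, then whitespace, e.g. "pgpid: 1259".
DANGLING_FIELD_RE = re.compile(r"\b(\w+):\s+")


def dangling_fields(query):
    """Return field names that are followed by whitespace instead of a value."""
    return DANGLING_FIELD_RE.findall(query)


print(dangling_fields("pgpid: 1259"))  # space after the colon is flagged
print(dangling_fields("pgpid:1259"))   # nothing to flag
```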

richmanrachel commented 2 years ago

@rlskoeser - great! That field works now. I'll test the rest after lunch :)

richmanrachel commented 2 years ago

Also, @rlskoeser, I forget - isn't the first line of the transcription always supposed to show in the search results? It's not currently, unless the search term is in the transcription.

rlskoeser commented 2 years ago

@richmanrachel thanks for the careful testing & feedback.

As usual, I get myself into trouble with these "extra" features that I throw in, because there's always more involved than I think in getting things fully working! I suggest that we treat fields that aren't working well and aren't needed immediately as not officially supported for now. We can revisit them later on, and should probably consider them in tandem with the search filters that Gissoo has designed, which we decided were not MVP.

field specific responses:

Questions:

Thanks for flagging the missing transcription, you're right that it should always show if there is one. I'll check on that.

richmanrachel commented 2 years ago

@rlskoeser - great, thanks!

Do you expect that terms used in advanced search fields should not be used to highlight keywords in context in transcription & description?

  • Given the state of our tags, it probably does make sense to keep the highlighted keywords from elsewhere.

Do you prefer old_pgpid and tag to old_pgpids and tags ?

  • Yes, singular is much more intuitive since you're only searching for one at a time.
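
A minimal sketch of how singular names could be accepted while the plural names stay in the code: rewrite the field prefix before the query is sent on. The mapping and the function `expand_aliases` are assumptions for illustration, not the project's implementation, and this simple version would also rewrite a matching prefix inside a quoted phrase.

```python
import re

# Hypothetical mapping from the singular names a searcher types
# to the plural names used in the code.
FIELD_ALIASES = {
    "tag": "tags",
    "old_pgpid": "old_pgpids",
}

# A field name at the start of a word, followed directly by its value.
FIELD_PREFIX_RE = re.compile(r"(?<!\w)(\w+):(?=\S)")


def expand_aliases(query):
    """Replace known singular field prefixes with their plural equivalents."""

    def swap(match):
        name = match.group(1)
        # Unknown prefixes are left exactly as typed.
        return FIELD_ALIASES.get(name, name) + ":"

    return FIELD_PREFIX_RE.sub(swap, query)


print(expand_aliases("tag:state old_pgpid:1259"))
```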

richmanrachel commented 2 years ago

As for tags not working, it just seems odd that the first result isn't the only document tagged with this title: [image]

And for Marina, she was searching tags:"state" and it only comes up with descriptions containing the word, rather than the tags as well.

rlskoeser commented 2 years ago

Thanks for the examples of problematic tag searches, will investigate!

rlskoeser commented 2 years ago

@richmanrachel I have made some changes based on your first round testing feedback, please test the following:

I couldn't duplicate the problems you were having with tags; I wonder if the Solr index was out of date when you were testing before, although I'm not sure if that would cause the problems you identified.

One thing that you should be aware of, which does apply to the search on tag:state, is that at some point we made a change so that we do not display all the tags on the search results page. I checked the logic: we are alphabetizing and then showing the first five. (Unfortunately, we don't even display an indicator that there are more tags than the ones being shown! We should at least do that.) In the case of the tag "state", there are documents with numerous tags where state comes later in the alphabet, so it doesn't show up on the list view. But for every one I looked at, when I clicked into the document details it did have that tag.
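
The effect described above can be reproduced in a few lines. This is a sketch under assumptions: the function name `displayed_tags` and the tag list are invented, and plain `sorted` stands in for whatever alphabetizing the site actually does.

```python
def displayed_tags(tags, limit=5):
    """Alphabetize the tags and keep only the first `limit` of them."""
    return sorted(tags)[:limit]


# Invented tag list for one document that is tagged "state".
tags = ["state", "letter", "court", "debt", "merchant", "india", "taxes"]
shown = displayed_tags(tags)

print(shown)
print("state" in tags, "state" in shown)  # tagged, but not displayed
```

Because "state" sorts after five other tags here, the document matches the tag search without the tag appearing in the list view.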

This makes me think we may need to revisit limiting the tags on the search page... it seems pretty problematic that you can search by tag and not see the tag when it does actually match.

richmanrachel commented 2 years ago

@rlskoeser - everything works!

But yes, I agree we need to revisit the limited tags. Is there a way to make it work like the transcription text, so that the first 5 display automatically but if a search term includes part of the text, the search results show that part of the transcription?

rlskoeser commented 2 years ago

@richmanrachel I'm not sure of an easy way to display tags that match the search, but will think about it. Is 5 enough by default or should we increase? I think we could add something like (+ N more tags) when we don't display all.
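
A rough sketch of the "(+ N more tags)" idea, assuming the display keeps the alphabetize-and-limit behavior. The function `tag_summary`, its default limit, and the sample tags are illustrative only.

```python
def tag_summary(tags, limit=5):
    """Join the first `limit` alphabetized tags and note how many are hidden."""
    ordered = sorted(tags)
    shown = ordered[:limit]
    hidden = len(ordered) - len(shown)
    label = ", ".join(shown)
    if hidden > 0:
        noun = "tag" if hidden == 1 else "tags"
        label += f" (+ {hidden} more {noun})"
    return label


# Invented tag list for the example.
print(tag_summary(["state", "letter", "court", "debt", "merchant", "india", "taxes"]))
print(tag_summary(["letter", "court"]))
```

Keeping the limit as a parameter leaves the question of whether five is enough open to later adjustment.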

richmanrachel commented 2 years ago

@rlskoeser - it's hard to say what a good number is, because some docs have so many and others have none. But I'm okay with a +N more tags idea!