Closed: rlskoeser closed this 2 years ago
question: we use plurals on tags and old pgpids because it makes sense to us in the code (since they can be multiple), but would singular make more sense for someone searching?
@rlskoeser - I just searched "pgpid: 1259" and while the first result is indeed that document, there were two other results pulled that had a date of 1259 CE in the description. Is this expected behavior?
@richmanrachel you need to remove the space between the colon and the value you want to search on in that field
@rlskoeser - great! That field works now. I'll test the rest after lunch :)
pgpid: Works
old_pgpids: Doesn't work as advanced search. I used the example Alan thankfully captured in Slack a while ago (old: 6786, new: 11317), and the string "old_pgpid:6786" had no results. Luckily, it does come up when you just search 6786, though.
shelfmark: Works when you add parentheses for precise term.
collection: does not work. (I tried ENA, JTS, British_Library, Penn... which type of string am I supposed to use? I tried to do multiple kinds but none worked).
description: works
transcription: works
tags: doesn't work
language_code: unsure. I could test for Hebrew ("language_code:he") and Arabic "...:ar" but how do I look for Judaeo-Arabic? What are the codes for all of our non-standard languages? Just a heads up that it's pulling in the descriptions too, for example here with ar:
input_year: works
Also, @rlskoeser, I forget - isn't the first line of the transcription always supposed to show in the search results? It's not currently, unless the search term is in the transcription.
@richmanrachel thanks for the careful testing & feedback.
As usual, I get myself into trouble with these "extra" features that I throw in, because there's always more involved than I think in getting things fully working! I suggest that we treat fields that aren't working well and aren't needed immediately as not officially supported for now. We can revisit them later on, and should probably consider them in tandem with the search filters that Gissoo has designed, which we decided were not MVP.
Field-specific responses:
- JTS, BL, CUL, T-S, AIU. It seems like this one isn't useful right now, but it's also not needed since it overlaps with shelfmark search. Suggest we drop this one.
- tags:"bill of sale" or tags:slave
- Languages + Script model, which is viewable & editable in admin, and I populated it for the languages & scripts that are associated with documents that have transcriptions. Judeo-Arabic code is jrb. However, I think this one probably isn't very useful right now and I suggest we drop it from the list of supported advanced search fields.
Questions:
- Do you prefer old_pgpid and tag to old_pgpids and tags?
Thanks for flagging the missing transcription; you're right that it should always show if there is one. I'll check on that.
@rlskoeser - great, thanks!
Do you expect that terms used in advanced search fields should not be used to highlight keywords in context in transcription & description?
- Given the state of our tags, it probably does make sense to keep the highlighted keywords from elsewhere.
Do you prefer old_pgpid and tag to old_pgpids and tags?
- Yes, singular is much more intuitive since you're only searching for one at a time.
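One way to offer the singular names to searchers while keeping the plural names in the index is to rewrite the field prefix before the query is sent to the search backend. The following is only a sketch under that assumption; the helper name `expand_field_aliases` and the mapping are hypothetical and are not taken from the project code.

```python
import re

# Hypothetical mapping from user-facing singular names to the plural
# field names discussed in this thread.
FIELD_ALIASES = {
    "old_pgpid": "old_pgpids",
    "tag": "tags",
}

# Match an alias at a word boundary, immediately followed by a colon,
# so that "tags:" and "old_pgpids:" are left untouched.
_ALIAS_RE = re.compile(r"\b(%s):" % "|".join(map(re.escape, FIELD_ALIASES)))


def expand_field_aliases(query: str) -> str:
    """Replace singular field prefixes with their plural equivalents.

    Caveat: this simple version does not parse the query, so an alias
    followed by a colon inside a quoted phrase would also be rewritten.
    """
    return _ALIAS_RE.sub(lambda m: FIELD_ALIASES[m.group(1)] + ":", query)
```

For example, `expand_field_aliases("old_pgpid:6786")` gives `old_pgpids:6786`, while a query that already uses `tags:` passes through unchanged.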
As for tags not working, it just seems odd that the first result isn't the only document tagged with this title: And for Marina, she was searching tags:"state" and it only comes up with descriptions with the word, rather than the tags as well.
Thanks for the examples of problematic tag searches, will investigate!
@richmanrachel I have made some changes based on your first round testing feedback, please test the following:
I couldn't duplicate the problems you were having with tags; I wonder if the Solr index was out of date when you were testing before, although I'm not sure if that would cause the problems you identified.
One thing that you should be aware of, which does apply to the search on tag:state, is that at some point we made a change so that we do not display all the tags on the search results page. I checked the logic: we alphabetize the tags and then show the first five. Unfortunately, we don't even display an indicator that there are more tags than the ones being shown; we should at least do that. In the case of the tag "state", there are documents with numerous tags where "state" falls later in the alphabet, so it doesn't show up in the list view. But for every one I looked at, when I clicked into the document details it did have that tag.
This makes me think we may need to revisit limiting the tags on the search page... it seems pretty problematic that you can search by tag and not see the tag when it does actually match.
@rlskoeser - everything works!
But yes, I agree we need to revisit the limited tags. Is there a way to make it work like the transcription text, so that the first 5 display automatically but if a search term includes part of the text, the search results show that part of the transcription?
@richmanrachel I'm not sure of an easy way to display tags that match the search, but will think about it. Is 5 enough by default or should we increase? I think we could add something like (+ N more tags) when we don't display all.
@rlskoeser - it's hard to say what a good number is, because some docs have so many and others have none. But I'm okay with a +N more tags idea!
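The tag display idea discussed above (alphabetize, show the first five, and add a "+ N more tags" indicator) could be sketched roughly as follows. This is an illustration only: the function name `tags_for_display` and the optional `search_terms` argument, which moves tags that exactly match a search term to the front, are assumptions and not the site's actual logic.

```python
def tags_for_display(tags, limit=5, search_terms=()):
    """Return (shown_tags, label) for a search result.

    Tags are sorted alphabetically (case-insensitive). Any tag that
    exactly matches one of `search_terms` is sorted to the front so it
    is not hidden by the limit. `label` is an empty string when no
    tags are hidden.
    """
    terms = {term.lower() for term in search_terms}
    # Sort key: matching tags first (False sorts before True), then A-Z.
    ordered = sorted(tags, key=lambda tag: (tag.lower() not in terms, tag.lower()))
    shown = ordered[:limit]
    hidden = len(ordered) - len(shown)
    if hidden:
        noun = "tag" if hidden == 1 else "tags"
        label = f"(+ {hidden} more {noun})"
    else:
        label = ""
    return shown, label
```

With seven tags and the default limit, the label would read `(+ 2 more tags)`; passing `search_terms=["state"]` puts "state" first in the shown list even though it would otherwise sort after the cutoff.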
Enable field searching like we have on PPA based on the fields we already have indexed.
testing notes
In the keyword search box on the public site, you will be able to use fields to search within specific content, using syntax like field:text or field:"exact phrase". These can be combined with other search fields or booleans like AND, NOT (default is OR).
These are the fields that are currently available:
Note: the same logic should also work in the admin document search.
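As a small illustration of the syntax described in these notes, the sketch below builds a keyword search string from field/value pairs, with no space after the colon and quotation marks around multi-word values. The helper name `field_query` is hypothetical; it only formats the string typed into the search box and does not reflect how the site handles the query internally.

```python
def field_query(field: str, text: str) -> str:
    """Format one field-restricted search term.

    Multi-word values are wrapped in double quotes so they are treated
    as an exact phrase; there is no space between the colon and the value.
    """
    text = text.strip()
    already_quoted = text.startswith('"') and text.endswith('"')
    if " " in text and not already_quoted:
        text = f'"{text}"'
    return f"{field}:{text}"


# Combine two field searches with an explicit boolean operator.
query = " AND ".join([
    field_query("tags", "bill of sale"),
    field_query("tags", "slave"),
])
```

Here `query` is the string `tags:"bill of sale" AND tags:slave`, which uses the two tag examples from the discussion above.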
revisions after testing: