UCSCLibrary / ucsc-library-digital-collections

A rails app based on Hyrax to be used as a repository for UCSC library digital collections.

Unexpected results when searching for known aerials images #684

Open rmjaffe opened 1 year ago

rmjaffe commented 1 year ago

Describe the bug

This past Monday a patron submitted a query wanting to access two aerial photographs and reported that they were unable to locate them in the DAMS. In the attempt to fulfill the patron's request, Kerry and Rachel also failed to retrieve the sheets when searching (by title and filename) and browsing for them. Angelika was ultimately able to find them, but a couple of things in particular about this experience are concerning:

  1. Browsing with the goal of retrieving a known sheet is hindered in that the sheets within a given flight do not display in order.
  2. When applying facets (in this case, searching within a collection, i.e. the flight), search results did not yield the known sheets.

There could definitely be an element of user error in this, so will be curious to know what y'all find.

This is the conversation that was had on LibAnswers:

Fwd: UCSC Library Aerial Photos Back Online

Hi, All, ~

Forwarding a message that came into my email about aerials to see if you can check something for me. I am also not finding (in the DAMS) the images referenced below but the spreadsheets I have from DI and from the DAMS inventory indicate that the flight was in CDM and uploaded to the DAMS. Is it possible some images are hidden/a box not unchecked? Or, also, very possible, my search is lacking.

thanks for your help,

Kerry

---------- Forwarded message --------- From: --- Date: Sat, Sep 9, 2023 at 4:29 PM Subject: RE: UCSC Library Aerial Photos Back Online To:---

Hi Kerry,

I’m looking for photos 4R 145 and 146 in the 1956-C in the online air photo collection and cannot locate them. Index 29 indicates they exist.

I tried emailing the links in the email below and the addresses are no longer valid?

Can you help me locate these photos?

Thank you,

-- Message was forwarded to LibAnswers by "Kerry Scott" Internal Note from Rachel Jaffe (Sep 11 2023, 12:45PM):

Hi Kerry,

I'm looking into it now. Will let you know what I find.

Thanks,

Rachel

-- Emailed to: "Kerry Scott" Internal Note from Rachel Jaffe (Sep 11 2023, 01:02PM):

Alright -- I think what's at issue here is the DAMS' subpar search functionality. Those two images are two among hundreds of results yielded by a keyword search on the image titles but do not appear among four results presented when searching by filename. Huh? Can confirm that those images are included on the ingest spreadsheet:

1956c_abg_04r-145.tif 1956c_abg_04r-146.tif

If the patron is interested specifically in these two sheets, I think the easiest way to provide access would be to copy and share these two files from diginit2. If this makes sense to you both, Angelika, would you mind doing this for the patron?

Will also ask Sneha if there's an easier/more accurate way on her end to find known files. I don't see how these two could not be in the DAMS, but have no way to double check other than exporting or browsing.

Thanks,

Rachel

-- Emailed to: --- Internal Note from Angelika Frebert (Sep 11 2023, 02:25PM):

Sure, if Sneha doesn't know a way to find them in the DAMS, I could deliver the files via Google Drive.

Let me know.

-- Emailed to: "Rachel Jaffe" Internal Note from Kerry Scott (Sep 11 2023, 02:52PM):

So weird, okay, yes, this makes sense since the files are somewhere in there, just not discoverable. Thanks, Angelika for your support in pulling the files. I will follow up with the patron to let him know that there is a search retrieval glitch and he should expect a link to the files to come through Google while we ascertain the issue. Sound okay?

-- Emailed to: "Rachel Jaffe" Internal Note from Rachel Jaffe (Sep 11 2023, 02:56PM):

Sounds good to me. Thank you both :)

-- Emailed to: "Angelika Frebert" Internal Note from Kerry Scott (Sep 11 2023, 03:00PM):

This is the email I sent the patron from my email address.

Dear ---,

Thanks for letting us know that the files were not showing up and I am sorry the emails did not work for you (corrected email: ---).

There is a glitch (your email brought it to our attention) with the search results that we are now looking into. In the interim, you should expect to see a link to a Google Drive file with the files you need from the flight.

Best,

Kerry Scott

-- Emailed to: "Angelika Frebert" Internal Note from Angelika Frebert (Sep 11 2023, 03:12PM):

I'll deliver per Google Drive, but I wanted to let you know that I tried a search myself and found the images straight away.

(Sorry I wasn't fast enough to change course with the patron, but I'm at SCA reference desk this afternoon which delayed looking into it)

I was able to retrieve the files straight away by putting the file name into the search field like this:

https://digitalcollections.library.ucsc.edu/catalog?utf8=%E2%9C%93&search_field=all_fields&q=1956c_abg_04r-145.tif

URL for 145:

https://digitalcollections.library.ucsc.edu/concern/works/r781wm23q?locale=en

URL for 146:

https://digitalcollections.library.ucsc.edu/concern/works/k643b513n?locale=en

-- Emailed to: --- Internal Note from Kerry Scott (Sep 11 2023, 03:14PM):

Okay, huh, thanks for finding them that way. No need to do the G-drive piece. Let's send him the links from the DAMS and not do the G-Drive thing.

I'll follow up with him to let him know we did a workaround discovery and here are the direct links.

-- Emailed to: "Angelika Frebert" Internal Note from Kerry Scott (Sep 11 2023, 03:17PM):

Gave him the links you found, Angelika. Thank you!

-- Emailed to: "Angelika Frebert" Internal Note from Angelika Frebert (Sep 11 2023, 03:18PM):

Great!

Thanks, Kerry!

-- Emailed to: "Kerry Scott" Internal Note from Angelika Frebert (Sep 11 2023, 03:36PM):

FYI, I was wondering if it had to be the file name, and the answer is that search also worked for me putting the following in the search field:

"1956-C 4R 145" gave me

https://digitalcollections.library.ucsc.edu/catalog?utf8=%E2%9C%93&search_field=all_fields&q=1956-C+4R+145

First image in results is

https://digitalcollections.library.ucsc.edu/concern/works/r781wm23q?locale=en

So there are at least 2 different ways to get search results!

-- Emailed to: --- Internal Note from Kerry Scott (Sep 11 2023, 04:01PM):

Thank you, Angelika, glad it does not only need to be the file name. That would be rough. This is helpful, I will remember it if this comes up again.

-- Emailed to: "Angelika Frebert" Internal Note from Rachel Jaffe (Sep 11 2023, 04:52PM):

Thank you, Angelika. Your search skills put mine to shame! In looking back at what I was doing, I was searching for the filenames within the flight, 1956-C, which was not returning those images: https://digitalcollections.library.ucsc.edu/catalog?f%5Bancestor_collection_titles_ssim%5D%5B%5D=1956-C+Monterey+County+Flight+ABG&locale=en&q=1956c_abg_04r-145.tif&search_field=all_fields

Occam's razor holds true again!

-- Emailed to: --- Internal Note from Kerry Scott (Sep 11 2023, 05:16PM):

What's weird, too, is that I didn't search for a known title. Because I have experienced weird search results, I went to the flight and clicked through the pages. After the indices, the list of images for the flight begins with 10R and goes up from there; 1-9R don't show up on the pages, which is why I thought maybe a box was unchecked.

-- Emailed to: --- Internal Note from Angelika Frebert (Sep 11 2023, 06:12PM):

Search is definitely not working properly!

The take-away from this ticket might be that once we identify images that don't show while searching/browsing the flight, there's still a chance we can retrieve the file by searching for the specific image.

Expected behavior

When a specific title is searched for, only getting that title as a result.
When a title is searched for as a keyword, getting the exact/closest match first among the results (this happens).
When searching within a flight, getting relevant results.

rschwab commented 1 year ago

Edited to remove email addys and patron identification.

snehagunduraoUL commented 1 year ago

Hi @rmjaffe Here are the search results with screenshots

  1. Search with title across DAMS:

    [Screenshot: Screen Shot 2023-09-13 at 1 26 54 PM]

    Result:

    [Screenshot: image]

  2. Search with filename across DAMS:

    [Screenshot: Screen Shot 2023-09-13 at 1 29 05 PM]

    Result:

    [Screenshot: Screen Shot 2023-09-13 at 1 29 25 PM]

  3. Search with title inside the collection: Search results list the works containing the title.

    [Screenshot: Screen Shot 2023-09-13 at 1 33 26 PM]

  4. Search with filename inside the collection: Yes, this is not working.

    [Screenshot: Screen Shot 2023-09-13 at 1 34 58 PM]

We can work on refining the search inside the collection, but other than that search in general seems to be working.
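For anyone wanting to reproduce the four cases above without clicking through the interface, here is a minimal sketch that builds the catalog URLs. It assumes only the query parameters visible in the URLs already quoted in this thread (`search_field`, `q`, `locale`, and the `f[ancestor_collection_titles_ssim][]` facet); the function name `catalog_search_url` is made up for illustration and is not part of the app.

```python
from urllib.parse import urlencode

# Catalog endpoint as it appears in the URLs quoted in this thread.
BASE_URL = "https://digitalcollections.library.ucsc.edu/catalog"


def catalog_search_url(query, collection_title=None):
    """Build a catalog search URL (hypothetical helper, for illustration).

    query            -- a title or filename, e.g. "1956c_abg_04r-145.tif"
    collection_title -- optional flight/collection title to facet on
    """
    params = [
        ("search_field", "all_fields"),
        ("q", query),
        ("locale", "en"),
    ]
    if collection_title is not None:
        # Facet parameter copied from the "search within the flight" URL above.
        params.append(("f[ancestor_collection_titles_ssim][]", collection_title))
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    flight = "1956-C Monterey County Flight ABG"
    for q in ("1956-C 4R 145", "1956c_abg_04r-145.tif"):
        print(catalog_search_url(q))          # search across the DAMS
        print(catalog_search_url(q, flight))  # search inside the collection
```

Comparing the result counts for the two URLs printed for each query should show whether the facet is what drops a known sheet from the results.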

snehagunduraoUL commented 1 year ago

Regarding the first item in the expected behavior list (only getting the particular title that is searched for as a result): this refined search has always displayed the parent works and the list of their child works.

rschwab commented 1 year ago

I tried searching for "4R-145 1956-C" and the first result was correct, but clicking on it yielded this work that doesn't appear to be attached to the flight or the collection.

Same with 4R-146.

This is probably the issue with trying to browse for them within the flight (in addition to the known problem of sort order).

But I'm curious what search terms you used @rmjaffe? Do you recall?
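On the sort order problem, here is a minimal sketch of a "natural" sort key that compares the numeric parts of a title as numbers rather than as text. It assumes sheet titles look like "1956-C 4R 145", as in the searches quoted above; it is not how the app currently sorts, and `natural_key` is a hypothetical name.

```python
import re


def natural_key(title):
    """Split a title into text and number chunks so that 4R sorts before 10R.

    re.split with a capturing group alternates text, digits, text, ...
    so keys for different titles always compare like with like.
    """
    parts = re.split(r"(\d+)", title)
    return [int(p) if p.isdigit() else p.lower() for p in parts]


if __name__ == "__main__":
    # Illustrative titles only, following the "flight line sheet" pattern.
    titles = ["1956-C 10R 1", "1956-C 4R 146", "1956-C 4R 145", "1956-C 9R 2"]
    print(sorted(titles))                   # plain text sort puts 10R first
    print(sorted(titles, key=natural_key))  # natural sort puts 4R first
```

A plain text sort places "10R" ahead of "4R" because it compares character by character, which may be part of why browsing a flight page by page is hard to follow.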

rmjaffe commented 1 year ago

Weird. When searching within the flight I used the filenames 1956c_abg_04r-145.tif & 1956c_abg_04r-146.tif, and also 4R-145 & 4R-146 (as you did).

As far as the parent vs. child works appearing in search results, I know that our preference was for only the parent work/image set to display, which makes sense when, in the case of the aerials, someone is entering a keyword search like "aerials watsonville". But it makes far less sense when a patron is searching for a known item/sheet using an exact title. It's too much to ask of patrons (and librarians, clearly) to identify the parent work/image set/flight as the "match", to select it, and to then know to either 1) browse to find the desired sheet (potentially through 1,000 sheets) or 2) attempt to facet and search within the image set (which at present seems to be producing mixed results).

The aerials are unlike the other collections in the DAMS, both in terms of the size of the image sets/flights and how users want to be able to search or retrieve sheets. We'll probably want to think more about this collection; if we end up migrating to an environment in which we can have multiple front ends, we might want to separate this from the Special Collection content and have it be its own thing.

rschwab commented 1 year ago

These two works don't appear to belong to the collection. I think that's why they didn't show up when searching within the collection. A round trip should be able to correct that aspect of this issue.

rmjaffe commented 1 year ago

That's odd. Are there other works that aren't members of the collection? Is there any way to check?

snehagunduraoUL commented 1 year ago

The way I can think of is to scroll down and check for missing works in the parent collection/work.

snehagunduraoUL commented 1 year ago

Checked the spreadsheet and these works exist there and have the right parent too. Maybe something went wrong when re-ingesting? Let me know if you would like me to roundtrip 1956-C. Thanks.
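As an alternative to scrolling, here is a minimal sketch of checking for other works that are missing from the collection by comparing two lists of filenames. It assumes one list comes from the ingest spreadsheet and the other from an export of the collection's current members; how those lists are obtained is left open, and `missing_from_collection` is a hypothetical helper rather than anything in the app.

```python
def missing_from_collection(spreadsheet_filenames, collection_filenames):
    """Return spreadsheet filenames that are absent from the collection export.

    Comparison ignores case and surrounding whitespace, since spreadsheet
    cells and exports often differ in those details.
    """
    in_collection = {name.strip().lower() for name in collection_filenames}
    return sorted(
        name.strip()
        for name in spreadsheet_filenames
        if name.strip().lower() not in in_collection
    )


if __name__ == "__main__":
    # Illustrative data only: the first two filenames are from this thread,
    # the third is made up to stand in for a sheet that is in the collection.
    spreadsheet = [
        "1956c_abg_04r-145.tif",
        "1956c_abg_04r-146.tif",
        "1956c_abg_10r-001.tif",
    ]
    exported = ["1956C_ABG_10R-001.tif"]
    print(missing_from_collection(spreadsheet, exported))
```

Any filenames returned would be candidates for the roundtrip, though an empty result only shows that the two lists agree, not that every work displays correctly.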

rmjaffe commented 10 months ago

@snehagunduraoUL Just confirming, I think we can close this one out, correct?

snehagunduraoUL commented 10 months ago

@rmjaffe yes. This can be closed.