Universal Viewer stops working after 2.8.0 --> 2.9.0 upgrade

samvera / hyrax

Hyrax is a Ruby on Rails Engine built by the Samvera community. Hyrax provides a foundation for creating many different digital repository applications.

http://hyrax.samvera.org/

Apache License 2.0

184 stars 124 forks source link

Universal Viewer stops working after 2.8.0 --> 2.9.0 upgrade #4572

Closed gwiedeman closed 3 years ago

gwiedeman commented 3 years ago

Descriptive summary

Universal Viewer appears to work just fine in Hyrax 2.8.0.
Changing the Hyrax version in the Gemfile to 2.9.0 and running bundle update hyrax causes Universal Viewer to
In looking at Troubleshooting RIIIF, it seems like iiif_info_url_builder is returning the wrong file ID and Fedora is returning an empty text file instead of the file object.

2.6.5 :015 > Riiif::Engine.routes.url_helpers.info_url(fs.files.first.id, host: "http://localhost:3000")
 => "http://localhost:3000/images/2f75rg51j%2Ffiles%2Fd1420cd8-a605-4ecd-8197-9e0f0862df5f/info.json"

This does return a manifest, but iiif_image_url_builder then returns an incorrect URL for the file:

2.6.5 :016 > Riiif::Engine.routes.url_helpers.image_url(fs.files.first.id, host: "http://localhost:3000", size: "100,")
 => "http://localhost:3000/images/2f75rg51j%2Ffiles%2Fd1420cd8-a605-4ecd-8197-9e0f0862df5f/full/100,/0/default.jpg"

This URL returns a 500.
From navigating the working University Viewer in 2.8.0, it looks like the correct file ID is 831e33b8-88c8-4922-af03-3cfa96c1f2af, not d1420cd8-a605-4ecd-8197-9e0f0862df5f. Changing that ID in the URL returned by iiif_info_url_builder also returns a manifest, and changing it in iiif_image_url_builder correctly returns the file.
So it seems that the 2.9.0 upgrade is correctly using @id from the manifest when the wrong @id in 2.8.0 did not matter.
It looks like iiif_info_url_builder is relying on fs.files.first.id to return the ID. This returns a Fedora URI with the wrong ID, which serves an empty .txt file. fs.files.last.id Returns the correct ID and Fedora serves the file. So it seems like all of our FileSets have a bad first file and I'm not sure if this is something that is supposed to happen or if this is something that happened in a previous Hyrax version, or if this is totally a local issue and we should just delete all these.

Rationale

2.8.0 to 2.9.0 should be a non-breaking update from what I can tell in the release notes.

Expected behavior

Universal Viewer displays the image.

Actual behavior

Steps to reproduce the behavior

Changing the Hyrax version in the Gemfile to 2.9.0
bundle update hyrax
Restart

Related work

Troubleshooting RIIIF was helpful.

no-reply commented 3 years ago

thanks @gwiedeman!

it seems like there are a couple of issues at play here. as i dig into this properly, here's some preliminary information:

Hyrax 2.9.0 includes a complete rewrite of the IIIF Manifest generation process. this rewrite is a substantial performance improvement on manifest generation, especially for Works with many FileSets. all the code that powered the old behavior is still in the release, and it can be reinstated by adding the following to controllers including Hyrax::WorksControllerBehavior:

def iiif_manifest_presenter
  presenter
end

one known change in the generation process is that we no longer use the Hyrax request context to determine the protocol to be used by the IIIF server. in practice what it means is that, for apps using Riiif and HTTPS, the protocol for Riiif URLs (as generated by Riiif::Engine.routes.url_helpers.image_url via Hyrax.config.iiif_image_url_builder) is dependent on Riiif::Engine configuration, not the protocol you're using to get the manifest. in theory, this ought to be an improvement, but it seems to be causing issues for some app deployments (notably https://nurax-dev.curationexperts.com). see, for example:

http://nurax-dev.curationexperts.com/images/p5547r38s%2Ffiles%2F358aab77-5df3-4224-82e3-47e01b8add99%2Ffcr:versions%2Fversion1/full/600,/0/default.jpg vs https://nurax-dev.curationexperts.com/images/p5547r38s%2Ffiles%2F358aab77-5df3-4224-82e3-47e01b8add99%2Ffcr:versions%2Fversion1/full/600,/0/default.jpg

if your Riiif is sitting behind a server enforcing SSL, putting this in your initializer may fix your problem Riiif::Engine.routes.default_url_options = { protocol: 'https' }

in retrospect, this change may have been a mistake (or maybe is too aggressive a behavioral change for a minor release). i'm considering ways to roll it back without losing the performance improvements referenced above. feedback would be valued.

no-reply commented 3 years ago

@gwiedeman your issue looks (at my best first guess) like it is due to neither of the above issues.

the file id for a given file_set shouldn't be calling #first as in file_set.files.first.id, instead, i think we should be looking for mismatches between the index and Fedora. for a FileSet that has this problem, can you check FileSet.find(id).original_file vs the indexed original_file_id in your solr instance?

i think in addition to the errant/extra text file (no idea where this might have come from, can you point me to your code base?), these objects must have a weird index state. the new manifest generation relies much more heavily on the index to understand which files to render.

that's the best i've got.

no-reply commented 3 years ago

after #4663, nurax-dev.curationexperts.com has working IIIF via UV again. i'm not closing this out at this point, since @gwiedeman's issue is still undiagnosed, but i think it's no longer a blocker for 3.0.0.

gwiedeman commented 3 years ago

Thanks for looking at this @no-reply !

Our codebase is here if that helps: https://github.com/UAlbanyArchives/hyrax-UAlbany

Its very possible this is an indexing issue. We also just updated from Solr 7 to 8.6 recently before updating to 2.9.0. fs.original_file returns the correct ID, and this is the same as the original_file_id shown in the console after fs.to_solr. However, when I query the FileSet in the Solr console I do not see an original_file_id key in the document as I would expect. This is the response:

"response":{"numFound":1,"start":0,"maxScore":16.77106,"numFoundExact":true,"docs":[
      {
        "system_create_dtsi":"2019-02-02T00:28:42Z",
        "system_modified_dtsi":"2019-02-02T00:28:54Z",
        "has_model_ssim":["FileSet"],
        "id":"2f75rg51j",
        "accessControl_ssim":["Not sure if this needs to be omitted"],
        "depositor_ssim":["importer@albany.edu"],
        "depositor_tesim":["importer@albany.edu"],
        "title_tesim":["nam_ua_390_pm_026037.jpg"],
        "date_uploaded_dtsi":"2019-02-02T00:28:42Z",
        "date_modified_dtsi":"2019-02-02T00:28:42Z",
        "creator_tesim":["importer@albany.edu"],
        "thumbnail_path_ss":"/downloads/2f75rg51j?file=thumbnail",
        "hasRelatedMediaFragment_ssim":["2f75rg51j"],
        "hasRelatedImage_ssim":["2f75rg51j"],
        "label_tesim":["nam_ua_390_pm_026037.jpg"],
        "label_ssi":"nam_ua_390_pm_026037.jpg",
        "file_format_tesim":["jpeg (JPEG File Interchange Format)"],
        "file_size_lts":19517703,
        "height_is":4840,
        "width_is":6586,
        "visibility_ssi":"open",
        "mime_type_ssi":"image/jpeg",
        "digest_ssim":["urn:sha1:bfe876228776081f1bf7c9c3d0b3c481e71863c5"],
        "original_checksum_tesim":["04400b96f8c5d5fc7c00e4688a7e226d"],
        "read_access_group_ssim":["public"],
        "edit_access_group_ssim":["content-admin"],
        "human_readable_type_tesim":["File"],
        "_version_":1624314830176387072,
        "timestamp":"2019-02-02T00:28:55.297Z",
        "score":16.77106}]
  },

It is weird that fs.files returns two Hydra::PCDM::File objects when there is only one file in the fileset? The other being the empty text file. Is possible that this is bad local data.

I don't suspect its an https issue, as I get the same behavior on our dev server without using ssl, but I'll look into this more.

Also, as for how aggressive to make changes for a minor release, I think its okay. It think its expected that you should test even minor releases and that's how we found the issue. We're really excited about the many FileSets performance upgrades and the barriers should be as low as possible for these types of improvements.

no-reply commented 3 years ago

thanks for the update @gwiedeman.

i agree that it's weird you're ending up with multiple files, and i'd like to dig into that more deeply with you.

focusing briefly on the index issue, i wonder if you could try fs.update_index fixes the issue for you? if to_solr has the original file, then i'd really expect it to be in the solr document, and it being missing feels like it's certainly at least part of the problem here.

on the other hand, we should be falling through to a Fedora lookup on #original_file in the case that the indexed original_file_id isn't present. do you have entries in your log like "original_file_id for [some id] not found, falling back to Fedora."?

gwiedeman commented 3 years ago

Ah, I forgot that .to_solr doesn't actually update the index. fs.update_index does add a original_file_id_ssi key in the index:

"original_file_id_ssi":"2f75rg51j/files/831e33b8-88c8-4922-af03-3cfa96c1f2af/fcr:versions/version1",

The ID in there is correct, and appending it to the Fedora URL does return a file from Fedora, but Universal Viewer still shows the same error in 2.9.0.

As for the rails logs, I do get that warning when loading an object that I didn't .update_index for, but the warning stops appearing after fs.update_index which seems to be as expected:

W, [2020-12-16T15:41:04.347069 #8597]  WARN -- : [37ca1658-3b7e-4bdd-b61c-d959d5cad771] original_file_id for nc580t73h not found, falling back to Fedora.

Apart from that, I do get this when I load the FileSet page, but that's it:

E, [2020-12-16T15:24:24.995426 #4954] ERROR -- : [3292abc1-487e-49bc-9b17-157ed1ca372e] unable to fetch event: FileSet:2f75rg51j:event

Even after turning up the log level

gwiedeman commented 3 years ago

I finally sorted this out when I noticed I was getting a CORS error on our dev server because the object on my.dev.server:3000/concern/images/6t053p39d was requesting the manifest at my.dev.server/images/2f75rg51j%2Ffiles%2F831e33b8-88c8-4922-af03-3cfa96c1f2af/info.json (without the port number).

This was odd as I was getting the same result on the production server, but that issue was due to SSL enforcement, so @no-reply 's suggestion of adding Riiif::Engine.routes.default_url_options = { protocol: 'https' } to config/initalizers/riiif.rb resolved it.

I'll leave this open, as I'm not sure if its worth trying to get the manifest to use rails_root or whatever to use the port number. I would probably recommend adding the riiif initalizer option to the 2.9.0 release notes, but I still think its totally fair for a minor release.

Thanks @no-reply for spending some time on this!

no-reply commented 3 years ago

thanks for the follow-up @gwiedeman.

based on this update, i think the patch in #4663 ought to fix this without need for the default_url_options override. i'll propose a 2.9.2 release later today which should prevent other folks from running into issues.

once that's out, we should be able to close this out.

no-reply commented 3 years ago

closing this as fixed by the release of 2.9.2