psu-libraries / scholarsphere-3

A web application for ingest, curation, search, and display of digital assets. Powered by Hydra technologies (Rails, Hydra-head, Blacklight, Solr, Fedora Commons, etc.)
Apache License 2.0
78 stars 24 forks

Move Binary Data outside of Fedora #1128

Closed carolyncole closed 5 years ago

carolyncole commented 6 years ago

Currently we store our binary data in the Fedora repository. This leads to migration hassles, because the data has to be copied every time we migrate. We would like to move the binary files to a location on the file system, and then have Fedora either point to them via metadata or serve them up through a redirect.

carolyncole commented 6 years ago

From what I can tell, if we had a server to host the content from a directory, we could use add_external_file_to_file_set to store our content externally and have Fedora redirect to it: https://github.com/samvera/hydra-works/blob/d8969788d64638714eac5743704a9bf21319d8e2/spec/hydra/works/services/add_external_file_to_file_set_spec.rb

Right now, I think we are serving the thumbnails locally through Passenger. We could think about doing something similar for the binary files.

@jrpatterson @informaticianme Any thoughts on the level of effort to serve a directory of files from the repo server (maybe a network share)? We are looking to remove the binary content from Fedora.

awead commented 6 years ago

There was a PR to enable external files in ActiveFedora, but it was never merged: https://github.com/samvera/active_fedora/pull/1234. Perhaps that could be of some use here.

pketienne commented 6 years ago

The relevant documentation for external content for Fedora can be found here: https://wiki.duraspace.org/display/FEDORA471/External+Content

Here's a GitHub issue from the Islandora community that has a good discussion of external content with Fedora, specifically the limitations of the current external content feature: https://github.com/Islandora-CLAW/CLAW/issues/564
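For anyone following along, here is a rough sketch of what the external content mechanism in the Fedora 4.7 documentation linked above looks like at the HTTP level: the binary resource is created with a `message/external-body` content type that carries the URL of the real file, and Fedora redirects to that URL when the binary is requested. The helper names and both example URLs are made up for illustration, and the request is only built here, not sent. This is not how hydra-works or ActiveFedora implement it.

```ruby
require "net/http"
require "uri"

# Build the Content-Type value Fedora 4.7 uses to register external content.
# file_url is wherever the binary is actually served from (hypothetical here).
def external_body_content_type(file_url)
  %(message/external-body; access-type=URL; URL="#{file_url}")
end

# Build (but do not send) a PUT request that would create a binary resource
# in Fedora whose content lives at file_url instead of inside the repository.
# fedora_resource_url is an assumed path under the Fedora REST endpoint.
def build_external_content_request(fedora_resource_url, file_url)
  uri = URI.parse(fedora_resource_url)
  request = Net::HTTP::Put.new(uri)
  request["Content-Type"] = external_body_content_type(file_url)
  request
end

request = build_external_content_request(
  "http://localhost:8080/rest/files/example",      # hypothetical Fedora resource
  "http://files.example.org/binaries/example.pdf"  # hypothetical file server URL
)
puts request["Content-Type"]
```

Sending the request would take a `Net::HTTP.start` call against the Fedora host, plus whatever authentication the repository requires.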

carolyncole commented 6 years ago

2 patterns are to let

For Samvera only, the second pattern is possibly easier. Other things that access your repository (e.g. bag export) would not work with option number two.

@little9 @bess will take a look and determine which pattern would be best for both PSU and the community.

carolyncole commented 6 years ago

We may include upgrading to Fedora 4.7.5

carolyncole commented 6 years ago

https://docs.google.com/document/d/1jA43RCnZY8F-gBmXoDVXO4weGiJQvtUUZEBxFpQpJpQ/edit

Leaning towards option 2:

carolyncole commented 6 years ago

https://docs.google.com/document/d/1jA43RCnZY8F-gBmXoDVXO4weGiJQvtUUZEBxFpQpJpQ/edit

pketienne commented 6 years ago

Added jira tickets for likely infrastructure work to enable sharing of filesystem files via httpd: