Islandora / documentation

Contains islandora's documentation and main issue queue.
MIT License

Really Large Files (potential use case for External Content) #1240

Open rosiel opened 5 years ago

rosiel commented 5 years ago

How can we deal with files that are too large to add to Islandora "the usual way" (uploading files to Media through a web browser)? We're thinking of terabyte-sized files that we want to preserve using Fedora (or, at least, Islandora).

Currently (after https://github.com/Islandora-CLAW/CLAW/issues/564) the workflow for using Fedora with external content is to add a file to a Media type that has its file field configured to use an external service (AWS, Dropbox, etc) for storage (thanks to flysystem).

Following up on @mjordan's comment, it seems there's not yet a way to point Drupal to an external file; you have to upload something.

seth-shaw-unlv commented 5 years ago

I don't know about TB (they mention several GB), but the plupload_widget Drupal module can let us upload things bigger than the PHP max file size limit.

Of course, if you already have the large file where you want it on the web, then we will probably need to build something. Perhaps a file select widget that uses a configured Flysystem? Or perhaps we should look at the Remote Video media type for inspiration....

dannylamb commented 5 years ago

If you have a hard drive with TB-sized files, you can "mount" it with flysystem, then traverse the filesystem and make File+Media entities with a little loop script. I've done it in the past to prove "migrating in place" when you don't want to actually transfer the files.
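For anyone picturing the traversal half of such a loop script, here is a minimal sketch in Python rather than Drupal code. It only walks a mounted directory and maps each file to a stream-wrapper style URI, without reading or copying any file contents. The scheme name `bigfiles` and the function `iter_file_uris` are hypothetical, made up for illustration; creating the actual File and Media entities would still happen inside Drupal and is not shown.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterator

# Hypothetical scheme name; in practice this would be whatever scheme
# the Flysystem configuration assigns to the mounted drive.
SCHEME = "bigfiles"


def iter_file_uris(mount_root: Path, scheme: str = SCHEME) -> Iterator[str]:
    """Yield a scheme://relative/path URI for every file under mount_root.

    Only paths are inspected; file contents are never opened or copied.
    """
    for dirpath, dirnames, filenames in os.walk(mount_root):
        dirnames.sort()  # sort in place so the walk order is deterministic
        for name in sorted(filenames):
            relative = (Path(dirpath) / name).relative_to(mount_root)
            yield f"{scheme}://{relative.as_posix()}"


if __name__ == "__main__":
    # Small demonstration against a throwaway directory tree.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "collection").mkdir()
        (root / "master.tif").write_bytes(b"")
        (root / "collection" / "scan.tif").write_bytes(b"")
        for uri in iter_file_uris(root):
            print(uri)
```

Each URI produced this way is the kind of value that would go into a file entity's URI field, which is why nothing has to be transferred.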

mjordan commented 4 years ago

Reviving this issue, since in our current IR we have resources that do not have local binaries; they point to external files, and there is a related thread in Slack. I poked around for contrib modules that would help us use external files (I realize that this approach is distinct from uploading files via Drupal to make them "local") and found https://www.drupal.org/project/filefield_sources. Unfortunately, the submodule that enables defining remote files (https://www.drupal.org/project/remote_file_source) has not even been ported to D8.

Over in that Slack thread @dannylamb said "make a new file entity, set the file uri to fedora://path/to/the/file and then save it". Expressed that way, this seems pretty straightforward. Should we be figuring out how to make this possible for a content manager using the existing Islandora Media UI tools, handling both files in Fedora and files not in Fedora? In other words, instead of uploading a file, they would paste in the URL of the remote file?

dannylamb commented 4 years ago

Here's a gist for "migrating in place" like I mentioned earlier: https://gist.github.com/dannylamb/48b4f7284e11e9df05e95cb3625f9c92

It's all predicated on configuring flysystem to point at the external files. A hard drive, an FTP server, or anything accessible over HTTP is fine. Something like YouTube, SoundCloud, etc. would require writing a custom adapter though.