clamsproject / mmif-visualizer

A web site to visualize MultiMedia Interchange Format json
Apache License 2.0

make src media file symlinking "just works" #31

Closed keighrim closed 5 months ago

keighrim commented 5 months ago

New Feature Summary

Reading the documentation and re-thinking the (rather hasty and hacky) initial implementation that links media files into the flask "static" directory, so that video/audio/text can be served on the generated HTML page, I once again realized that it could be a very frustrating experience for users to

  1. understand symlinking
  2. understand differences between container file system and local (mounted or volumed) file system
  3. understand file:// (or other location schema) path in MMIF and how it's related to local FS and/or container FS
  4. carefully create correct symlinks (when running directly) or carefully mount the local FS (when running as a container) with the distorted path prefix (following document locations inside input MMIF files)

As briefly mentioned in #25, we can at least hide the symlink part of the problem by creating the links dynamically. Here's an implementation suggestion:

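import os
from pathlib import Path

# input MMIF file is read and then
for document in mmif.documents:
    dpath = document.location_path()
    spath = Path(static_folder) / visualization_id / f"{document.id}.some_ext"
    spath.parent.mkdir(parents=True, exist_ok=True)  # make sure the per-visualization directory exists
    os.symlink(dpath, spath)
    # pathlib paths cannot be subtracted; relative_to() strips the static folder prefix
    rel_spath = '/' + spath.relative_to(static_folder).as_posix()
    # then continue generating HTML snippets, using `rel_spath` for `src` values in HTML tags
    ...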

This way,

  1. we don't have to expose the source file path in the MMIF to the webpage (might not be a super significant security issue anyway)
  2. dpath (real FS path) is always normalized into static/some_viz_id/doc_id.mp4 (for videos). This can be a big security improvement if the dpath is a very long, complicated pathname
  3. users don't have to worry about static/data symlinks before running the app
  4. when running as a container, symlinks are dynamically generated, so the containerfile no longer needs to "hard-code" the static/data symlink inside the image
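
To make the suggestion above concrete, here is a minimal, self-contained sketch of the dynamic linking step. It is only an illustration under stated assumptions, not the actual mmif-visualizer code: `link_documents` and its `doc_paths` argument are hypothetical names, and the dictionary stands in for iterating over `mmif.documents` and calling `location_path()` on each document. The file extension is taken from the source file rather than hard-coded.

```python
import os
from pathlib import Path


def link_documents(doc_paths, static_folder, viz_id):
    """Symlink source files under static_folder/viz_id.

    doc_paths (hypothetical) maps a document id to its resolved local path.
    Returns a dict of document id -> path relative to the static folder,
    with a leading '/', like `rel_spath` in the suggestion above.
    """
    static_folder = Path(static_folder)
    viz_dir = static_folder / viz_id
    viz_dir.mkdir(parents=True, exist_ok=True)

    rel_paths = {}
    for doc_id, dpath in doc_paths.items():
        dpath = Path(dpath).resolve()
        # Normalize the link name to <doc_id><original extension>.
        spath = viz_dir / f"{doc_id}{dpath.suffix}"
        # Replace a stale link left over from an earlier run, if any.
        if spath.is_symlink() or spath.exists():
            spath.unlink()
        os.symlink(dpath, spath)
        rel_paths[doc_id] = "/" + spath.relative_to(static_folder).as_posix()
    return rel_paths
```

Calling `link_documents({"d1": "/data/video.mp4"}, "static", "viz1")` would create `static/viz1/d1.mp4` as a symlink and return `{"d1": "/viz1/d1.mp4"}`. Removing an existing link first keeps the function safe to call again for the same visualization id; whether stale links should be cleaned up some other way is a separate design question.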

keighrim commented 5 months ago

Another gain with this is that when the document path does not use the file:// scheme, as long as mmif-python (or mmif-docloc-) can resolve the local FS path of the document, it will "just work" without additional handling.
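
A rough sketch of why this works: the linking step only ever needs a local path, so scheme handling can live entirely in a resolver. The code below is not how mmif-python implements this. `RESOLVERS`, `resolve_local_path`, and the `archive` scheme are all hypothetical and only illustrate the idea of dispatching on the URI scheme.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical registry: URI scheme -> function returning a local path.
RESOLVERS = {
    "file": lambda uri: Path(urlparse(uri).path),
}


def resolve_local_path(location, resolvers=RESOLVERS):
    """Return a local filesystem path for a document location string."""
    # A bare path without a scheme is treated as a file location.
    scheme = urlparse(location).scheme or "file"
    try:
        resolver = resolvers[scheme]
    except KeyError:
        raise ValueError(f"no resolver for scheme {scheme!r}") from None
    return resolver(location)


# Usage with a made-up scheme that maps identifiers into a mounted directory.
custom = dict(
    RESOLVERS,
    archive=lambda uri: Path("/mnt/archive") / urlparse(uri).netloc,
)
local = resolve_local_path("archive://item-42.mp4", custom)
```

Whatever path comes back from the resolver can then be passed to `os.symlink` exactly as in the file:// case.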