sciencehistory / chf-sufia

sufia-based hydra app

Audio file support master ticket #1231

Closed jrochkind closed 5 years ago

jrochkind commented 5 years ago

We are going to add support for audio file ingest.

The current plan is for files to be ingested into Digital Collections app as either FLAC or MP3 (usually FLAC, MP3 for some legacy files with MP3 originals).

We plan to provide an HTML5 audio element to play files in the browser. (At least for the first implementation; simplest thing that might work.)

Different browsers support different file formats for playing in browser. According to my research, WebM seems to be the currently preferred format supported by modern browsers. That plus MP3 seems to hit almost any browser still in use.

We also need to decide whether we want to offer additional derivative types for download. For example, do we offer a download option for an MP3 version even if the original was FLAC? If we're going to make and store an MP3 as a derivative for download anyway, we obviously might as well supply it as an alternate format to the HTML5 audio tag.
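As a rough sketch of what the playback markup could look like, here's a plain-Ruby illustration of an HTML5 audio element offering WebM plus an MP3 fallback. The helper name and URL paths are hypothetical; in the Rails app this would presumably live in a view or helper:

```ruby
# Hypothetical helper sketch: builds an HTML5 <audio> element with a
# WebM source (preferred by modern browsers) plus MP3 as a fallback,
# and a download link for browsers with no audio element support.
# Method name and paths are illustrative only.
def audio_player_html(webm_url, mp3_url, original_url)
  <<~HTML
    <audio controls preload="none">
      <source src="#{webm_url}" type="audio/webm">
      <source src="#{mp3_url}" type="audio/mpeg">
      Your browser does not support the audio element.
      <a href="#{original_url}">Download the audio file</a>
    </audio>
  HTML
end

puts audio_player_html("/derivatives/123.webm", "/derivatives/123.mp3",
                       "/originals/123.flac")
```

The `preload="none"` attribute keeps the page from fetching audio until the user hits play, which seems reasonable for a collections list page.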

Order of implementation

  1. Get it so you can add a FLAC or MP3 audio file to the Sufia app at all.

    • Right now trying to upload one will possibly result in errors, as ingest-related code that assumes everything is an image might try to treat the audio file as an image, and raise.
    • Make sure that doesn't happen. Start with no derivatives being created for audio files.
    • This step is done when you can complete the ingest steps with an audio file, then view that audio file in the front-end UI, all without getting any 500s. "Download original" for the audio file should return the original. At this step, the space for a thumbnail may be blank or show a missing-image placeholder; that's fine and expected.
  2. Add derivatives creation for audio files.

    • Start with WebM. Maybe MP3 too.
    • Not sure of the best way to convert a FLAC or MP3 to these other formats. Perhaps the ffmpeg command line? We may need to make sure ansible installs ffmpeg on deploy machines, if it doesn't already.
    • We need to consider appropriate bitrate and compression level for derivatives. Assuming human conversation rather than music, we want to minimize file size without a noticeable drop in perceived quality. 64 kbps may be enough. For MP3s, the "V2" VBR quality level is probably good; we'd want an equivalent setting for the WebM derivative.
    • This step is done when the derivatives have been created and exist; they won't yet show up in any UI.
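For step 2, assuming we do go the ffmpeg route, the invocations might look like the sketch below. This just builds the commands rather than running them; codec names (`libopus`, `libmp3lame`) are standard ffmpeg encoders, but the bitrate/quality values are the provisional guesses from above, not settled decisions:

```ruby
# Sketch: build (but don't run) ffmpeg commands for audio derivatives.
# Assumes an ffmpeg build with libopus and libmp3lame available.
def derivative_commands(input_path, output_base)
  {
    # WebM container with Opus audio at ~64 kbps (speech-oriented guess)
    webm: ["ffmpeg", "-y", "-i", input_path,
           "-c:a", "libopus", "-b:a", "64k", "#{output_base}.webm"],
    # MP3 at LAME VBR quality 2 (the "V2" preset discussed above)
    mp3:  ["ffmpeg", "-y", "-i", input_path,
           "-c:a", "libmp3lame", "-q:a", "2", "#{output_base}.mp3"]
  }
end

# In the app these would presumably be run from a background job,
# e.g. system(*cmd), with error handling around the exit status.
cmds = derivative_commands("original.flac", "derivatives/123")
```

Building the command as an array (rather than a single shell string) avoids shell-injection issues with filenames when it's eventually passed to `system` or similar.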

  3. Now we need an actual UI that allows playback and download.

  4. We'll presumably have another round of front-end UX tweaks after we have a basic demo as per 3 above; it's premature to spec that now.

1-3 should be done as separate PRs. (Splitting into even more individual PRs is okay too.)

New App

We need to do these things in parallel in New App. At least we need to do 1 and 2 above; we don't really have a 3 in the new app yet. @eddierubeiz I'd like to propose trying out a workflow this time where you do 1 in the old app, then 1 in the new app; then 2 in the old app, then 2 in the new app, instead of "finishing" the old app first and then moving to the new app. This ensures we don't have the option of neglecting the new app (the feature is not done until it's in both), and should help minimize context-switching cost.

sanfordd commented 5 years ago

Quick note: WebM is a container that supports at least two different audio codecs (Vorbis and Opus, I think), and we'll want to decide which one to use.

jrochkind commented 5 years ago

Some audio/visual oral history presentation examples/models, all using the "Oral History Metadata Synchronizer" (OHMS), which we are not currently using. (Thanks @yonyitz)

More OHMS examples