mvahowe closed this issue 4 years ago
For the record, 5.1 is not a version, but a speaker configuration (5 satellite speakers, 1 subwoofer). I think there is a fairly standard set of common speaker configurations in existence. Audio APIs could be a source of possible values supported by hardware/operating systems, for example: https://fmod.com/resources/documentation-api?version=1.10&page=content/generated/FMOD_SPEAKERMODE.html https://github.com/openalext/openalext/wiki/AL_EXT_MCFORMATS
@rdb Thanks, that's helpful. I'm still not sure how fluid those lists are, whether we have any audio that uses these configurations, or even how we'd know whether it did.
(In implementation terms it would be extremely helpful if whatever information we need for audio metadata could be discovered using well-known open-source tools such as ffmpeg or sox.)
Looks like this is easy with ffmpeg: https://superuser.com/questions/1106343/determine-video-bitrate-using-ffmpeg/1111039
@FoolRunning Yes, DBL's Nathanael already uses ffmpeg. My question is how much track information we can get from that. From the screenshots at your link, the answer appears to be "numberOfChannels".
Ah. Maybe this is better, then: https://stackoverflow.com/questions/47905083/how-to-check-number-of-channels-in-my-audio-wav-file-using-ffmpeg-command/47905308
I think the channel_layout is what you need. Documentation for the values is here.
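As a rough illustration of reading that field programmatically, here is a minimal Python sketch. It assumes `ffprobe` (shipped with ffmpeg) is on the PATH. The helper names `probe_audio_streams` and `summarize_streams` are hypothetical and are not part of any existing DBL or Scripture Burrito tooling. Note that `channel_layout` may be missing for some files, so the sketch treats it as optional.

```python
import json
import subprocess


def probe_audio_streams(path):
    """Run ffprobe on `path` and return its JSON report for the audio streams.

    Assumption: ffprobe is installed and available on the PATH.
    """
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "a",  # audio streams only
        "-show_entries", "stream=index,channels,channel_layout",
        "-of", "json",
        path,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def summarize_streams(ffprobe_json):
    """Reduce ffprobe's JSON report to channel count and layout per stream."""
    data = json.loads(ffprobe_json)
    summary = []
    for stream in data.get("streams", []):
        summary.append({
            "channels": stream.get("channels"),
            # ffprobe may omit the layout; report None in that case.
            "channel_layout": stream.get("channel_layout") or None,
        })
    return summary
```

Usage would be along the lines of `summarize_streams(probe_audio_streams("chapter.mp3"))`, where the file name is only a placeholder.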
Thanks @FoolRunning.
My proposal is
Can we talk about this?
What I understood is that the distinction between one channel (mono) and two channels (stereo) only makes sense for dramatized audio (with music, voices and passing camels at Genesis 37), for audio with multiple speakers (when "Abraham" and "Sarah" use two different microphones in different positions), or for a single voice with music (you can hear that the violin is more to the right while the speaker stays in front). This all gets lost in most daily use of audio on a smartphone.
Single voice (mono) is a big part of what we have now in DBL, and when something in DBL is labelled "stereo" we cannot trust that it is really multi-channel. I think it would make sense to investigate some dramatized audio to check whether it was recorded in stereo and how the recordings were made.
I think PR #177 implements most of this. I suggest that we leave the "configuration" fields as-is for now.
The current Scripture Burrito fields were inherited from DBL Metadata which may have been inherited from somewhere else. We'd like to convert the values to camelCase for the sake of consistency. More importantly, we'd like to end up with choices that make sense and which have clearly-documented meaning.
Audio Dramatization
The current options are
It seems to me that, logically, a recording can be, for example, both dramatized and single-voice. I wonder if we have several concerns here, e.g. something like
Rather than having a single list, we could allow combination of tags such as
Migration of existing data would be a challenge, but I think it's tractable. (DBL might need a crowdsourced audio-listening party.)
Audio Track Information
The current options are
I don't recall where this list came from. (I think I may have been in the hotel lobby where it was created.) I can't find this list anywhere else, and it seems arbitrary to me.
One consideration is whether we can get this information from probing wav or mp3 files.
Mono/Stereo seems safe enough. Beyond that I don't know. I do suspect that if we include "5.1 Surround" we should also include lots of other versions of this and other configurations.
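On the probing question, here is a minimal sketch of reading the channel count from a WAV file using only Python's standard `wave` module. It assumes uncompressed PCM WAV input, since the module handles neither mp3 nor compressed WAV. The function names and the decision to label only one and two channels are illustrative assumptions, not a proposed vocabulary.

```python
import wave


def wav_channel_count(path):
    """Return the number of channels declared in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnchannels()


def describe_channels(count):
    """Map a channel count to a label, only where the mapping is unambiguous.

    Assumption: anything beyond two channels is left unlabelled (None),
    because the count alone does not identify a speaker configuration.
    """
    if count == 1:
        return "mono"
    if count == 2:
        return "stereo"
    return None
```

A channel count of 6 could correspond to 5.1 surround, but the count alone cannot confirm it, which is why the sketch declines to guess.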
All input welcome!