mvahowe commented 4 years ago

The current Scripture Burrito fields were inherited from DBL Metadata which may have been inherited from somewhere else. We'd like to convert the values to camelCase for the sake of consistency. More importantly, we'd like to end up with choices that make sense and which have clearly-documented meaning.

Audio Dramatization

The current options are

Dramatized
Non-Dramatized
Single-Voice

It seems to me that, logically, a recording can be both dramatized and single-voice, for example. I wonder if we have several separate concerns here, e.g. something like

singleVoice or multipleVoices
reading or dramatization
optionally, includesMusic and/or includesEffects

Rather than having a single list, we could allow combinations of tags such as

singleVoice reading includesMusic

multipleVoices reading

multipleVoices dramatization includesEffects includesMusic
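
To make the idea concrete, here is a minimal sketch of how such tag combinations might be validated. It assumes exactly one voice tag and exactly one style tag are required, with the two "includes" tags optional; the group names and the function `validate_dramatization` are hypothetical and not part of any agreed schema. The spelling `multipleVoices` follows the list of concerns above.

```python
# Hypothetical tag groups taken from the suggestion above; not an agreed schema.
VOICES = {"singleVoice", "multipleVoices"}
STYLES = {"reading", "dramatization"}
OPTIONAL = {"includesMusic", "includesEffects"}


def validate_dramatization(tags):
    """Return a list of problems; an empty list means the combination is valid."""
    tags = list(tags)
    problems = []

    # Reject anything outside the three known groups.
    unknown = [t for t in tags if t not in VOICES | STYLES | OPTIONAL]
    if unknown:
        problems.append("unknown tags: " + ", ".join(unknown))

    # A tag should not be listed twice.
    if len(set(tags)) != len(tags):
        problems.append("duplicate tags")

    # Assumption: exactly one tag from each mandatory group.
    for name, group in (("voice", VOICES), ("style", STYLES)):
        count = len(group.intersection(tags))
        if count != 1:
            problems.append(f"expected exactly one {name} tag, found {count}")

    return problems
```

With these assumptions, `validate_dramatization(["singleVoice", "reading", "includesMusic"])` returns an empty list, while a combination containing both `singleVoice` and `multipleVoices` is reported as a problem.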

Migration of existing data would be a challenge, but I think it's tractable. (DBL might need a crowdsourced audio-listening party.)

Audio Track Information

The current options are

1/0 (Mono)
Dual mono
2/0 (Stereo)
5.1 Surround

I don't recall where this list came from. (I think I may have been in the hotel lobby where it was created.) I can't find this list anywhere else, and it seems arbitrary to me.

One consideration is whether we can get this information from probing wav or mp3 files.
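
For WAV files at least, the channel count can be read from the file header. The sketch below uses Python's standard-library `wave` module; the helper names `wav_channel_count` and `make_silent_wav` are made up for illustration. Note that this only yields a number of channels, so it cannot by itself distinguish "Dual mono" from "2/0 (Stereo)", and it does not cover mp3.

```python
import io
import wave


def wav_channel_count(source):
    """Return the number of channels declared in a WAV header.

    `source` may be a file path (str) or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        return wav.getnchannels()


def make_silent_wav(channels, frames=8, rate=8000):
    """Build a tiny in-memory 16-bit PCM WAV, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        # One zero-valued sample per channel per frame.
        wav.writeframes(b"\x00\x00" * channels * frames)
    buffer.seek(0)
    return buffer
```

Calling `wav_channel_count` on a real file path works the same way as on the in-memory buffers produced by `make_silent_wav`.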

Mono/Stereo seems safe enough. Beyond that I don't know. I do suspect that if we include "5.1 Surround" we should also include lots of other versions of this and other configurations.

All input welcome!

rdb commented 4 years ago

For the record, 5.1 is not a version, but a speaker configuration (5 satellite speakers, 1 subwoofer). I think there is a fairly standard set of common speaker configurations in existence. Audio APIs could be a source of possible values supported by hardware/operating systems, for example: https://fmod.com/resources/documentation-api?version=1.10&page=content/generated/FMOD_SPEAKERMODE.html https://github.com/openalext/openalext/wiki/AL_EXT_MCFORMATS

mvahowe commented 4 years ago

@rdb Thanks, that's helpful. I'm still not sure how fluid those lists are, whether we have any audio that uses these configurations, or even how we'd know whether it did.

mvahowe commented 4 years ago

(In implementation terms it would be extremely helpful if whatever information we need for audio metadata could be discovered using well-known open-source tools such as ffmpeg or sox.)

FoolRunning commented 4 years ago

In implementation terms it would be extremely helpful if whatever information we need for audio metadata could be discovered using well-known open-source tools such as ffmpeg or sox.

Looks like this is easy with ffmpeg: https://superuser.com/questions/1106343/determine-video-bitrate-using-ffmpeg/1111039

mvahowe commented 4 years ago

@FoolRunning Yes, DBL's Nathanael already uses ffmpeg. My question is how much track information we can get from that. From the screenshots at your link, the answer appears to be "numberOfChannels".

FoolRunning commented 4 years ago

Ah. Maybe this is better, then: https://stackoverflow.com/questions/47905083/how-to-check-number-of-channels-in-my-audio-wav-file-using-ffmpeg-command/47905308

I think the channel_layout is what you need. Documentation for the values is here.
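
A rough sketch of reading `channels` and `channel_layout` through ffprobe (which ships with ffmpeg) from Python. The function names are hypothetical, and `probe_audio_streams` assumes ffprobe is installed and on the PATH. As far as I can tell, `channel_layout` may be missing or reported as unknown for some files, so the parser returns `None` when the field is absent.

```python
import json
import subprocess


def parse_audio_streams(ffprobe_json):
    """Extract (channels, channel_layout) pairs from ffprobe JSON output.

    channel_layout is returned as None when ffprobe did not report it.
    """
    data = json.loads(ffprobe_json)
    return [
        (stream.get("channels"), stream.get("channel_layout"))
        for stream in data.get("streams", [])
    ]


def probe_audio_streams(path):
    """Run ffprobe on `path`; requires ffprobe to be installed and on PATH."""
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-select_streams", "a",  # audio streams only
            "-show_entries", "stream=channels,channel_layout",
            "-of", "json",
            path,
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_audio_streams(result.stdout)
```

Keeping the parsing separate from the subprocess call means the parser can be checked against saved ffprobe output without needing ffprobe or audio files at hand.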

mvahowe commented 4 years ago

Thanks @FoolRunning.

My proposal is

a tagged approach to dramatization, as described above
the rather long list of channel layouts from the last link above
call this field channelLayout

Can we talk about this?

LBCBoot commented 4 years ago

As I understand it, the distinction between one channel (mono) and two channels (stereo) only makes sense for dramatized audio (with music, voices and passing camels at Genesis 37), for audio with multiple speakers (when "Abraham" and "Sarah" use two different microphones in different positions), or for a single voice with music (you can hear that the violin is more to the right while the speaker stays in front of it). All of this gets lost in most everyday use of audio on a smartphone.

Single voice (mono) is a big part of what we have now in DBL, and we cannot trust that something marked as "stereo" in DBL is really multi-channel. I think it would make sense to investigate some dramatized audio to check whether it was recorded in stereo and how the recordings were made.

mvahowe commented 4 years ago

I think PR #177 implements most of this. I suggest that we leave the "configuration" fields as-is for now.

bible-technology / scripture-burrito

Enums for Audio-Related Metadata #144