olive-editor / olive

Free open-source non-linear video editor
https://olivevideoeditor.org/
GNU General Public License v3.0

[EDIT] Audio normalization #262

Open prokoudine opened 5 years ago

prokoudine commented 5 years ago

Audio recorded with Olive might need to be normalized. This could be an audio effect that could sit in the effects stack, be available via a shortcut (if possible), and update the clip's waveform in the timeline (if possible).

itsmattkc commented 5 years ago

I assume this refers to some kind of dynamic range compression? I've thought about adding a couple of basic audio effects (aside from volume/pan), but I think the real advantage in this vein would be implementing #14

prokoudine commented 5 years ago

Sort of :)

See e.g. https://en.wikipedia.org/wiki/Audio_normalization

prokoudine commented 5 years ago

Also, #14 only mentions VST/AU, but on Linux, this would be native VST2 (LXVST), LADSPA, and LV2.

Even so, there are some basic things that it would be great to have without installing all sorts of extras: basic equalizer, compressor, normalization effect, reverb. Even Ardour ended up shipping half a dozen of its own LV2 effects so that the application would be usable on any platform immediately.

itsmattkc commented 5 years ago

Yeah it would be nice to have a handful of basic audio effects out of the box. I'll see about adding some in within the next few weeks.

ghost commented 5 years ago

> Yeah it would be nice to have a handful of basic audio effects out of the box.

Why not just "fork" some basic audio effects from Audacity source code? License is compatible


P.S.: Audacity AppImage:

prokoudine commented 5 years ago

Depends on what you mean by "forking". Audacity's FX design is destructive. Olive has to be non-destructive.
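As a rough illustration of the destructive vs. non-destructive distinction: a non-destructive normalize effect would only store a gain derived from analysis and apply it at render time, leaving the source samples untouched. The class name `PeakNormalize` and its methods are hypothetical and are not Olive's or Audacity's API; peak normalization is assumed here for simplicity.

```python
from dataclasses import dataclass


@dataclass
class PeakNormalize:
    """Hypothetical effect: keeps a gain value, never edits the source."""
    target_dbfs: float = -1.0
    gain: float = 1.0

    def analyze(self, samples):
        """Derive the linear gain that moves the peak to target_dbfs."""
        peak = max((abs(s) for s in samples), default=0.0)
        if peak > 0.0:
            target_linear = 10.0 ** (self.target_dbfs / 20.0)
            self.gain = target_linear / peak
        else:
            self.gain = 1.0  # silence: nothing to scale
        return self.gain

    def process(self, block):
        """Return a new, scaled block; the input block is left as-is."""
        return [s * self.gain for s in block]


source = [0.0, 0.25, -0.5, 0.1]      # stand-in for decoded clip audio
fx = PeakNormalize(target_dbfs=0.0)
fx.analyze(source)
rendered = fx.process(source)        # source itself is unchanged
```

Because only `gain` is stored, removing the effect from the stack restores the original audio exactly.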

itsmattkc commented 5 years ago

Yes it will probably be something like that, some sort of existing MIT or GPL licensed code that I can work into Olive. Audacity may be a good place to start.

alcomposer commented 5 years ago

Obviously using VST or LV2 would be a good idea for audio effects. This will allow any existing effects (some GPL) to be utilized.

main-gi commented 5 years ago

Also try to not have VstPlugins forced folder nonsense :P

As a composer and video editor myself,

Sony Vegas and FL Studio being on the same system (well, not the program itself, but the fact that you get VST's with it) makes Sony Vegas love to crash when reading certain VSTs.

brontosaurusrex commented 5 years ago

Loudness normalization (EBU R128 'Integrated loudness I') per timeline clip would be very interesting (an interview cut, for example). This is not a dynamic-range type of thing, btw.

ffmpeg example: `ffmpeg -nostats -i clip.mp4 -vn -filter_complex ebur128 -f null - 2>&1 | grep I: | tail -n 1`

I don't think any video editor has anything like that integrated right now (they have the same filter as an export option). P.S. I don't know how well the algorithm handles very short sections, but I imagine 'Integrated loudness' will work as expected.

Edit: Example script that will echo out the correction needed if -17 dB is the goal (I wanted 6dB higher than default for web usage).
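The script mentioned here is not reproduced in the thread. A minimal sketch of the same idea, assuming the ffmpeg command quoted above and a target of -17 LUFS: take the last `I:` value from ffmpeg's stderr and report the difference to the target. The function names are hypothetical, and `sample` is a hand-written approximation of the filter's output used only to exercise the parser.

```python
import re
import subprocess


def integrated_loudness(ffmpeg_stderr: str) -> float:
    """Return the last 'I: <value> LUFS' reading found in the text."""
    matches = re.findall(r"\bI:\s*(-?\d+(?:\.\d+)?)\s*LUFS", ffmpeg_stderr)
    if not matches:
        raise ValueError("no integrated loudness value found")
    return float(matches[-1])


def correction_db(measured_lufs: float, target_lufs: float = -17.0) -> float:
    """Gain in dB that would move the clip to the target loudness."""
    return target_lufs - measured_lufs


def measure(path: str) -> float:
    """Run the ffmpeg command from the comment above on one file."""
    proc = subprocess.run(
        ["ffmpeg", "-nostats", "-i", path, "-vn",
         "-filter_complex", "ebur128", "-f", "null", "-"],
        stdout=subprocess.DEVNULL, stderr=subprocess.PIPE,
        text=True, check=True,
    )
    return integrated_loudness(proc.stderr)


# Hand-written sample resembling the filter's log output (an assumption).
sample = """\
[Parsed_ebur128_0 @ 0x0] t: 9.9 TARGET:-23 LUFS M: -22.9 S: -23.1 I: -23.5 LUFS LRA: 3.1 LU
[Parsed_ebur128_0 @ 0x0] Summary:

  Integrated loudness:
    I:         -23.4 LUFS
    Threshold: -33.9 LUFS
"""
measured = integrated_loudness(sample)
needed = correction_db(measured)
```

For a real clip, `correction_db(measure("clip.mp4"))` would give the per-clip gain to apply.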

frink commented 5 years ago

Better not to try to do everything audio inside of Olive itself. If we're really looking to the pro-grade users there is always a separate workflow for audio... Foley alone could produce 50+ tracks for a single scene. The thing we should really be thinking about is interoperability with other programs that do other things well. Olive isn't a compositor or a title editor or a script management suite or a digital audio workstation...

The best open source cross platform audio workstations are Ardour, LMMS, and Audacity. (Non DAW might compile on Windows and OSX but I don't know of any success stories...) It would make more sense to allow export and import of audio to these amazing pieces of software which have MUCH better audio editing facilities. Ardour is now used in several of the top studios in the world.

Better to work within the cosmos of open source software than try to reinvent everything in a vacuum and become like every other bloated NLE on the market. (open source or proprietary...)

prokoudine commented 5 years ago

@frink,

LMMS doesn't do audio at all, so I'm not sure why you are bringing it up.

Audacity doesn't do non-destructive effects, so it's out of the picture as well.

That leaves us Ardour which is amazing software (I know I'm biased), but is kinda overwhelming, especially for trivial use cases such as the one discussed here. Besides, Olive doesn't support JACK transport to make the most of such a cooperation.

frink commented 5 years ago

LMMS does do audio. (As seen on their homepage track labeled "Vocals") However, Ardour is where I'm biased as well. I could have mentioned another 20x of sub-par FOSS DAWs but really the only two that are prograde are Ardour and Non.

Normalizing is not a huge piece and could probably be done inside the NLE. But if Olive is really geared towards the true pro-grade workflow then audio needs to be treated as its own entity. You either add your own makeshift DAW inside the NLE like Davinci or HitFilms or you separate it out as a separate tool like Adobe or Avid.

Video editors in the open source world tend to build monoliths to try to do everything. While Blender and Kdenlive are two of the best FOSS tools that we have for video to date both of them try to do too much inside themselves. It's better to have programs that work nice with others... (Like Openshot and Shotcut opening SVG in Inkscape for example...)

While I'm not opposed to the Davinci/HitFilms approach, I've just not seen video developers think clearly about the needs of audio editors... At a minimum we need to have a basic DAW with unlimited tracks, VST support, 3D panning, muting/soloing and bus routing facilities to even consider it as a solution for serious projects. (I believe Davinci 15 has everything mentioned except bus routing - not sure about HitFilms - the newest Lightworks also builds in this direction...)

My point was less about adding normalization and more about distinguishing between pro-grade use cases and hobby and prosumer use cases. The pros typically don't expect the NLE to normalize their audio unless they are simply wedding videographers. Anything more and they are using a DAW...

brontosaurusrex commented 5 years ago

@frink, Audio normalization can be used for editing purposes only and audio could then be exported for audio pros with that/or any filter turned off.

What is the equivalent of 'export to omf' in open source world?

frink commented 5 years ago

@brontosaurusrex - I don't know of OMF support in open source projects yet. Really the whole idea of interchangeability is sadly neglected by most projects. OpenShot and Blender are the only tools I've seen that open external editors for things. Ardour can do this for audio...

In theory, Ardour is setup perfectly for ADR with XJadeo, but it's a pain to get everything setup and working between Ardour and a video editor. OMF would certainly be nice. The Natron project could benefit from some sort of interchange between Compositor and NLE.

As you pointed out, we need better interchangeable formats. But that requires a lot of collaboration and co opted development between projects. Setting up the meetings necessary to kick this off is something difficult to do in the open source world on a shoestring budget. But I think there are enough mature projects that you might be able to bring several projects to the table.

Imagine Olive, Inkscape, Krita, Kdenlive, Gimp, Darktable, Flowblade, Ardour, Blender, Natron, etc... all working seamlessly together on one project with interactivity between them on par with Adobe Suite. No reason it couldn't happen. But you really need "one ring to bring them and bind them..." Could this "one ring" be Olive? ...Dunno yet.

prokoudine commented 5 years ago

Ardour devs have a very strong opinion about OMF. I believe the exact word was "godawful" :) Maybe @x42 could clarify on that.

For video, there is OpenTimelineIO, about to be used by Pitivi.

frink commented 5 years ago

I love @x42. He is almost as opinionated as Linus! But like Linus he is usually right. I've not taken a look at OMF but I'd venture to say that if it's an interchange format from commercial interests who don't audit each other's code then it will leave something to be desired in the open source world.

An open timeline format could work. In the end, I don't think Olive has the political clout to solve this issue.

See: https://xkcd.com/927/

brontosaurusrex commented 5 years ago

Just to clarify: I was thinking about how to ~~get rid of~~ make audio people happy and make some 'simple' but useful tools inside Olive at the same time, nothing OMF-specific (just an example).

edit: Not sure about the copyrights/licensing, but AAF looks a lot more interesting and specs (huge pdf) seems to be known. Another option might be some sort of xml.

frink commented 5 years ago

@brontosaurusrex - I understand that temporary audio normalization, for editing dialog in particular, may be necessary to get to the point of ADR. Now that I see your use case, I agree it makes sense.

Ideally, an adaptive compressor/limiter would be your best tool here if the user didn't have to tweak the settings and could simply flip a switch. I'd call it, "adaptive audio sweetening..." or something else sophisticated that sounds like EIS for audio.

x42 commented 5 years ago

Regarding AAF (and OMF) there's a pertinent rant at https://discourse.ardour.org/t/aaf/87442 . There is more than just the object spec: https://github.com/nevali/aaf/tree/master/doc

Realistically a relatively simple EDL format such as CMX3600 may be more appropriate. The downside is that it doesn't include media. But I expect a flat zip with media files and a basic text EDL will cover 95% of all cases and also work with a wide variety of NLEs and DAWs.
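To make the "flat zip plus text EDL" idea concrete, here is a small sketch. It assumes non-drop-frame timecode and only approximates the CMX3600 column layout; the helper names, the `timeline.edl` file name and the reel name `AX` are illustrative choices, not something specified in this thread.

```python
import io
import os
import zipfile


def frames_to_timecode(frames: int, fps: int) -> str:
    """Non-drop-frame HH:MM:SS:FF timecode for an integer frame rate."""
    ff = frames % fps
    total_seconds = frames // fps
    hh, rem = divmod(total_seconds, 3600)
    mm, ss = divmod(rem, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def edl_event(number, reel, track, src_in, src_out, rec_in, rec_out, fps):
    """One cut ('C') event line; column widths are approximate."""
    tcs = " ".join(frames_to_timecode(f, fps)
                   for f in (src_in, src_out, rec_in, rec_out))
    return f"{number:03d}  {reel:<8} {track:<5} C        {tcs}"


def write_bundle(target, edl_text, media_paths=()):
    """Write the EDL and media files side by side into one flat zip."""
    with zipfile.ZipFile(target, "w") as zf:
        zf.writestr("timeline.edl", edl_text)
        for path in media_paths:
            # basename keeps the archive flat (no directories)
            zf.write(path, arcname=os.path.basename(path))


fps = 25
lines = ["TITLE: Example", "FCM: NON-DROP FRAME", "",
         edl_event(1, "AX", "A", 0, 125, 90000, 90125, fps)]
edl_text = "\n".join(lines) + "\n"

buffer = io.BytesIO()           # in-memory stand-in for a file on disk
write_bundle(buffer, edl_text)  # no media attached in this demo
```

A receiving DAW would then only need to read the text file and look up each reel or clip name among the files next to it.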

As for the original issue that's discussed here, it seems you're looking for an "auto-mixer" or "leveler", a slow compressor for voice, vocals or dialog.
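A very rough sketch of what such a "leveler" might do: follow the signal level with a slow envelope and scale each sample so the level drifts towards a target. This is a simplified illustration, not how any particular plugin works; the function name, the mean-absolute-level envelope and all parameter defaults are assumptions.

```python
import math


def level(samples, sample_rate, target=0.25, time_constant_s=0.5,
          max_gain=8.0):
    """Slowly pull the average absolute level towards `target`."""
    # One-pole smoothing coefficient for the chosen time constant.
    coeff = math.exp(-1.0 / (time_constant_s * sample_rate))
    envelope = target          # start at unity gain to avoid a burst
    out = []
    for s in samples:
        envelope = coeff * envelope + (1.0 - coeff) * abs(s)
        if envelope > 1e-9:
            gain = min(max_gain, target / envelope)
        else:
            gain = max_gain    # near silence: cap the boost
        out.append(s * gain)
    return out


rate = 100                                  # toy sample rate for the demo
quiet = level([0.05] * 1000, rate)          # 10 s of a quiet constant signal
loud = level([0.5] * 1000, rate)            # 10 s of a loud constant signal
```

Both test signals end up near the same level after a few seconds, while the first samples are almost untouched, which is the "slow" part of a slow compressor.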

alcomposer commented 5 years ago

@x42 It would be great if you could share your thoughts re: OpenTimelineIO #310

frink commented 5 years ago

@unfa should also get in on this... He has a very different view from @x42. But he's one of the few that is consistently making tutorials for Ardour and other Linux open source audio tools and is generally well informed on usability concerns... Kinda like what @EposVox does with OBS Studio. (Another great guy to include in this discussion who has made some very well thought out rants on why he's not interested in Linux as a video production platform...)

Very deep thinkers all!!!

x42 commented 5 years ago

On 3/15/19 7:15 AM, alcomposer wrote:

> @x42 It would be great if you could share your thoughts re: OpenTimelineIO #310

Thanks for reminding me. I should have mentioned OpenTimelineIO instead-of, or at least in addition to CMX3600.

It is a very well done EDL description format, where the information is nicely structured. It also only includes essential information and mostly leaves it at that. I very much like that time is represented as integer ratios (no floating point rounding issues). The JSON encoding is icing on top.
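The rounding point can be shown without touching OpenTimelineIO at all. The sketch below uses Python's standard `fractions.Fraction` purely to illustrate why integer ratios avoid drift; it is not OTIO's API, and `frame_start` is a hypothetical helper.

```python
from fractions import Fraction

# One frame at 29.97 fps, expressed exactly as 1001/30000 seconds.
FRAME_DURATION = Fraction(1001, 30000)


def frame_start(frame_index: int) -> Fraction:
    """Exact start time of a frame, in seconds."""
    return frame_index * FRAME_DURATION


# Binary floating point cannot represent these decimal values exactly...
float_ok = (0.1 + 0.2 == 0.3)
# ...while the same arithmetic on integer ratios is exact.
ratio_ok = (Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))

exact = frame_start(30000)   # exactly 1001 seconds, no accumulated error
```

With rational times, two applications that both compute "frame 30000" arrive at the identical value, so edits line up to the sample.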

frink commented 5 years ago

@x42 - Is this something you think Paul would be interested in? Some sort of interchange between audio and video editors... A way to share projects?

frink commented 5 years ago

PS - we've got several tickets that are all discussing OpenTimelineIO; we should probably link those discussions to a central ticket and make sure we only have this conversation in one place.

x42 commented 5 years ago

On 3/17/19 10:14 PM, frink wrote:

> @x42 - Is this something you think Paul would be interested in? Some sort of interchange between audio and video editors... A way to share projects?

Yes, except the problem is that Ardour likely won't ever do video composition, not even basic video edits.

But importing corresponding audio that follows the video-cuts is something that we are interested in.

So far Ardour gets away with this by using BWAV, i.e. a timestamp in the audio file. Another current option is to import multi-channel audio where one channel carries an LTC signal. Ardour can line this up.
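The timestamp approach boils down to simple arithmetic. Broadcast WAV files carry a time reference in their `bext` chunk, counted in samples since midnight; the sketch below assumes that value has already been read from the file and only shows the placement step. The function name and the example times are made up for illustration.

```python
from fractions import Fraction


def placement_seconds(time_reference: int, sample_rate: int,
                      session_start_seconds: int) -> Fraction:
    """Position of a clip on a timeline that starts at a given time of day.

    time_reference: samples since midnight, as stored in the file.
    session_start_seconds: timeline origin, in seconds since midnight.
    """
    recorded_at = Fraction(time_reference, sample_rate)
    return recorded_at - session_start_seconds


# A file stamped 10:00:00 at 48 kHz, dropped into a session that
# starts at 09:59:50, should land 10 seconds into the timeline.
stamp = 10 * 3600 * 48000
offset = placement_seconds(stamp, 48000, 9 * 3600 + 59 * 60 + 50)
```

Every file recorded against the same clock can be placed this way without any edit list at all, which is why it works as a stopgap.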

In general this would be one-way. After the video-edit is done, apply the same edits to the audio in the DAW and then move on. That'll help a lot with dialog and ambiance recorded on set.

A /classic/ film workflow: finish video-editing and only then begin work on the soundtrack. Mix and master audio while the video guys do color-correction.

Allowing for forth and back while editing is not something that we're interested in, at least not at this point in time.

I'll have to check OpenTimelineIO again if it provides unique IDs for each region/edit that a DAW or NLE can use to match its internal regions or assets. That way there might be a possibility of incremental updates.

alcomposer commented 5 years ago

@x42 It is my understanding that OpenTimelineIO is intended to make interchange work. So yes it should allow for at very least incremental updates.

frink commented 5 years ago

@x42 - I agree that we have to at least get to the classic workflow you describe before we can do the back and forth. But I think we do need to shoot for that direction eventually. In video editing I've seen hybrid approaches much more often than the classic workflow you described.

In music video it's the other way round completely. You do the audio first and then the video. Buggles prophecy has come true concerning video killing radio stars... The production of the music video is now just as important as the audio for most commercially viable music. Here I think the NLE needs to step up adding beat detection to tempo-lock edits against the music timeline. (Does any NLE do this yet?) An interchange format here would need to export not only the audio but hopefully the tempo map as well. This may be a simple place to start as it is unidirectional and a unique use case in its own right.
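As a sketch of what tempo-locked edits could mean in practice: given a tempo, snap a proposed cut to the nearest beat and convert that to a frame number. This assumes a single constant tempo rather than a full tempo map, and the function names are hypothetical.

```python
def snap_to_beat(time_s: float, bpm: float, first_beat_s: float = 0.0) -> float:
    """Move a time to the nearest beat of a constant-tempo grid."""
    beat_len = 60.0 / bpm
    beats = round((time_s - first_beat_s) / beat_len)
    return first_beat_s + beats * beat_len


def to_frame(time_s: float, fps: float) -> int:
    """Nearest video frame for a time in seconds."""
    return round(time_s * fps)


cut = snap_to_beat(1.3, bpm=120)     # beats fall every 0.5 s at 120 BPM
cut_frame = to_frame(cut, fps=24)
```

A real tempo map would replace the single `bpm` value with a list of tempo changes, but the snapping step stays the same.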

In documentary style editing it often goes back and forth. Having basic audio timing facilities in the NLE makes a lot of sense here. Also, that's where "normalization" really comes in because editors need to be able to hear everything at relatively the same volume. This is probably the closest to the classic workflow in that the edits are finished first. In fact, most documentaries do the bulk of the audio work inside the NLE these days.

In feature films (especially those with tons of special effects) they break the project into scenes and finally stitch it all together after all the compositing and foley work is done. Each scene project may go through several iterations bouncing back and forth between sound design, ADR, music sync, coloring, compositing and sometimes resequencing the scene entirely. This is where the dynamic bilateral format is so important.

Across the board we see the modern workflow beginning to focus on fluidity rather than standard conveyor belt production models. With the advent of digital composition, directors often demand flexibility from composers, allowing for resequencing right up until the final mixdown and mastering. Maddening for music people - but this is the modern world...

Really you can think of the new film workflow as being most similar to Agile Scrum in the software development world. There are small goals and short deadlines for a specific piece of video and then we move on to the next set of tasks until it is complete.

All that to say, the modern workflow has a lot more expectation of give and take between the DAW and the NLE. Therefore, we need tighter integration of interchange formats between DAW and NLE to allow for changes that occur and bilaterally transfer edits in a decoupled way - often with complex collaborative approval workflows that defy hierarchy. Quite a lot to consider. But that's the modern demand of coupled video and audio production.

alcomposer commented 5 years ago

> Yes, except the problem is that Ardour likely won't ever do video composition, not even basic video edits.

@x42 Yes, only allow Ardour to load the audio (and video track/s somehow?) with no video editing at all. Only use Ardour for DAW.

frink commented 5 years ago

@alcomposer see https://github.com/x42/xjadeo and http://manual.ardour.org/video-timeline/ - mostly @x42's work - great job too!

That's why I'm addressing @x42 directly here. He is the reason that Ardour has a video timeline at all.

It will always be a singular timeline WITHOUT audio. The purpose is to give audio video sync subframe accuracy for doing video sync work for both composition, foley and ADR. Granted the Foley and ADR workflow still need a bit of polish... Works great for composition!

Thanks everyone for contributing to our collective understanding on the interactivity between DAW and NLE...

frink commented 5 years ago

This might be the right solution here: https://github.com/irungentoo/filter_audio

It's a fairly simple way to get quality audio for WebRTC. But this is the same type of things we're talking about when we look for good sounding dialog file. I may take this and turn it into a VST when I have some free time.

It's probably better than straight compression.

musaire commented 5 years ago

Compression and EQ, pan and gain are important! :) :+1: For the last two, also the envelopes/keyframe use is quite essential.

On the other hand, normalization is not that important because it should mean (by its definition, please don't mess this up in UI) evening out gain between clips or bringing the clip's gain level to target value. This can be done manually in some extent. Normalization is for rough audio work only or to set the initial, rough, gain levels - we always go over manually anyway for more precise work.

It should be possible to apply normalization to the entire track. I guess we can also select multiple clips :) - not sure how effects can be applied in Olive atm. Do we have clip-level, track-level and/or project-level effects?
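Normalization in the sense described here (bringing each clip's level to a target value so that clips are evened out) could look roughly like this for a selection of clips. RMS level is assumed as the measure, and the function names and the -20 dB target are illustrative only.

```python
import math


def rms_db(samples):
    """RMS level of a clip in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")      # digital silence
    return 10.0 * math.log10(mean_square)


def normalize_gains(clips, target_db=-20.0):
    """Per-clip gain in dB that brings each clip to the target level."""
    gains = {}
    for name, samples in clips.items():
        level_db = rms_db(samples)
        # Leave silent clips alone instead of applying infinite gain.
        gains[name] = 0.0 if math.isinf(level_db) else target_db - level_db
    return gains


selection = {
    "interview_a": [0.1, -0.1, 0.1, -0.1],      # about -20 dB RMS
    "interview_b": [0.01, -0.01, 0.01, -0.01],  # about -40 dB RMS
    "room_tone": [0.0, 0.0],
}
gains = normalize_gains(selection)
```

The resulting values are only starting points for the initial rough levels; manual gain and keyframes would still be used afterwards.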

musaire commented 5 years ago

And, yes, I guess, I'd use VST for as many effects as I could. Compression can be very complex, with lots of parameters. Also, pros look for good sounding EQ if there is music involved. Gain and pan will suffice! :) Some simple EQ and compression tools are useful (but not a priority if VSTs are present) for every speech in every movie or documentary. VST support is (was) the priority. :) But OpenFX even more - I can't do without the FilmConvert plugin.

sobotka commented 5 years ago

> But OpenFX even more - I can't do without the FilmConvert plugin.

I have some bad news for you then...

musaire commented 5 years ago

> I have some bad news for you then...

:( Pardon my ignorance, was OFX never planned, or is it just too far ahead? Then I have to recall what other possibilities there were besides OFX for using FilmConvert. I guess Olive needs to scan their own films in, then. :D Or something similar to what DxO Optics did for film emulation; I can't remember whether they scanned as well. For sure we need a grain effect that acts true to real film (grain attributes based on luminance, chroma, and the specific film size used). A grain matte can be composited, though that is not as convenient and not chroma-dependent out of the box.

frink commented 5 years ago

@sobotka What are you saying about OpenFX?

sobotka commented 5 years ago

OFX is unmanaged, which means it currently is entirely incompatible with the current overarching design direction. It also means OFX plugins are of questionable merit.

frink commented 5 years ago

I see. I thought the Foundry used to manage it. Guess I've not kept up very well.

Devil's Advocate: There are a ton of OpenFX plugins that exist. OpenFX is supported by a large variety of editors; that may be the best we've got. A standard doesn't have to be maintained once published.

Angel's Advocate: Something abandoned is probably not worth including as a standard in a new editor. Most of the functionality of these plugins is supported internally by modern editors. Using a plugin system effectively requires a node composite system.

I personally don't care how we go on this. Natron supports OpenFX if you want to go that direction...