reaper-oss / sws

The SWS extension is a collection of features that seamlessly integrate into REAPER, the Digital Audio Workstation (DAW) software by Cockos, Inc.
MIT License

FR: Loudness scanning and normalization - new requirements #1046

Open AironAudio opened 5 years ago

AironAudio commented 5 years ago

Netflix has just changed its delivery requirements. We now need to use a dialogue-gated measurement (Dolby's dialogue gate code is now free) based on the 1770-1 version of the loudness measurement standard. That's the 1770 standard without a relative gate.

http://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1770-0-200607-S!!PDF-E.pdf

The libebur128 library has been updated as well, and would need to be modified to bypass the relative and absolute gates.
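For reference, here is a minimal sketch of what bypassing the gates means numerically. It assumes the K-weighted, channel-summed mean-square energy of each gating block has already been computed; the function names are hypothetical and are not libebur128's API. The constants are the ones from BS.1770: the -0.691 offset, the -70 LKFS absolute gate, and the relative gate 10 LU below the absolute-gated loudness.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Convert a mean-square energy to loudness in LKFS (BS.1770 offset of -0.691).
double energyToLoudness(double energy) {
    return -0.691 + 10.0 * std::log10(energy);
}

// Inverse of energyToLoudness().
double loudnessToEnergy(double lkfs) {
    return std::pow(10.0, (lkfs + 0.691) / 10.0);
}

// Mean of all block energies strictly above `threshold`; 0 if none qualify.
double meanAbove(const std::vector<double>& blocks, double threshold) {
    double sum = 0.0;
    std::size_t count = 0;
    for (double e : blocks) {
        if (e > threshold) { sum += e; ++count; }
    }
    return count ? sum / count : 0.0;
}

// Ungated measurement (1770-1 style): every block contributes.
double ungatedLoudness(const std::vector<double>& blocks) {
    assert(!blocks.empty());
    double sum = 0.0;
    for (double e : blocks) sum += e;
    return energyToLoudness(sum / blocks.size());
}

// Gated measurement (1770-2 and later): absolute gate at -70 LKFS, then a
// relative gate 10 LU below the loudness of the absolute-gated blocks.
double gatedLoudness(const std::vector<double>& blocks) {
    assert(!blocks.empty());
    const double absThreshold = loudnessToEnergy(-70.0);
    const double absGatedMean = meanAbove(blocks, absThreshold);
    if (absGatedMean == 0.0) return -HUGE_VAL;  // every block was gated out
    const double relThreshold = absGatedMean * std::pow(10.0, -10.0 / 10.0);
    // A block must pass both gates to be counted.
    return energyToLoudness(meanAbove(blocks, std::max(absThreshold, relThreshold)));
}
```

For block energies {1.0, 0.05}, the relative gate drops the quiet block, so gatedLoudness() returns -0.691 LKFS while ungatedLoudness() returns about -3.49 LKFS.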

Why Reaper? Speed. I normalize the items of the 5.1 or 2.0 mixes in Reaper with the SWS extension and do the split-to-mono playouts that I need to pack up and send off. With other tools there's at least one more in-between step, as none of them seem to be able to generate single-channel files. Thus, I'd rather use Reaper, even if I need to get another tool in the meantime. Reaper's just the better tool.

Currently the Loudness portion of the extension uses the BS.1770-3 or BS.1770-4 method of measuring as well as the EBU R128 recommendation.

The whole thing is based on this article which links to the actual recommendations: https://www.pro-tools-expert.com/home-page/2018/9/18/netflix-respond-to-our-article-on-their-new-loudness-delivery-specs

Basically it's a dialogue gate whose output is scanned by the ITU 1770-1 method. We're using the pure 1770-3 method IIRC, or perhaps even 1770-4.

The code for the Dolby dialogue gate is completely free. It's available on request via this link: https://www.dolby.com/us/en/technologies/speech-gating-reference-code.aspx

The Netflix recommendation is here: https://help.prodicle.com/hc/en-us/articles/360001759947-Netflix-Audio-Mix-Specifications-and-Best-Practices

Reaper's extension architecture lends itself so well to offline scanning that I'm hoping there's a programmer with a few spare hours to slot an extra mode into the existing Loudness portion of the SWS extension.

I've got no C++ IDE experience at all, or I'd try to do this myself. I'd donate a good chunk to get this into the SWS extension. The tools that do it more slowly than Reaper cost at least $400, so I'd be perfectly willing to drop at least $200 in donations.

AironAudio commented 5 years ago

After filling out the form on the Dolby site, you can download the code right away. Nobody sits there approving any organization or person.

Checking the "Dolby Dialogue Intelligence" code, it's very well documented, pure ANSI C. It converts the input to 16 kHz, runs it through a bunch of detectors and combines the results into a value indicating whether speech is present or not. That value probably has to be used to control a gate or volume parameter across the sections.

Still studying the stuff.

nofishonfriday commented 5 years ago

Just took a look at the Dolby Dialogue Intelligence source files, it says:

This program is protected under international and U.S. copyright laws as an unpublished work. This program is confidential and proprietary to Dolby Laboratories. Reproduction or disclosure, in whole or in part, or the production of derivative works therefrom without the express permission of Dolby Laboratories is prohibited.

Someone may correct me, but I don't think it's possible to incorporate this in SWS, being open source / MIT licensed, is it?

AironAudio commented 5 years ago

That is correct. For that reason I may need to make a separate extension whose source code can only be made available to those who have obtained the source from Dolby itself, which is free and only requires filling out that form.

I'm investigating how to do that. I've got Visual Studio installed, got some extension code from cfillion to start with, and all the other source code, but to be honest, that could take a very long time, because my experience level with C++ is almost zero.

cfillion commented 5 years ago

The foreign (Dolby's) source code does not need to be integrated into the open-source codebase. Each interested developer could download their own copy and have the open-source part pick it up at build time (this could somewhat complicate things for automated CI builds, which seem to be on the table, however).

Another possibility is to make the Dolby code available to SWS's existing loudness code via a DLL (assuming supporting that and 1770-1 doesn't require rewriting the whole thing... in which case reinventing the wheel from scratch in a new extension might be easier).

AironAudio commented 5 years ago

The DLL approach seems feasible. Thank you. If I can figure out how to make a DLL of that ANSI C code, perhaps someone at SWS can create a modified copy of the existing loudness code from the libebur128 library. The Dolby code is supposed to act as an on/off gate only.

The flow: scan a sample block with the dialogue gate, gate the sample block, then scan it with the 1770-1 loudness method. That 1770-1 version is the same as the 1770-3/4 version, just without the relative gate.
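That block flow could be sketched roughly as follows. This is only an illustration under stated assumptions: a hypothetical per-block speech flag stands in for the output of the Dolby Dialogue Intelligence code (its real API is not shown or implied), each block's K-weighted mean-square energy is assumed to be available already, and the struct and function names are made up.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One analysis block. Both fields are assumed to be computed elsewhere.
struct Block {
    double energy;    // K-weighted, channel-summed mean-square energy
    bool   isSpeech;  // hypothetical on/off decision from a dialogue detector
};

// Averages only the blocks flagged as speech, then converts the result to LKFS
// with the ungated 1770-1 formula. Returns false if no speech was found, in
// which case lkfsOut is left untouched.
bool dialogueLoudness(const std::vector<Block>& blocks, double& lkfsOut) {
    double sum = 0.0;
    std::size_t speechBlocks = 0;
    for (const Block& b : blocks) {
        assert(b.energy >= 0.0);     // mean-square energy cannot be negative
        if (!b.isSpeech) continue;   // the dialogue gate: drop non-speech blocks
        sum += b.energy;
        ++speechBlocks;
    }
    if (speechBlocks == 0 || sum <= 0.0) return false;
    lkfsOut = -0.691 + 10.0 * std::log10(sum / speechBlocks);
    return true;
}
```

Because the detector only acts as an on/off gate, loud non-speech blocks (music, effects) do not influence the result at all.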

Learning C++ atm, so I don't know how long this might take. Working a lot at the same time too. I'll be asking questions :)

nofishonfriday commented 5 years ago

@AironAudio I'm currently in the process of updating the libebur128 library used in SWS to the latest version (v1.2.4), something I've had in mind to do for a while anyway. First runs with the EBU-provided test files look OK so far. Here's an SWS test version with the updated library: https://www.dropbox.com/s/kv2383b2tawjiai/reaper_sws64.dll?dl=1

Branch is here: https://github.com/nofishonfriday/sws/tree/libebur128_-_update_to_v1.2.4

Would be nice if you (or anyone interested) could give it a test drive and compare results with other Loudness scanners if you have a chance. I only have the free r128gain available. Results also seem ok there so far, but it doesn't show short term / momentary max.

Regarding implementing the Netflix loudness scanning in SWS: one idea would be to file a feature request directly at https://github.com/jiixyj/libebur128 for functions to bypass the gating; maybe they'll also find it useful.

This would save us from doing the mod ourselves.

AironAudio commented 5 years ago

Finally found time to do a couple of scans. I'm currently evaluating another application, Nugen LM Correct.

The 1770-4 scans from it match the updated library you posted for integrated loudness. The maximum short-term (3-second) loudness and the maximum momentary loudness do vary a bit. LM Correct gets higher values quite consistently, so they must be measuring things a little differently.
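One possible source of such differences, offered only as an illustration: momentary loudness uses a 400 ms sliding window and short-term loudness a 3 s one, and the reported maximum depends on how often the window is evaluated. The sketch below assumes the K-weighted mean-square energy of consecutive 100 ms sub-blocks is already available. The function names are hypothetical, and this is not a description of how SWS, libebur128 or LM Correct work internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Maximum loudness (LKFS) over windows of `windowLen` consecutive 100 ms
// sub-blocks, evaluated every `hop` sub-blocks. Returns -infinity if the
// input is shorter than one window or contains only silence.
double maxWindowLoudness(const std::vector<double>& subBlockEnergy,
                         std::size_t windowLen, std::size_t hop) {
    assert(windowLen > 0 && hop > 0);
    double maxEnergy = 0.0;
    for (std::size_t start = 0; start + windowLen <= subBlockEnergy.size(); start += hop) {
        double sum = 0.0;
        for (std::size_t i = 0; i < windowLen; ++i) sum += subBlockEnergy[start + i];
        maxEnergy = std::max(maxEnergy, sum / windowLen);
    }
    return maxEnergy > 0.0 ? -0.691 + 10.0 * std::log10(maxEnergy) : -HUGE_VAL;
}

// Momentary: 400 ms window (4 sub-blocks of 100 ms).
double maxMomentary(const std::vector<double>& e, std::size_t hop = 1) {
    return maxWindowLoudness(e, 4, hop);
}

// Short-term: 3 s window (30 sub-blocks of 100 ms).
double maxShortTerm(const std::vector<double>& e, std::size_t hop = 1) {
    return maxWindowLoudness(e, 30, hop);
}
```

For a 400 ms burst occupying sub-blocks 2 to 5 of an otherwise silent signal, a hop of 1 finds a window that covers the whole burst, while a hop of 4 only ever sees half of it and reads about 3 dB lower.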

The files that have to check out are the test files. I got them here: https://tech.ebu.ch/publications/ebu_loudness_test_set

And here's a spreadsheet of exported values from the Loudness window for all those files: https://www.dropbox.com/s/b548kfxwolwy1cu/SWS%20-%20libebur128%20v142%20-%201770-4%20scans.ods?dl=0

What these values need to be is detailed here: https://tech.ebu.ch/docs/tech/tech3341.pdf on page 10+

and here https://tech.ebu.ch/docs/tech/tech3342.pdf

They all check out for "file based" scanners as far as I can tell, except for the seq-3341-13-*.wav files. Those are the only ones that do not check out, so there's something up with the momentary calculations. Only the first file, seq-3341-13-1-24bit.wav, checks out at -23 LUFS maximum momentary loudness.

This is something we need to post for the libebur128 guys too.

Other than that, it's all great. Integrated loudness over entire mixes checks out fine, as do all the true-peak measurements, which I never use, however, since my oversampled limiter takes care of those.

nofishonfriday commented 5 years ago

That's helpful, thanks.

I just did a loudness scan of the EBU test files with libebur128 v1.0.1, which is what SWS currently uses, and I get the exact same results (including short-term / momentary max.) as with my updated v1.2.4 version, so I'm not sure it's even worth updating the lib (it seems none of the fixes affect us). The lib is quite heavily modified for use in SWS, so the update was not that straightforward and may have introduced new issues. I'll probably submit it to SWS as an optional pull request and ask for a review.

nofishonfriday commented 5 years ago

Posted at libebur128 about the seq-3341-13-* files, seems the issue is on our end. https://github.com/jiixyj/libebur128/issues/93

Will try to look into it.

audionuma commented 5 years ago

Concerning max momentary issue, see my comment on the libebur128 closed issue: https://github.com/jiixyj/libebur128/issues/93#issuecomment-429043548

nofishonfriday commented 5 years ago

Found the culprit, thanks to @audionuma's hint.

[Image: loudn2]

BeyondVertical commented 5 years ago

@nofishonfriday: When can we use it in SWS? I had also noticed this when testing with the EBU files.

AironAudio commented 5 years ago

Nice one. With that we have 1770-4 compliance.

Next thing on my bucket list is trying to make a DLL out of the Dolby Dialogue Gate code. I've got most of the stuff installed and tutorials at hand, because I realize it's an extremely niche thing I need.

I'm using the Nugen LM Correct tool atm, and it's a bit slow. You can't turn off truepeak scanning, and I only analyze files and do all the gain changes in Reaper anyway, because that's where I render to multiple mono files with the limiter of my choice, and across as many episodes as I need. The Nugen tool is one-at-a-time and has no drag'n'drop. No commandline version either.

That's my motivation to get the Dolby Dialogue Intelligence Gate into the SWS extension. I'm hoping it won't take terribly long. Breeder is terribly busy.

I will ask questions once I get that library compiled somehow. Maybe even create a little extension. Who knows.

nofishonfriday commented 5 years ago

I've added a new 'high precision' analyzing mode which should give correct results with the seq-3341-13-* files. It can be set in the 'SWS/BR: Analyze loudness...' window -> 'Options' or via the new action 'SWS/BR/NF: Toggle use high precision mode for loudness analyzing'.

Test version here (Win x64) Branch here

AironAudio commented 5 years ago

Nice. Does that mean it may be easier adding other scan modes in the future as well? I'll give the build a test run here too. I'm finishing up a bunch of mixes in the coming days and will try to get some coding done in the week or so of free time.

nofishonfriday commented 5 years ago

Thanks if you get a chance for testing.

> Does that mean it may be easier adding other scan modes in the future as well?

To a degree yes, as the code foundation for choosing different scanning modes is now there. Of course the much bigger task would still be actually implementing the other scanning modes (the 'high precision' scanning mode is a relatively minor change to the default scanning mode.)

Regarding the Netflix scanning mode: as suggested above, you could file a feature request at libebur128 directly for optionally disabling the gating. It can't hurt, methinks. If they approve, that would at least save us from doing this part (modifying the library) ourselves.

AironAudio commented 5 years ago

Done. https://github.com/jiixyj/libebur128/issues/94

nofishonfriday commented 5 years ago

Did some work on it recently. I built a static library (.lib) from the Dolby source which SWS can be linked against when building. The dialogue detection / gating already basically works when doing Loudness scanning.

@cfillion (or anyone):

> Another possibility is to make the Dolby code available to SWS's existing loudness code via a DLL

So do I have it right that adding the Dialogue Intelligence libs (32- and 64-bit) and the corresponding include (header) files to the SWS repo should be OK? It would probably be best to get approval (or not) from @swstim, because if the answer is no I wouldn't continue working on this.

cfillion commented 5 years ago

Adding compiled binaries (like .lib) to a source code repository isn't good practice (I know we already have pre-compiled taglib hosted here, ugh...). Ideally dependencies should be built along or before SWS.

I don't think Dolby's license terms (https://github.com/reaper-oss/sws/issues/1046#issuecomment-424701669) allow us to copy their headers either... I think it should work like we do with WDL: downloaded separately.

I suggested making it a DLL because I don't know if we can safely build it straight into SWS. The DLL is no better, but at least it wouldn't be part of the SWS DLL itself and could be made as an optional non-open source addon.

nofishonfriday commented 5 years ago

Thanks. I got the idea of doing a static library when I saw taglib hosted here, indeed. But if you say it isn't good practice, OK. Having the Dolby source code downloaded separately doesn't seem very feasible to me either, as it's not as straightforward as with WDL: just cloning a repo vs. signing an agreement, which maybe not everyone working on SWS wants to do. Then SWS build cases would probably have to be made (is the Dolby code present or not), and so on, so I think it's too much hassle to continue working on this (as far as I'm concerned, at least). Pity...

nofishonfriday commented 5 years ago

I'm too motivated to get this into SWS to give up just yet. :) It can now be built directly from the Dolby source along with SWS (no pre-compiled .lib or .dll) by putting the source in a dedicated SWS directory (not under version control), and I've used #ifdefs so SWS can be built with or without the Dialogue scanner, depending on whether the source is present.
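A minimal sketch of such a build-time switch follows. The header path `dolby_di/dialogue_intelligence.h` and the macro `SWS_HAVE_DIALOGUE_INTELLIGENCE` are hypothetical placeholders, not names taken from the actual branch. The sketch uses `__has_include` (C++17) to detect whether the separately obtained source is present; a build system could define the macro instead.

```cpp
#include <cassert>
#include <string>

// Hypothetical location: the separately downloaded Dolby source is expected in
// a directory that is not under version control.
#if defined(__has_include)
#  if __has_include("dolby_di/dialogue_intelligence.h")
#    include "dolby_di/dialogue_intelligence.h"
#    define SWS_HAVE_DIALOGUE_INTELLIGENCE 1
#  endif
#endif

// True only in builds that found the optional source at compile time.
constexpr bool dialogueScannerAvailable() {
#ifdef SWS_HAVE_DIALOGUE_INTELLIGENCE
    return true;
#else
    return false;
#endif
}

// Names of the loudness modes this particular build can offer to the user.
std::string availableModes() {
    std::string modes = "integrated";
    if (dialogueScannerAvailable()) modes += ", dialogue";
    return modes;
}
```

A build without the Dolby source compiles cleanly and simply does not offer the dialogue mode, so contributors who have not signed the Dolby agreement are unaffected.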

@swstim / @cfillion Does this sound like a viable approach?

Sneak peek: the test file that's included with the Dolby source (containing dialogue and music), scanned with VisLM ('Netflix' preset) and the 'SWS Dialogue loudness scanner'.

[GIF: scan results from VisLM ('Netflix' preset) and the SWS Dialogue loudness scanner]

AironAudio commented 5 years ago

Good grief, you got it going already?! What an accomplishment. I'm still learning C++ over here in the dunce corner.

Nofishonfriday, what test files did you use? I can use the NuGen LM Correct software to verify more test files if you like. I've got plenty of test files.

Since the license in the SWS source code is likely not compatible with the Dolby license, would a separate DLL download be a whole lot more work? The SWS site or GitHub could host it, but the source would need to stay private, I presume.

Maybe some folks could rewrite all parts of it to make it their own and put that out as open source. That could be a good challenge down the road. The library is useful for speech detection. In that regard, providing functions to Reaper scripting users could prove beneficial. Who knows what people could put together with reliable speech detection. It could help to develop an exceptional, almost level-independent noise gate for noisy speech sources, for example.

Also, other DSP-savvy folks could optimize this in a good way too. The source documentation already mentions spots for optimization.

nofishonfriday commented 5 years ago

@AironAudio If you're interested in testing, I can PM you a test version soon(ish), so we can see if it's even worth continuing this project. First test results look quite ok here, but I have limited ('real life') test material.

AironAudio commented 5 years ago

I've got plenty. I'll gladly give it a run with a few dozen mixes and compare results to the NuGen tool.

For final distribution there seem to be at least two solutions. A binary-only download, which is perfectly legal if there's no code in the binary that requires making the source code available. Personally that would be my least favorite choice, since our community thrives on access to new tools, and that would deny many that access.

The second idea, already expressed, is to have just the DDI (Dolby Dialogue Intelligence) library as a binary-only download, with the source available on request on a private GitHub repo. The SWS extension would check for its existence and offer certain functions only if it is available. That's probably a lot more work though.

But we could ask other community members for help in providing such features for other occasions as well. For example, SWS could provide a frontend to the FFmpeg or HandBrake encoders to provide multi-core video rendering to other formats. There too, the command-line executables would need to be available for the functions to be offered to the user.

Then again, maybe those kinds of features (checking for a dependency and making functions available or unavailable accordingly) are already in the extension, and they can be used or copied for use with the DDI library.

So, when you'd like some testing done, I'll run a huge stack of mixes through both the extension and the Nugen tool and compare results. Please keep in mind that we don't have to be identical. Your GIF promises that all is already just fine. We have a +/- 3dB playground in the spec. Actually I'm starting to think this is a better spec for spoken word programs. It works rather well so far.

maximehyh commented 5 years ago

Hi @AironAudio, hi @nofishonfriday,

First, thanks a lot for opening this issue as I was starting to feel lonely looking for a way to implement Dolby's algorithm.

We currently do it using Dolby Media Meter, but I am looking for a way of doing the exact same measurement using Python and potentially packaging the whole thing in a Docker container.

@nofishonfriday your results shown in the GIF look excellent, and I was wondering if you would consider PMing me the backend code (I have some limited knowledge of C++) that makes the loudness measurement using the speech gating algorithm? I did read Dolby's documentation, but that could help me a lot to better understand the processing workflow (K filtering, means...) and when/where I need to add the Dolby processing.

I would also be interested in contributing to a broader accessibility of Dolby's algorithm within loudness measurement.

Thanks!

nofishonfriday commented 5 years ago

@maximehyh

Hi, for the loudness measurement we use the libebur128 library, which takes care of the K filtering and all the other loudness measurement stuff. Note that because libebur128 uses the BS.1770-4 standard, which applies gating, it must be modified (remove the gating) to work with measuring Dialogue Loudness (see @AironAudio's first comment here).

My current Dialogue Loudness scanning implementation in the SWS extension can be found here (NF_DialogueLoudness.h/.cpp); specifically, see AnalyzeData(), where the actual speech gating using Dialogue Intelligence and the subsequent loudness measurement using libebur128 are done (I tried to comment as well as I could :)). Note it's currently a work in progress (but it seems to work quite OK already; we did some more tests meanwhile).

Hope this helps.

maximehyh commented 5 years ago

Excellent! Thanks a lot for the links, I'll start diving into it right away.

One quick question: I would like to try something like AnalyzeData("PATH_TO_MY_FILE") and I am wondering what exactly is a MediaTrack object? I guess it is related to sws?

nofishonfriday commented 5 years ago

> what exactly is a MediaTrack object?

sws is an extension for the REAPER digital audio workstation (as you may know), which has a quite extensive API. A MediaTrack is a pointer to a media track in a REAPER project which can be retrieved using the API function GetTrack().

maximehyh commented 5 years ago

Alright thanks!

CraigWatt commented 1 year ago

> @maximehyh
>
> Hi, for the loudness measurement we use the libebur128 library, which takes care of the K filtering and all the other loudness measurement stuff. Note that because libebur128 uses the BS.1770-4 standard, which applies gating, it must be modified (remove the gating) to work with measuring Dialogue Loudness (see @AironAudio's first comment here).
>
> My current Dialogue Loudness scanning implementation in the SWS extension can be found here (NF_DialogueLoudness.h/.cpp); specifically, see AnalyzeData(), where the actual speech gating using Dialogue Intelligence and the subsequent loudness measurement using libebur128 are done (I tried to comment as well as I could :)). Note it's currently a work in progress (but it seems to work quite OK already; we did some more tests meanwhile).
>
> Hope this helps.

Hi @nofishonfriday, I wanted to ask if this was ever finalised? Is it now part of SWS officially?

// Dialogue Intelligence library + functions pointers
static HINSTANCE g_libdi = NULL;

Does this solution still require the Dialogue Intelligence source? I suppose there really is zero open-source alternative to this, is there?

From my understanding, any tool that can detect Dialog LRA uses a Dialogue Anchor alongside ebur128, but in order to ever detect speech audio, Dolby Dialogue Intelligence must be used? There is no viable alternative?

nofishonfriday commented 1 year ago

@CraigWatt Unfortunately it was never finalised (never part of official SWS) and is abandoned from my side. :/ Part of it is because of Dolby's proprietary license and its incompatibility with SWS' MIT license, as you noted. Another part is that the form of implementation I had was not very good (I copied the BR_Loudness.h/.cpp files and started from there, just as a proof of concept, while ideally it should have been integrated into the existing files to avoid a lot of code duplication).

> Does this solution require the Dialogue Intelligence source still?

The current code is based on loading pre-built shared libs.
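For readers unfamiliar with the pattern, here is a minimal sketch of loading an optional shared library at runtime and degrading gracefully when it is missing. It is not the WIP code itself: the library path, the exported symbol name and the function signature are hypothetical placeholders, and nothing here reflects Dolby's actual exports.

```cpp
#include <cassert>
#include <string>
#ifdef _WIN32
#  include <windows.h>
#else
#  include <dlfcn.h>
#endif

// Hypothetical signature of the single function needed from the optional library.
typedef int (*DetectSpeechFn)(const float* samples, int count);

struct OptionalSpeechLib {
    void*          handle = nullptr;  // platform library handle, null if not loaded
    DetectSpeechFn detect = nullptr;  // resolved entry point, null if not found
    bool available() const { return handle != nullptr && detect != nullptr; }
};

// Tries to load `path` and resolve `symbol`. On any failure the returned
// struct reports available() == false and no library stays loaded.
OptionalSpeechLib loadSpeechLib(const std::string& path, const std::string& symbol) {
    assert(!path.empty() && !symbol.empty());
    OptionalSpeechLib lib;
#ifdef _WIN32
    HMODULE h = LoadLibraryA(path.c_str());
    void* sym = h ? reinterpret_cast<void*>(GetProcAddress(h, symbol.c_str())) : nullptr;
    if (h && !sym) FreeLibrary(h);  // library found but symbol missing: unload
#else
    void* h = dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL);
    void* sym = h ? dlsym(h, symbol.c_str()) : nullptr;
    if (h && !sym) dlclose(h);      // library found but symbol missing: unload
#endif
    if (sym) {
        lib.handle = h;
        lib.detect = reinterpret_cast<DetectSpeechFn>(sym);
    }
    return lib;
}
```

The host would call loadSpeechLib() once at startup and only register the dialogue-related actions when available() returns true, so users without the optional library see no difference.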

Sorry for not having a more positive answer. That said, my WIP code is still there (as you found), so feel free to use/mod it in any way you like, if you're motivated. :)

michellewen1516 commented 1 year ago

The first post describes the 1770.1 version of the loudness measurement standard as "the 1770 standard without a relative gate", but the 1770-2/3/4 standards say:

"two thresholds are used: 1. –70 LKFS; -10 dB relative gate". May I ask why the –70 LKFS threshold wasn't mentioned as a difference as well?