Closed ThreeDeeJay closed 3 months ago
Includes makemhr, sofa-info, alrecord, altonegen, openal-info and allafplay Windows executables in utils.zip
- Uses my libmysofa fork that fixed the
Invalid format
error by removing a filesize/density limit, so that makemhr is able to convert larger SOFA files (like EAC HRTFs with 5° elevations/azimuths) v1.3.2 (latest upstream) ... My fork (just a two-liner so let me know if you'd rather fork and remove limit yourself or if I should revert to the limited official branch) ...
It's understandable to want to use a version that doesn't arbitrarily limit the supported size, but that particular change seems questionable. It's removing checks for the sizes being negative, and with larger values, may need overflow checks in various other places. Improperly constructed SOFA files may be erroneously detected as valid, and cause misbehavior without overflow checks.
Which by the way, is playback supposed to work or is it still a WIP? When I try the LAF files I posted here, I just get a channel breakdown and it immediately exits the process without playing any audio 🤔
It should work, yes. Odd that it seems to fail reading the channel header data (it's all 0
s, when the E
values should be -0
or -nan
), despite correctly reading the header markers and the quality, mode, and track count values, which in turn causes it to fail to read the sample rate and sample count, aborting playback. I don't see why, the file is opened in binary mode and those files play for me; if there was a problem reading binary/non-text data through std::ifstream
on Windows, I'd have expected a similar problem with OpenAL Soft itself loading mhr files, which it obviously does without issue.
Unrelated, but it also seems like my trick of calling std::terminate
in a catch(...)
block doesn't seem to actually report anything on Windows? No dialog box about an abnormal termination, nor a message about why it terminated?
It's understandable to want to use a version that doesn't arbitrarily limit the supported size, but that particular change seems questionable. It's removing checks for the sizes being negative, and with larger values, may need overflow checks in various other places. Improperly constructed SOFA files may be erroneously detected as valid, and cause misbehavior without overflow checks.
Makes sense. I reverted it to the official repo though I updated it to v1.3.2 which has a slightly more generous limit. So what if we just keep the negative size check, but expand the limit to around 4GB? For reference a 1 degree azimuth/elevation EAC HRTF is >2GB so 96Khz might be double that, and I've seen other SOFAs that are GBs in size.
It should work, yes. Odd that it seems to fail reading the channel header data (it's all
0
s, when theE
values should be-0
or-nan
), despite correctly reading the header markers and the quality, mode, and track count values, which in turn causes it to fail to read the sample rate and sample count, aborting playback. I don't see why, the file is opened in binary mode and those files play for me; if there was a problem reading binary/non-text data throughstd::ifstream
on Windows, I'd have expected a similar problem with OpenAL Soft itself loading mhr files, which it obviously does without issue.
Looks like https://github.com/kcat/openal-soft/commit/7a7350af2b2cb9ad6103a43e7a932a4c45c31cb7 did the trick because this time I did get audio playback 👍 Dolby's Universe Fury demo had never sounded this good to me with my HRTF, except, that there are really loud static noises at seemingly random times, which I just noticed also happen in Cavern when playing the LAF, but when playing the original file directly, I instead get a suppressed pop/clip. So there might be a conversion bug or the Cavern decoder is yet to address some data getting corrupted 🤔
EDIT: Welp, apparently the static noise is a known issue with DD+ JOC https://github.com/VoidXH/Cavern/issues/17 But on the bright side, playing the LAF converted from ADM sounds pretty much flawless
Unrelated, but it also seems like my trick of calling
std::terminate
in acatch(...)
block doesn't seem to actually report anything on Windows? No dialog box about an abnormal termination, nor a message about why it terminated?
Nope, nada.
So what if we just keep the negative size check, but expand the limit to around 4GB? For reference a 1 degree azimuth/elevation EAC HRTF is >2GB so 96Khz might be double that, and I've seen other SOFAs that are GBs in size.
Looks like it may need an overflow check too. Something like
if(elements <= 0 || size <= 0 || INT_MAX/size > elements)
... error ...
I don't know how this check in specific relates to the file or dataset size, but at least the result of elements * size
can't go far above 2.1 billion, or else it'll overflow and become undefined behavior. Expanding that would need more work to avoid signed ints, where those kinds of changes tend to need farther reaching changes since those variables may interact with other variables that may not be happy when they're different types.
Looks like 7a7350a did the trick because this time I did get audio playback 👍
That's quite odd. It shouldn't have affected reading, basically just replacing terminating asserts with exceptions that skip just the current file.
EDIT: Welp, apparently the static noise is a known issue with DD+ JOC VoidXH/Cavern#17 So I wonder how this person managed to get a clean sound
This demo was converted from DD+ E-AC-3 JOC to LAF with Cavern and then rendered using OpenAL Soft's Default HRTF
I may not know what I'm talking about, or misunderstanding what I thought I read, but IIRC, support for pure DD+ JOC was a problem, while E-AC-3 JOC worked better. Or maybe that was TrueHD with Atmos that didn't work, while DD+/E-AC-3 with Atmos did.
On the bright side, playing the LAF converted from ADM (invite) works beautifully
That's the kind of thing I expected to see from LAF files converted from Atmos. Rather than the other LAF file that had bursts of static, which seems to just be a 7.1.4 mix (that doesn't even mark the LFE channel properly), that one there uses a fixed channel bed (with a properly marked LFE channel) and several dozen dynamic objects. It does still seem to use left-handed coordinates though (+Z is front), which isn't correct according to the LAF documentation.
It does make me think the LAF player should try to be more efficient with source playback, stopping sources that are disabled in chunk headers, and starting them when enabled with more audio. Doing that would make it more efficient when multiple channels are disabled/silent most of the time. It would require the AL_SOFT_source_start_delay
extension to make sure sources are restarted at the proper time synchronized with other sources (currently no extensions are required, outside of float32 and int32 for files that use float32 or int24 samples respectively), and if there's any resampling going on, synchronization will be extra tricky to account for sub-sample offsets.
Looks like it may need an overflow check too.
Done https://github.com/ThreeDeeJay/openal-soft/commit/f3d74878d2ac3e4e45b853569bec406dcb527bb3, though the =<5° EAC SOFA still fails to convert with the limit. Also, the CI now clones a specific commit, for stability reasons, and I added commit date, count and hash to the workflow artifact to avoid duplicate confusion when debugging
I may not know what I'm talking about, or misunderstanding what I thought I read, but IIRC, support for pure DD+ JOC was a problem, while E-AC-3 JOC worked better. Or maybe that was TrueHD with Atmos that didn't work, while DD+/E-AC-3 with Atmos did.
Yeah, what I do know is that TrueHD can't be implemented at the moment. So I've been looking into the process to at least mix to 9.1.6 so maybe I can convert it to ADM then into LAF, since source ADM tracks are extremely scarce. And apparently coarse JOC is what causes decoding artifacts:
There are two types of JOC coding, coarse and sparse. Coarse, which is widely used by movies, works without issues, sparse (found in some versions of amaze) doesn't, as the documentation is incorrect about it https://github.com/VoidXH/Cavern/issues/17#issuecomment-1268149518
Wait, sorry. That check should be
if(elements <= 0 || size <= 0 || elements > INT_MAX/size)
return MYSOFA_INVALID_FORMAT;
Thanks, that did the trick! 👌 By the way why does it sometimes use less than the total IRs? 🤔
By the way why does it sometimes use less than the total IRs? 🤔
It ignores IRs that don't align with the detected uniform layout. The IR layout for the mhr file needs to have fixed stepping along the elevation (starting at +90 degrees and stepping down to -90 degrees), and fixed stepping along the azimuth for each valid elevation (starting at 0 degrees, directly in front, and going around a full 360 degrees).
Nice 👍 By the way, you might wanna delete https://github.com/kcat/openal-soft/releases/tag/makemhr since that release is now obsolete.
I may not know what I'm talking about, or misunderstanding what I thought I read, but IIRC, support for pure DD+ JOC was a problem, while E-AC-3 JOC worked better. Or maybe that was TrueHD with Atmos that didn't work, while DD+/E-AC-3 with Atmos did.
Yeah, what I do know is that TrueHD can't be implemented at the moment. So I've been looking into the process to at least mix to 9.1.6 so maybe I can convert it to ADM then into LAF, since source ADM tracks are extremely scarce. And apparently coarse JOC is what causes decoding artifacts:
There are two types of JOC coding, coarse and sparse. Coarse, which is widely used by movies, works without issues, sparse (found in some versions of amaze) doesn't, as the documentation is incorrect about it VoidXH/Cavern#17 (comment)
I'm curious if ffmpeg's code could help at all. It seems to handle decoding TrueHD tracks just fine, "truehd (Dolby TrueHD + Dolby Atmos)" even. "eac3 (Dolby Digital Plus + Dolby Atmos)" also plays fine, without any glitching. Interestingly, it reports and provides such tracks as 7.1, which seem to have everything in the mix. I can't do a deeper look to see if it's 7.1(.4) premixed or something that it's giving with a simple downmix, or if it has dynamic objects being moved and mixed in real-time by its decoder.
But on the bright side, playing the LAF converted from ADM sounds pretty much flawless
Also interestingly, this LAF doesn't seem too well optimized. Each chunk always has all tracks enabled, when I'm sure many of them are silent for long stretches of time. I also wonder if there's room to optimize track utilization (if two or more tracks don't play sound at the same time, they can be on the same track; you'd only need as many tracks as are played simultaneously). Those would help cut down on the LAF file size pretty well, I think.
I'm curious if ffmpeg's code could help at all. It seems to handle decoding TrueHD tracks just fine, "truehd (Dolby TrueHD + Dolby Atmos)" even. "eac3 (Dolby Digital Plus + Dolby Atmos)" also plays fine, without any glitching. Interestingly, it reports and provides such tracks as 7.1, which seem to have everything in the mix. I can't do a deeper look to see if it's 7.1(.4) premixed or something that it's giving with a simple downmix, or if it has dynamic objects being moved and mixed in real-time by its decoder.
Last I checked, ffmpeg could just mux/identify the TrueHD Atmos track and decode the 7.1 "core" surround, but not heights or objects (which I think are encoded into the 7.1 track with metadata), though some figured out how to render up to 9.1.6 using a script or Music Media Helper, both of which rely on Dolby Reference Player/Encoder
According to a user in our server, DDPA was implementable in FOSS since there is an open spec (ETSI EAC object carriage). Decoding THDA is not implementable in FOSS since there is no open spec, unless you get the proprietary software DRP to do the job, which is what the script does. https://cavern.sbence.hu/cavern/doc.php?p=Cavernize#cd click on "Codec disclaimer"
Also interestingly, this LAF doesn't seem too well optimized. Each chunk always has all tracks enabled, when I'm sure many of them are silent for long stretches of time. I also wonder if there's room to optimize track utilization (if two or more tracks don't play sound at the same time, they can be on the same track; you'd only need as many tracks as are played simultaneously). Those would help cut down on the LAF file size pretty well, I think.
I forwarded it to the dev since tbh I'm not really familiar with the Cavern codebase (I'm just the CI guy there too lol), so feel free to join the conversation here, or via a new issue/PR 👀👌
Also interestingly, this LAF doesn't seem too well optimized. Each chunk always has all tracks enabled, when I'm sure many of them are silent for long stretches of time. I also wonder if there's room to optimize track utilization (if two or more tracks don't play sound at the same time, they can be on the same track; you'd only need as many tracks as are played simultaneously). Those would help cut down on the LAF file size pretty well, I think.
Thanks for the implementation! When a LAF is exported by converting an E-AC-3 file, even if noise is present, the track will be active. One track is one object from the source. You could optimize it by clustering or gating the PCM signals, but that wouldn't be lossless.
Invalid format
error by removing a filesize/density limit, so that makemhr is able to convert larger SOFA files (like EAC HRTFs with 5° elevations/azimuths) v1.3.2 (latest upstream) My fork (just a two-liner so let me know if you'd rather fork and remove limit yourself or if I should revert to the limited official branch)I've had this branch for a while and was reminded of it here and then I recently saw https://github.com/kcat/openal-soft/commit/3b10c6f9bc83e535f5de119f668ce6fd17e4e67d and wanted to try it out so I figured it'd be a good time to share this new workflow. Which by the way, is playback supposed to work or is it still a WIP? When I try the LAF files I posted here, I just get a channel breakdown and it immediately exits the process without playing any audio 🤔