madshi commented 9 years ago

Testing your decoder, I've started with simple channel order tests, using conventional DTS files (no HD), testing all possible speaker/channel configurations: All tests passed with flying colors.

Then I've run XLL decoding tests, using my XLL sample collection with various format combinations. I'd say about 60-70% of them decoded identical to the ArcSoft DTS Decoder. However, several of them failed to decode properly. I've identified 5 different problems:

1) dcadec.exe reports "Failed to decode block code + invalid bitstream format". 2) dcadec.exe reports "PCM output parameters changed". libdcadec.dll's dcadec_context_filter() function suddenly changes bits_per_sample from 16 to 14 with one of the samples, when this happens. And decoded data is not correct, anymore. 3) dcadec.exe reports "PCM output overflow". 4) Decoding runs through without any errors, but the final result doesn't match the ArcSoft decoder. 5) Everything is fine, but dcadec.exe stops at "Progress: -95%". Just a cosmetical issue.

I'm uploading samples right now. Contained are the original DTS files and ArcSoft decoded WAV files, so you have a reference to compare to. Upload will take about 1,5 hours, so wait a little before you download:

http://madshi.net/dcaDecBugs.zip (413MB)

BTW, does the XLL extension have some sort of CRC check so that dcadec can check if the final decoded result is really bit perfect? I know that TrueHD has such a CRC. Not sure about XLL, though. I could imagine that XLL itself can be checked, but maybe not the final core+XLL combined data? Having some way to check that for lossless-ness would be very helpful, of course, so that users can know if lossless decoding succeeded or not. This might not be crucial for realtime playback, but for reencoding purposes (e.g. conversion to FLAC), having some sort of indication whether decoding was perfectly lossless or not would be very helpful.

Great job, in any case! The dcadec interface is easy to use, and decoding generally looks promising.

Hmmmm... One question: The "samples" buffer always seems to be 32bit? For lossless decoding I suppose I can simply extract the lower 16/24 bit and ignore the other 8/16 bit, correct? That's what I'm currently doing and it seems to match what dcadec.exe produces. However, what should I do for lossy decoding? Is the full 32bit data always filled? In that case I should probably always treat the dcadec lossy decoding results as 32bit PCM, so that eac3to later dithers the data down to the final output bitdepth, correct?

Nevcairiel commented 9 years ago

Lossy decoding is always 24-bit int, the decode function will tell you the bits per sample no matter what decode operation is used.

madshi commented 9 years ago

I think I've seen patterns like FF FF FF FF 00 00 00 00 in the decoded PCM data. But I could be wrong. A pattern like that would suggest 32bit int data, wouldn't it?

Hmmmm... I'm wondering what the following flag does?

/* Force bit exact DTS core decoding /

define DCADEC_FLAG_CORE_BIT_EXACT 0x02

What is the advantage/disadvantage of this flag? Is it better quality with it on/off? Is it slower with it on/off?

madshi commented 9 years ago

(Samples upload complete.)

Nevcairiel commented 9 years ago

Negative values will use the full 32-bit, as due to the way negative numbers are encoded in Two's Complement, but you can just use the 24-bit in the LSBs and get the proper value still.

The absolute value of the audio data will never exceed 24-bit, I checked! ;)

madshi commented 9 years ago

Good to know, thanks Nevcairiel.

foo86 commented 9 years ago

Then I've run XLL decoding tests, using my XLL sample collection with various format combinations. I'd say about 60-70% of them decoded identical to the ArcSoft DTS Decoder. However, several of them failed to decode properly.

I'm uploading samples right now. Contained are the original DTS files and ArcSoft decoded WAV files, so you have a reference to compare to.

Thanks for the samples, I will investigate what's going on. Some of the problems you describe are clear signs of some corruption/misparsing.

BTW, does the XLL extension have some sort of CRC check so that dcadec can check if the final decoded result is really bit perfect? I know that TrueHD has such a CRC. Not sure about XLL, though. I could imagine that XLL itself can be checked, but maybe not the final core+XLL combined data?

There is no way to check losslessness/correctness of the output, as far as I can tell. A major flaw in the DTS format. There is a provision for CRC checksums, but for encoded data only, and even those are mostly useless. The standard requires DTS core checksums to be ignored and all XLL samples I've seen didn't contain checksums for XLL band data (they are optional by the standard).

So far the only way to more or less verify sanity of the PCM output is to check that combined DTS core and XLL residual samples don't overflow their corresponding bit width.

The "samples" buffer always seems to be 32bit? For lossless decoding I suppose I can simply extract the lower 16/24 bit and ignore the other 8/16 bit, correct?

Yes, samples are signed 32-bit integers with 16 or 24 less significant bits containing actual data (as indicated by bits_per_sample parameter). MSBs contain sign bits as usual in two's complement. Lossy decoder output is always 24 bits (unless a special DCADEC_FLAG_CORE_SOURCE_PCM_RES flag is set), no need to dither or reduce to source PCM resolution.

Hmmmm... I'm wondering what the following flag does? /* Force bit exact DTS core decoding /

define DCADEC_FLAG_CORE_BIT_EXACT 0x02

What is the advantage/disadvantage of this flag? Is it better quality with it on/off? Is it slower with it on/off?

It forces use of fixed point (bit exact) DTS core interpolation algorithm. Normally it is only used when core samples are to be combined with lossless residual. There's no advantage in using this for decoding lossy DTS core alone unless you are computing checksum of the output for some kind of regression testing. Floating point (bit inexact) interpolation has better precision and this is what should be used for lossy decoding. As for the speed, I've not benchmarked these algorithms. Floating point may be slightly slower because it uses brute-force O(N^2) IDCT implementation while fixed point uses more sophisticated (also called "obfuscated") IDCT implementation from the spec (should not be a big deal).

madshi commented 9 years ago

Thanks.

FWIW, I think those DTS tracks are clean, not damaged. But then, I wouldn't bet my life on it.

foo86 commented 9 years ago

FWIW, I think those DTS tracks are clean, not damaged. But then, I wouldn't bet my life on it.

I didn't mean those tracks are corrupted, I meant it sounded to me like dcadec misparsed them or produced corrupt output.

1) dcadec.exe reports "Failed to decode block code + invalid bitstream format".

Those two tracks are in .m2ts container. dcadec can't parse those, it can only read raw DTS. After extraction DTS tracks decode fine.

2) dcadec.exe reports "PCM output parameters changed". libdcadec.dll's dcadec_context_filter() function suddenly changes bits_per_sample from 16 to 14 with one of the samples, when this happens. And decoded data is not correct, anymore.

There were two kind of problems here. First, there were 2 samples where actual PCM bit resolution differed from storage bit resolution in some frames as indicated by XLL channel set header. libdcadec_context_filter() didn't handle that, hence it returned varying bit depth. 9cbecd3 makes it transparent to API user by padding samples to storage bit depth.

Second "problem" is not really a problem, but a consequence of the fact that remaining sample tracks in this category are badly cut at the end. (Why this happens with cut tracks was already discussed, see issue #4).

3) dcadec.exe reports "PCM output overflow".

This was a bug triggered by a badly cut file. Should be fixed by 3f1b97c.

4) Decoding runs through without any errors, but the final result doesn't match the ArcSoft decoder.

I believe these are reference decoder issues. I've seen similar differences between libdcadec and reference decoder output with synthetic sample files encoded by reference encoder. libdcadec decoded such tracks losslessly, while reference decoder did not.

I don't know whether this is a bug in reference decoder, or whether it follows some kind of stream metadata and intentionally applies unknown post-processing that libdcadec doesn't.

5) Everything is fine, but dcadec.exe stops at "Progress: -95%". Just a cosmetical issue.

Couldn't reproduce this. I'm not even sure how dcadec_stream_progress() can return negative value except -1 on error.

madshi commented 9 years ago

The bugfixes seem to work fine, thank you.

Those two tracks are in .m2ts container. dcadec can't parse those, it can only read raw DTS. After extraction DTS tracks decode fine.

Ah, how stupid of me, of course! Originally I did all the tests with eac3to, which parses the m2ts file and feeds dcadec just with the DTS data. Then I compared all decoded tracks and run all which didn't match the ArcSoft decode through dcadec.exe. That's how I ended up running the m2ts files through dcadec.exe, which of course doesn't make sense.

However, there's still an issue with both of these samples: It seems that the channel order output by dcadec doesn't match the ArcSoft decoder channel order for these two files. The back surround and surround channels are swapped. I'm not sure which is correct, though. In any case, this is another separate problem: Mismatching channel order compared to the ArcSoft decoder. Must be a bug in either dcadec, or ArcSoft or eac3to. Can you check?

Same problem with the following 3 tracks in the "PCM output parameters changed" folder: Again surround channels are swapped with the back surround channels, compared to ArcSoft:

5.1 ES 24 48 1536 2012 XLL 7.1.dts
Rambo 4 Blu-Ray - 5.1 24 48 1536 2012 XLL 7.1.dts
The Orphanage Blu-Ray 5.1 24 48 1536 2012 XLL 7.1.dts

The X-Men 3 sample doesn't complain about PCM output overflow, anymore, but the decoding at the start of the file is different compared to ArcSoft. The ArcSoft decoder has all zeroes, while dcadec produces different data. My subjective impression is that the ArcSoft output with all zeroes is more likely to be lossless, but of course it's only a stomach feeling. What do you think? FWIW, it's only the start of the file. The rest is identical between ArcSoft and dcadec.

I can reproduce the -95% progress result every time. Might be Windows specific? Here's the command line log:

D:\Desktop\dcaDecBugs\Progress -95%>dcadec.exe "Hatchet II - 5.1 16 48 768 1024 XLL 7.1 (strange setup).dts" test.wav DTS-HD Master Audio: 8 ch, 48 kHz, 16 bit (DTS Core Audio: 5.1 ch, 48 kHz, 16 bit, ES, 768 kbps) Decoding... Progress: -95% Completed.

I don't really care much about this problem, since it's really only cosmetical. Just wanted to let you know that it's fully reproducable here.

kasper93 commented 9 years ago

I can reproduce the -95% progress result every time. Might be Windows specific?

Works fine for me on Windows. Might be compiler specific? Seems something specific to your setup anyway.

madshi commented 9 years ago

Could be. Not important for me, anyway.

About the 7.1 channel swaps: I think it's caused by the HD speaker configuration:

1) C L R Ls Rs LFE Lw Rw (0x40f) -> ArcSoft and dcadec have the same output 2) C L R LFE Lsr Rsr Lss Rss (0x84b) -> ArcSoft and dcadec have backs and surrounds swapped 3) C L R Ls Rs LFE Lsr Rsr (0x4f) -> I'm not 100% sure because ArcSoft has problems with these, anyway, but I think the channels are also swapped

As written in my previous post, I'm not 100% sure if ArcSoft has the correct order or dcadec. If you have access to the DTS MA Encoder, maybe you can use that to find out definitely which is the correct order? I don't have the DTS MA Encoder, so I can't test that...

Nevcairiel commented 9 years ago

IME ArcSoft will use the wrong order, as it always tries to map the channels to a "standard" 7.1 mask, while dcadec will map it to a mask that best matches the actual DTS-HD channel ids.

The channel mask dcadec gives you should be rather enlightening on why it orders them that way.

madshi commented 9 years ago

Many many users have used eac3to & ArcSoft to reencode their Blu-Rays to MKV+FLAC. If the dcadec channel order is correct and the ArcSoft channel order is incorrect, then that means that all those users will have to redo many of their 7.1 remuxes. This is not a small thing, so we should not guess here, but we need to be 100% sure. So I would feel much safer if somebody who has access to the DTS MA Encoder could make a channel test encoding with the 3 available 7.1 speaker configs.

One potential problem could be that the person who encodes the samples might already have to make some decisions, which channels to use for which speaker configs. Based on that the tests could lead to incorrect results if the Blu-Ray encoding houses have made different decisions. So maybe using the DTS MA Encoder is not the conclusive test. Maybe we will have to test actual Blu-Ray 7.1 tracks and try to figure out based on the contents of the channels which is supposed to be side surround and which back channels?

In any case, we need to be 100% sure on this.

Nevcairiel commented 9 years ago

A different order doesn't necessarily mean that the actual channel <-> speaker mapping is different (in a common setup, anyway). The channel layout defines this.

In WAV terms, a SL/SR BL/BR combination will have a different channel order than a FLC/FRC BL/BR layout, but during playback, they will likely map to the same speakers, since noone is going to have different speakers for Front-surrounds and side-surrounds.

AFAIK, ArcSoft doesn't make this distinction at all, and outputs 7.1 in the same layout always. ArcSoft is really anything but reference. It includes a reference decoder, but the ArcSoft wrapper around it does so many bad things to the channel layouts...

madshi commented 9 years ago

Here's what I found out so far:

DTS speaker config: C L R LFE Lsr Rsr Lss Rss (0x84b) DTS speaker config: C L R Ls Rs LFE Lsr Rsr (0x4f) eac3to/ArcSoft WAV channelmask: FL FR FC LFE BL BR SL SR (0x63f) dcadec.exe WAV channelmask: FL FR FC LFE BL BR SL SR (0x63f)

Channel order does not match, although the WAV channel mask matches.

DTS speaker config: C L R Ls Rs LFE Lw Rw (0x40f) eac3to/ArcSoft WAV channelmask: FL FR FC LFE BL BR SL SR (0x63f) dcadec.exe WAV channelmask: FL FR FC LFE FLC FRC SL SR (0x6cf)

Channel order matches, although the WAV channel mask does not match.

That means, regardless of DTS speaker config and WAV channel mask, eac3to/ArcSoft and dcadec always disagree about the 7.1 channel order. So if dcadec is really right, all remuxers will have to redo their 7.1 FLAC files, if they were created with the ArcSoft decoder... :(

One other thing:

I don't think it's a good idea for dcadec to use the WAV channelmask "FL FR FC LFE FLC FRC SL SR" (0x6cf). Why? Because that setup doesn't match anybody's 7.1 speaker setup. And because in my experience the encoding houses for DTS 7.1 tracks randomly select one of the 3 speaker configs mentioned above, but the channels are really always the same as in TrueHD encodings. So I believe we should simply "ignore" the 7.1 DTS speaker config because that's the best way to handle the content that's out there. And the WAV files should always be 0x63f because that's what matches the 7.1 speaker setup of 99% of all users. Just my 2 cents, of course.

Nevcairiel commented 9 years ago

The decoder should match the dca spec as close as possible. If you want to remap that to what people expect in your software, then you should really do it there.

madshi commented 9 years ago

We're not talking about the dca decoder here, but about how to convert the DTS speaker configuration to the WAV channelmask, which is not defined by the dca spec. This conversion is open for interpretation to a certain extend. We have the following three DTS 7.1 speaker angle combinations:

1) 0°, 30°, 60°, 110° 2) 0°, 30°, 90°, 150° 3) 0°, 30°, 110°, 150°

This is as much as the dca spec says. How this converts to WAV channelmask, we have to decide for ourselves. dcadec.exe currently uses SPEAKER_FRONT_LEFT_OF_CENTER/RIGHT_OF_CENTER, which makes no sense to me. I would interpret SPEAKER_FRONT_LEFT_OF_CENTER/RIGHT_OF_CENTER to be 15°. IMHO all of those 3 DTS speaker configs should map to WAV 0x63f.

If dcadec outputs WAV files with a channelmask of 0x6cf, and if that WAV data is sent to an audio renderer (or receiver) with this channelmask, only God knows what kind of violence (read: processing) will be done to our precious audio data.

Anyway, it's foo86's decision, of course, and in the end it's not important for me. eac3to will definitely always produce 0x63f, because that's the only way we can be sure that the audio track will be played as expected.

Nevcairiel commented 9 years ago

The current code assigns one fixed WAV speaker to every DCA speaker, which wouldn't allow what you suggest. If it would be changed, it would need to act based on context - which other speakers are present.

Example: L R C LFE Ls Rs Lw Rw Ls/Rs should be BL/BR, and Lw/Rw should be SL/SR

L R C LFE Ls Rs Lsr Rsr Here, Ls Rs should be SL/SR, and Lsr/Rsr should be BL/BR

The fixed 1:1 assignment the code does now, does not allow for this. To avoid a clash, Ls/Rs currently is always SL/SR, Lsr/Rsr to BL/BR and Lw/Rw maps to FLC/FRC, which is a similar speaker setup, just rotated/angled more to the front, although the argument isn't without merit that FLC/FRC are in fact meant to be the 15° speakers (ie. in a 5 front-speaker setup).

Of course I cannot say if foo86 would be open to building a context-aware channel mapper, instead of the fixed 1:1 assignments. The fixed assignments make it easier, though.

Nevcairiel commented 9 years ago

I encoded the three 7.1 configs that the DTS-HD Encoder Suite seems to offer using Channel ID audio, which makes it easier to spot problems.

http://files.1f0.de/samples/dts/Lss-Lsr.dtshd http://files.1f0.de/samples/dts/Ls-Lsr.dtshd http://files.1f0.de/samples/dts/Lw-Ls.dtshd

The names should hopefully explain what they are. ;) The first two decode exactly the same with dcadec, both as SL/SR BL/BR with appropriate channel order.

The third uses the FLC/FRC channels, but when considering the order and spatial position of those, it still seems correct (any arguments on the correctness of that mapping aside)

ArcSoft seems to do crazy things in my setup on the both the second and the third, it mixes the channels oddly, presumably to move them spatially to the "default" angle. But I don't know if there is a version of the DLL that didn't do this, I'm not all that up into the specifics there.

madshi commented 9 years ago

Thanks!! And dcadec decodes all 3 files losslessly (bit perfect)? Ok, I guess I will have to accept that all DTS 7.1 tracks decoded by eac3to/ArcSoft need to be redone... :(

One big problem with assigning 15° to Lw/Rw is that 15° is on the wrong side of the L/R speakers. Lw/Rw is supposed to be farther away from C than L/R. But when using WAV FLC/FRC for Lw/Rw, they are actually nearer to C than L/R, which is the wrong order, and which will destroy sound effects, if the playback chain honors the WAV channel assignments. E.g. imagine a car driving from Lw -> L -> C -> R -> Rw. When using 0x63f, the car will move at a constant speed from the left side to the ride side, using all 5 front/surround speakers. When using 0x6cf, the car sound will start between L and C, then it will slowly move left towards L, and then suddenly it will move at double speed right from L to C. Then it will quickly move right from C to R, and then slowly move back left from R to C.

Nevcairiel commented 9 years ago

Yes, the audio is bitexact to the original WAVs, except a bit of extra silence at the beginning that the encoder seems to add, all decoders reproduce it.

foo86 commented 9 years ago

@madshi So you propose the following mapping of DCA speakers to WAV for the 3 most commonly used 7.1 layouts (here I reordered WAV channels to match corresponding DCA speakers):

(C + L + R + Ls + Rs + LFE1 + Lw + Rw) => (FC + FL + FR + BL + BR + LFE + SL + SR). Interesting that this is nearly what ArcSoft was doing with a sample from issue #1, minus the post-processing of wide fronts.
(C + L + R + LFE1 + Lss + Rss + Lsr + Rsr) => (FC + FL + FR + LFE + SL + SR + BL + BR).
(C + L + R + Ls + Rs + LFE1 + Lsr + Rsr) => (FC + FL + FR + SL + SR + LFE + BL + BR).

dcadec already remaps layouts 2 and 3 this way. Layout 1 is remapped to (FC + FL + FR + SL + SR + LFE + FLC + FRC). This mapping is what libavcodec's native DCA decoder uses. With the argumentation you provided, I think it reasonable to remap this layout to 0x63f as well. @Nevcairiel I suppose this change shouldn't confuse libavcodec much?

However, there's still an issue with both of these samples: It seems that the channel order output by dcadec doesn't match the ArcSoft decoder channel order for these two files. The back surround and surround channels are swapped. I'm not sure which is correct, though. In any case, this is another separate problem: Mismatching channel order compared to the ArcSoft decoder. Must be a bug in either dcadec, or ArcSoft or eac3to. Can you check?

There is indeed something strange about the decoded WAV files you provided, the surround channels seem swapped for all 7.1 samples. I'm quite positive that dcadec order is correct for layouts 2 and 3. dcadec order matches what I get with eac3to 3.28 and ArcSoft 1.1.0.0 for your samples (definitely matches for layout 2 and I think matches for layout 3 since ArcSoft output seems to be broken for this layout).

The X-Men 3 sample doesn't complain about PCM output overflow, anymore, but the decoding at the start of the file is different compared to ArcSoft. The ArcSoft decoder has all zeroes, while dcadec produces different data. My subjective impression is that the ArcSoft output with all zeroes is more likely to be lossless, but of course it's only a stomach feeling. What do you think? FWIW, it's only the start of the file. The rest is identical between ArcSoft and dcadec.

The first frame in a residual encoded HD MA track can't be decoded losslessly, so, technically, both variants are not lossless. 0a8cd1c brings libdcadec output in line with reference decoder (there were a few other samples with this issue).

So far I didn't notice anything more abnormal about how dcadec decodes these samples, except that LFE channel for Orchestra Demo and Sfx Demo tracks doesn't match ArcSoft. I haven't figured out why yet.

Nevcairiel commented 9 years ago

As long as the channel layout matches the channel order, avcodec (and any other apps using libdcadec, really) should be fine.

madshi commented 9 years ago

Yes, I think those channel assignments you listed would be best.

You're saying the WAV files you get from eac3to 3.28 + ArcSoft 1.1.0.0 have a different channel order than those I've provided in the zip file? That confuses me very much - I don't know why that would be the case! I've been using 3.28 + ArcSoft 1.1.0.0 here, too, to create those WAV files...

I've some more non-XLL samples (XBR, X96k, XCh, XXCh, XSA etc), would you be interested in checking them, too? I planned to do that myself, but I don't have a lot of time atm, so I don't know when I'll get to that.

foo86 commented 9 years ago

As long as the channel layout matches the channel order, avcodec (and any other apps using libdcadec, really) should be fine.

Yes, I think those channel assignments you listed would be best.

OK, will provide a special mapping for wide 7.1 layout then.

You're saying the WAV files you get from eac3to 3.28 + ArcSoft 1.1.0.0 have a different channel order than those I've provided in the zip file? That confuses me very much - I don't know why that would be the case! I've been using 3.28 + ArcSoft 1.1.0.0 here, too, to create those WAV files...

Exactly. These files don't match what I'm getting locally in a WinXP VM with the mentioned software versions:

Orchestra Demo - 5.1 24 48 1536 2012 XLL 7.1 96a.wav
Sfx Demo - 5.1 24 48 1536 2012 XLL 7.1 96a.wav
5.1 ES 24 48 1536 2012 XLL 7.1a.wav
Hi-Def Demo Blu-Ray - 5.1 24 48 1536 2012 XLL 7.1 (strange setup) 96a.wav
Rambo 4 Blu-Ray - 5.1 24 48 1536 2012 XLL 7.1a.wav
The Orphanage Blu-Ray 5.1 24 48 1536 2012 XLL 7.1a.wav

For reference, sha1sum of dtsdecoderdll.dll I have is 81a344bfea293c0c4eceded92fe69fcc5a28c656.

I've some more non-XLL samples (XBR, X96k, XCh, XXCh, XSA etc), would you be interested in checking them, too? I planned to do that myself, but I don't have a lot of time atm, so I don't know when I'll get to that.

It'd be good if you could share them. I don't have many real world samples.

madshi commented 9 years ago

Strange, my 1.1.0.0 sha1 is different. And I've just checked with 1.1.0.8 and it results in a different channel order. Ouch. Your 1.1.0.0 is probably slightly newer than mine, I would guess...

Here's the rest of my DTS sample collection:

http://madshi.net/dtsSamples2.zip (632MB)

Upload will take about 2 hours. I've just seen there's one sample with "dialnorm" in it. I hope your decoder ignores dialnorm by default? I hate dialnorm...

madshi commented 9 years ago

(upload complete)

foo86 commented 9 years ago

(upload complete)

Thanks. Will look them through for obvious decoding issues.

I've just seen there's one sample with "dialnorm" in it. I hope your decoder ignores dialnorm by default? I hate dialnorm...

Dialnorm is not implemented.

merbanan commented 9 years ago

@Nevcairiel the added silence is that always a fixed amount ? And should it be removed like in mdct type codecs ?

Nevcairiel commented 9 years ago

I figure its just something the encoder does. Every single decoder I tried did output the silence, so the behavior is consistent across the board. I wouldn't worry about it.

merbanan commented 9 years ago

@Nevcairiel foo86 wrote that "The first frame in a residual encoded HD MA track can't be decoded losslessly" that might be the reason that there is some silence added to the stream in the beginning. If the silence added is of the same length as the lossy part it would probably be the reason. And I think it would be good to have an option to trim this part of the steam for easy regression testing.

foo86 commented 9 years ago

My experience shows that the encoder always adds a zero frame at the beginning to clear the core filter bank history. Then there is one frame where core filter bank is "ramped up" and doesn't yet produce meaningful data. (In lossless files, residual part of the second frame effectively cancels lossy part so that resulting audio is also zeros). Then original audio begins. At the end of the DTS stream there is padding for complete frame. (Frame size is 512 samples usually for lossy and residual encoded streams, but larger for pure lossless streams). Provisions in DTS spec for "partial core frames" and such are never actually used I think.

If you seek into the random part of lossy or residual encoded DTS stream you won't get meaningful results from the core filter bank since it doesn't have history from the previous frame. The padding at beginning of the stream ensures this doesn't happen for sequential decoding.

The 2 frames of audio at the beginning can be trimmed safely. (EDIT: actually only 1 frame. There are encoders, e.g. dcaenc, that don't add the first zero frame). But no decoder actually does this I think. For now it made regression testing actually easier for me as I could do a binary diff between dcadec output and reference decoder output.

filler56789 commented 9 years ago

IIRC, and FWIW, the 'official' DTS-HD StreamPlayer always removes the additional silent samples. But it depends (or was designed to depend) on the original DTS-HD headers, and therefore it's useless for the streams extracted from the Blu-Ray discs.

filler56789 commented 9 years ago

I have just downloaded the zipball, and it compiled fine under MinGW-w64 & GCC 4.9.2, thanks for that.

One minor issue: dcadec does not calculate the actual bitrate of the stream, it blindly trusts the target bitrate info from the frame headers '-'

BTW: any plans of adding support for DTS Express and/or pure-lossless DTS?

foo86 commented 9 years ago

IIRC, and FWIW, the 'official' DTS-HD StreamPlayer always removes the additional silent samples. But it depends (or was designed to depend) on the original DTS-HD headers, and therefore it's useless for the streams extracted from the Blu-Ray discs.

I suppose DTS-HD container has metadata indicating exactly how many frames were added.

any plans of adding support for DTS Express and/or pure-lossless DTS?

Pure lossless DTS should be already supported. LBR might be eventually supported, although I'm not very interested in implementing it.

Here's the rest of my DTS sample collection:

@madshi I've looked through your samples and they all seem to decode correctly, with exception of LBR encoded ones and a sample that contained weird unsupported XCH audio mode in one frame (XCh 6.1 16 48 1536 2013.dts).

madshi commented 9 years ago

Great, thanks!

If I may ask: What is your opinion about the current "reliability" of dcadec? I will definitely add it as an option to the next eac3to build. I'm wondering whether it's already reliable/stable enough to use as the default option for all DTS formats (except LBR) in your opinion? Usually I'm a bit more careful about making a brand new decoder the default option. Are you worried about the LFE mismatch with the Orchestra+SFX samples? Or are there indications that ArcSoft might have decoded them incorrectly? Maybe I should make dcadec default only for 7.1 channels for now? Would be great to hear your opinion about that.

Are there any important features still missing? From what I've seen/read, LBR support is missing, and support for sample rates higher than 96kHz, correct? Anything else? If I know exactly what's missing, I can make ArcSoft default for those features.

Nevcairiel commented 9 years ago

192k XLL seems to decode fine on my end, bit exact to ArcSoft at that. Can't say if that applies to all samples though, I only have one concert Blu-ray.

Having LBR support would make a neat and round package, but admittedly LBR isn't all that important of a feature in the real world.

madshi commented 9 years ago

IIRC LBR is used by some commentary tracks on Blu-Ray? I think it doesn't have a core, so ArcSoft is currently the only choice to decode it?

Nevcairiel commented 9 years ago

That is correct. However, its just commentary, so meh ;)

madshi commented 9 years ago

Yeah, furthermore I think it's only for integrated Blu-Ray menu controlled feature stuff. The "conventional" commentary audio tracks are usually still proper AC3 or standard DTS.

foo86 commented 9 years ago

If I may ask: What is your opinion about the current "reliability" of dcadec? I will definitely add it as an option to the next eac3to build. I'm wondering whether it's already reliable/stable enough to use as the default option for all DTS formats (except LBR) in your opinion? Usually I'm a bit more careful about making a brand new decoder the default option.

Without proprietary decoders installed, eac3to is limited to decoding DTS via libavcodec, right? If so, I'd say it's safe to decode via libdcadec by default as it should clearly provide an improvement over libavcodec's internal decoder. As for libdcadec vs ArcSoft choice, I'd say blacklist channel layouts that ArcSoft can't decode properly and decode these via libdcadec and add an option to make libdcadec default for all layouts.

Are you worried about the LFE mismatch with the Orchestra+SFX samples? Or are there indications that ArcSoft might have decoded them incorrectly?

It always worries me when libdcadec's output is different and I can't explain the difference :) This is the case right now with these samples. Can't say which one is correct yet.

Are there any important features still missing? From what I've seen/read, LBR support is missing, and support for sample rates higher than 96kHz, correct? Anything else? If I know exactly what's missing, I can make ArcSoft default for those features.

Aside from LBR and XLL sample rates > 96 kHz, none I'm aware of. There are two outstanding known bugs: LFE channel mismatch with 96 kHz core+XLL samples and failure to parse XCH frames with strange audio mode. Both should be quite uncommon.

192k XLL seems to decode fine on my end, bit exact to ArcSoft at that. Can't say if that applies to all samples though, I only have one concert Blu-ray.

I'd never expect 192 kHz core+XLL to work. Are you sure it is really 192 kHz, not 96 kHz? Does it have DTS core?

madshi commented 9 years ago

Thanks. Oh, so the LFE channel problem is (probably) related to 96kHz? That's good to know, and makes the (potential) issue much less critical.

Is there an easy way for eac3to to detect which XCh frames make problems and which don't? E.g. is there a specific bit set at a specific offset of the XCh frame that I could look for, or something like that? I mean XCh is quite common, so I can't just reject all XCh tracks. Or maybe you could make the dcadec_context_parse() or dcadec_context_filter() APIs fail, if they encounter such a problematic XCh track? Then at least eac3to could abort processing properly, instead of outputting corrupted (or sub-quality) results.

Nevcairiel commented 9 years ago

I'd never expect 192 kHz core+XLL to work. Are you sure it is really 192 kHz, not 96 kHz? Does it have DTS core?

You are right, it doesn't work. I must've failed at testing earlier. I can extract a sample from this disc if you want it?

foo86 commented 9 years ago

Oh, so the LFE channel problem is (probably) related to 96kHz?

Yes, must be 96 kHz related.

Is there an easy way for eac3to to detect which XCh frames make problems and which don't?

No need for workaround, I will push a fix soon (just made some progress, what an obscure bug it was, DTS core design that requires searching for extension sync words really sucks).

You are right, it doesn't work. I must've failed at testing earlier. I can extract a sample from this disc if you want it?

No need to, I have a 192 kHz sample. It's just not trivial to implement.

madshi commented 9 years ago

Great - thank you! :)

madshi commented 9 years ago

P.S: One small request: It would be great if you make dcadec error out (for now) if there's anything unexpected. E.g. if there's some extension that you can't parse, or if an audio track looks to you like it's damaged/corrupted (except for a last incomplete frame, which you can safely ignore), I'd much prefer it dcadec refuses to decode at all.

The reason for this request is that it would make it much easier for eac3to users to run your decoder through some serious testing. If dcadec errors out if there's any indication of problems, I could provide you with the problematic audio tracks to look at. If dcadec doesn't error out but tries to do the best it can, nobody might notice that there's a problem, and potential bugs might not be fixed.

Once dcadec has some weeks/months of testing done to it by a wide user audience, it might then make sense to revert this behaviour and try to decode as best as you can. But for now I'd much much prefer straight erroring out.

That said, Nevcairiel might have the exact opposite wish, because he needs dcadec for real time playback. So maybe there should be an option to tell dcadec to either error out or try to do its best, when any kind of problems are detected?

ghost commented 9 years ago

Just make it an option. FFmpeg decoders have such flags too. Something like strict vs. lenient decoding.

foo86 commented 9 years ago

Agreed, having such option will be a definitely good idea.

foo86 commented 9 years ago

LFE channel interpolation issue at 96 kHz should be fixed now. I've also added strict decoding mode that can be enforced by setting DCADEC_FLAG_STRICT. It catches errors during extension decoding and returns failure instead of silently skipping them. In non-strict mode, fallback to lossy core decoding has been implemented when XLL has unsupported configuration.

madshi commented 9 years ago

Wonderful, thanks a lot! Are all three different 7.1 speaker configs already routed to 0x63f in the meanwhile? If so, I guess I'm good to go with a new eac3to release!

foo86 / dcadec

XLL decoding issues #8

define DCADEC_FLAG_CORE_BIT_EXACT 0x02

define DCADEC_FLAG_CORE_BIT_EXACT 0x02