simon-fu / mp4v2

Automatically exported from code.google.com/p/mp4v2
Other
0 stars 0 forks source link

Size of sound atom should be 36 bytes instead of 34 #26

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Create a alaw atom by calling InsertChildAtom(parent, "alaw", 0)
2. Play with VLC and see in the debug messages that the alaw atom is
skipped because it has a size of 34 bytes
3.

What is the expected output? What do you see instead?
I expected to hear audio, but VLC doesn't parse the alaw atom because the
size is too short. Windows Media Player even crashes when playing the file.

What version of the product are you using? On what operating system?
Version 1.9.0

Please provide any additional information below.
I created a .mp4 file with MPEG-4 video and ALAW audio, but VLC was not
able to play the audio. WMP even crashes when I wanted to play the file
with it. After debugging with the source code of VLC, I saw it was not
parsing the alaw atom, because its size was too short (34 bytes). I
transcoded another movie with FFMPEG to .mp4 with MPEG-4 video and ALAW
audio. The alaw atom in this file was 36 bytes, and I was able to play this
file with VLC and WMP. 

For now, I add a 2-byte long reserved field in each created sound atom.
This works for me...

Original issue reported on code.google.com by joostvan...@gmail.com on 13 Jul 2009 at 11:29

GoogleCodeExporter commented 9 years ago

Original comment by eddygroe...@gmail.com on 13 Jul 2009 at 11:37

GoogleCodeExporter commented 9 years ago
Any chance of a patch for this so that I don't have to redo what you have done 
already?

Did you check the specs for the alaw atom to see what the missing attributes 
are - rather than just padding the 
atom?

Original comment by eddygroe...@gmail.com on 13 Jul 2009 at 11:49

GoogleCodeExporter commented 9 years ago
Didn't check the specs for alaw atoms, couldn't find any.  So I just added a 
dummy
field of 2 bytes in the constructor of MP4SoundAtom (and yes, its quick and 
dirty,
but it works in my case):

// add a dummy field of 2 bytes, because this atom should have a size of 36 
bytes
AddReserved("dummy_field", 2);

Original comment by joostvan...@gmail.com on 14 Jul 2009 at 9:33

GoogleCodeExporter commented 9 years ago
We can't just add an arbitrary 2 bytes without knowing where and what for.

Look at MP4SoundAtom::MP4SoundAtom() for how we construct the sound atom for 
the different types. These are all based 
on the QT specs 
(http://www.monen.nl/DevDoc/documentation/QuickTime/QTFF/QTFFChap3/chapter_4_sec
tion_3.html) for 
a sound atom.

Note that there are two versions of the sound atom documented, but there is 
also a version 2, not sure where that is 
specified off hand.

AFAIK we are writing version 0 atoms by default.

void MP4SoundAtom::Generate()
{
    ...
    ((MP4Integer16Property*)m_pProperties[2])->SetValue(0);
    ...
}

I couldn't find anything special that is added for an alaw sound atom in the 
spec (see the section for "IMA, uLaw, and aLaw") - 
although it does sound as if it needs an extra property added, but I don't want 
to add it without knowing for certain.
 And if the problem is with all our Sound Atoms, then again I'd want to see where we diverge from what is in the spec.

Cheers, Ed.

Original comment by eddyg.o....@gmail.com on 28 Jul 2009 at 11:52

GoogleCodeExporter commented 9 years ago
I think there is a fundamental problem with the sound atom. According to the 
spec,
there is a 16 bit field called compressionID that sits between the sampleSize 
and
packetSize fields. This is missing in the sound atom constructor. This would 
explain
why the funky "reserved2" and "reserved3" fields are needed in the mp4a and the 
alac
cases. It is also causing the timescale to come out "right" by accident. Once 
this
was fixed, another minor change was required in AddAudioTrack to shift the time 
base
left 16 like it should be. 

I have attached a version of atom_sound.c with this problem fixed. The changes 
are
all in the mp4SoundAtom constructor and the AddProperties function. 

I also attached mp4file.cpp. There is a one line change in it at line 1236.

Original comment by DonEdval...@gmail.com on 1 Jun 2010 at 12:44

Attachments:

GoogleCodeExporter commented 9 years ago
Don,

Interesting--could you attach the specification you're referencing so I can 
verify? 
I just want to double check your changes, but some preliminary investigation I 
did
seems to validate your findings.

Thanks!

Original comment by kid...@gmail.com on 1 Jun 2010 at 4:47

GoogleCodeExporter commented 9 years ago
Kona's comments - as usual there is some spec smash between QTFF and 
ISO-14496-12;

QTFF: reference pages 76,134
- field_a: Reserved Six bytes that must be set to 0.
- field_b: Data reference index A 16-bit integer that contains the index of the 
data reference to use to retrieve data associated with samples that use this 
sample description.
- field_c: Version A 16-bit integer that holds the sample description version 
(currently 0 or 1).
- field_d: Revision level A 16-bit integer that must be set to 0.
- field_e: Vendor A 32-bit integer that must be set to 0.
- field_f: Number of channels A 16-bit integer that indicates the number of 
sound channels used by the sound sample. Set to 1 for monaural sounds, 2 for 
stereo sounds. Higher numbers of channels are not supported.
- field_g: Sample size (bits) A 16-bit integer that specifies the number of 
bits in each uncompressed sound sample. Allowable values are 8 or 16. Formats 
using more than 16 bits per sample set this field to 16 and use sound 
description version 1.
- field_h: Compression ID A 16-bit integer that must be set to 0 for version 0 
sound descriptions. This may be set to –2 for some version 1 sound 
descriptions; see “Redefined Sample Tables” (page 135).
- field_i: Packet size A 16-bit integer that must be set to 0.
- field_j: Sample rate A 32-bit unsigned fixed-point number (16.16) that 
indicates the rate at which the sound samples were obtained. The integer 
portion of this number should match the media’s time scale. Many older 
version 0 files have values of 22254.5454 or 11127.2727, but most files have 
integer values, such as 44100. Sample rates greater than 2^16 are not supported.
Version 0 of the sound description format assumes uncompressed audio in 'raw ' 
or 'twos' format, 1 or 2 channels, 8 or 16 bits per sample, and a compression 
ID of 0.

(See the QTFF spec for additional fields used when field_e (version) is 1)

ISO 14496-12: see SampleEntry and AudioSampleEntry
- const unsigned int(8)[6] reserved = 0;
- unsigned int(16) data_reference_index;
- const unsigned int(32)[2] reserved = 0;
- template unsigned int(16) channelcount = 2;
- template unsigned int(16) samplesize = 16;
- unsigned int(16) pre_defined = 0;
- const unsigned int(16) reserved = 0 ;
- template unsigned int(32) samplerate = { default samplerate of media}<<16;

It's clear that the sound atom should have these extra two bytes, and also the 
sample rate should be shifted 16 bits.  These changes should be in changeset 
#388.

One question: do we also need to update AddAC3AudioTrack and AddEncAudioTrack 
with the bitshift change?

Original comment by kid...@gmail.com on 10 Jun 2010 at 7:00

GoogleCodeExporter commented 9 years ago
Kidjan,

I was going to get back to you with the specification, but it looks like you 
already got it. To answer your question, I do believe the other two functions 
need to be updated also.

Don

Original comment by DonEdval...@gmail.com on 15 Jun 2010 at 3:58

GoogleCodeExporter commented 9 years ago
I checked out AddAC3AudioTrack, and I'm not so sure it needs to be corrected.  
The atom definition looks sort of like:

 AddReserved("reserved1", 6); /* 0 */

    AddProperty( /* 1 */
        new MP4Integer16Property("dataReferenceIndex"));

    AddReserved("reserved2", 8); /* 2 */

    AddProperty( /* 3 */
        new MP4Integer16Property("channelCount"));

    AddProperty( /* 4 */
        new MP4Integer16Property("sampleSize"));

    AddReserved("reserved3", 4); /* 5 */

    AddProperty( /* 6 */
        new MP4Integer16Property("samplingRate"));

    AddReserved("reserved4", 2); /* 7 */

...which is actually already sort of right, unless someone wants a fractional 
sampling rate (let's hope not...)  Likewise, when it goes to set the 
samplingRate, it does:

MP4Integer16Property* pSampleRateProperty = NULL;
    FindIntegerProperty(
        MakeTrackName(trackId, "mdia.minf.stbl.stsd.ac-3.samplingRate"),
        (MP4Property**)&pSampleRateProperty);
    if (pSampleRateProperty) {
        pSampleRateProperty->SetValue(samplingRate);
    } else {
        throw new MP4Error("no property", "ac-3.samplingRate");
    }

...a 16 bit value.  So it's basically ignoring the fractional part entirely, 
which is probably okay, and I think it should be correct?

Thoughts?

Original comment by kid...@gmail.com on 9 Oct 2010 at 9:16

GoogleCodeExporter commented 9 years ago
FWIW, Apple release an updated version of QTFF spec which describe Sound v2 
atom:

http://developer.apple.com/library/mac/documentation/QuickTime/QTFF/QTFFChap3/qt
ff3.html#//apple_ref/doc/uid/TP40000939-CH205-128916

Original comment by jddu...@gmail.com on 26 Jul 2011 at 10:03