Open mbunkus opened 3 years ago
EbmlUInteger
was just an example. All the other data type classes don't handle that case either.
I can confirm this bug.
Luckily it is fixable without changing the ABI. There is DefaultISset()
to tell if an element has a default value. And when the size is zero we should use that default value DefaultValue
instead of 0/"".
It seems libebml2 (used in mkclean and mkvalidator) has the same issue with a twist, it's possible to set elements to their default value when they are created. Except it's not used...
Matroska elements affected by this bug:
Elements with a default value of 0 are not affected.
BTW, are there files existing with a Length of 0 (to save space) ? Does mkvmerge use this feature for example ? It doesn't seem possible with libebml, even though the EbmlElement::SizeLength
variable is initialized to 0. That variable is used exclusively with CodedSizeLength()
, even in libmatroska.
In that function it's used as
if (SizeLength > 0 && CodedSize < SizeLength) {
// defined size
CodedSize = SizeLength;
}
Where CodedSize
is always 1 and 5. So a EbmlElement::SizeLength
value of 0 has no effect here. And the function never returns 0.
mkvmerge
does not use that feature. I don't know if there's any implementation that does. I'm not even sure if the original specs even say what value to use in this case. My guess is that all existing implementations simply return 0 for numeric types & an empty string for string types. Just like libebml.
I haven't seen a single file using this shortcut for elements-with-default, at least not consciously. I have very vague memory of seeing files which wanted to store 0 / the empty string, though I really don't have any more details on that.
Calling out @dericed who has a huge corpus of MKV files he can check. @mkver do you know if libavformat uses the "zero length" feature when muxing ?
Our website never explained how it's supposed to be used, nor the dedicated libebml/EBML website.
The known basic types are:
This is good as it means that this new feature can't break old valid files (at least for integers and floats).
So that should leave integers out of this issue. We can assume they were never written with a 0 length. It may not be the case for all muxers but they're probably not read correctly with well known demuxers.
That leaves \Segment\Tracks\TrackEntry\Language
as the only element that can potentially have issues. I know in VLC we assign the default value to the variable we read it from. If the size is 0 that value wouldn't change.
In libavformat it seems to be similar. The default value is first assigned and then the actual data read.
This element being such an important element I can hardly imagine a demuxer would fail to use its default value properly when reading, especially since in many cases the value is not written. And it's highly unlikely a length of 0 would be used.
In section "6.1. Data Size Format" the EBML RFC states:
At the moment this simply isn't done. For example, the
EbmlUInteger::ReadData
function always uses 0 if it's an Empty Element. This works for elements whose default value is 0, of course, but not for others.BTW: the
EbmlUInteger::RenderData
function doesn't take default values into account either. To be honest, I prefer it that way as it will be less confusing to not fully spec-compliant EBML readers (such as libebml in its current state).