Open dwsinger opened 4 years ago
We believe this should be fixed. We suggest:
removing the notion of presentation file
and rewriting 6.1.1, 6.1.2 with the following text:
"When represented according to the format defined in this part of the standard, a presentation may be stored in a single file or in multiple files, or it may even be delivered without the bytes being written in a file, for instance when streamed over a network and consumed on the fly. When split over multiple files, two different splitting options exist. In one option, one file contains the metadata for the whole presentation, and is formatted to this specification. The other files are not required to be formatted to this specification. They are used to contain media data, and may also contain unused media data, or other information. The format of these other files is constrained by this specification only in that the media data in them must be capable of description by the metadata defined in this specification. These other files may be ISO files, image files, or other formats. Only the media data itself, such as JPEG 2000 images, is stored in these other files; all timing and framing (position and size) information is in the ISO base media file, so the ancillary files are essentially free-format. If an ISO file contains hint tracks, the media tracks that reference the media data from which the hints were built shall remain in the file, even if the data within them is not directly referenced by the hint tracks; after deleting all hint tracks, the entire un-hinted presentation shall remain. Note that the media tracks may, however, refer to external files for their media data. In a second option, the media data is distributed over multiple files conformant to this specification. A first file contains some metadata valid for the whole presentation and possibly some media data and some metadata valid for a first part of the presentation. It also describes that additional files may be present. These additional files describe media and metadata for successive parts of the presentation. In more complex scenarios, the two options could be combined. 
In this specification, some boxes (called top-level boxes) are indicated as being at ‘file’ level, with the notation “Container: File”. This file corresponds to the single file when no other files are used or, when multiple files are used, to the virtual file formed by concatenating the file containing the metadata for the first part of the presentation with the other ISOBMFF-compliant files, in presentation order."
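To make the "virtual file" idea concrete, here is a minimal Python sketch of how a reader could enumerate the top-level boxes of several ISOBMFF files taken in presentation order. It is only an illustration of the proposed wording, not text for the spec. The helper names (`iter_top_level_boxes`, `virtual_top_level_boxes`, `make_box`) are made up for this example, and it walks each file separately rather than the raw byte concatenation, on the assumption that a box with size 0 ("extends to end of file") should stop at the end of its own file.

```python
import struct
from typing import Iterator, List, Tuple


def iter_top_level_boxes(data: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (box type, offset, size) for each top-level box in one file."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit 'largesize' follows the type field.
            if offset + 16 > len(data):
                raise ValueError(f"truncated largesize at offset {offset}")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the file.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            raise ValueError(f"malformed box at offset {offset}")
        yield box_type.decode("latin-1"), offset, size
        offset += size


def virtual_top_level_boxes(files: List[bytes]) -> Iterator[Tuple[int, str]]:
    """Yield (file index, box type) over all files in presentation order.

    Assumption for this sketch: the caller already knows the order.
    """
    for index, data in enumerate(files):
        for box_type, _offset, _size in iter_top_level_boxes(data):
            yield index, box_type


def make_box(box_type: str, payload: bytes = b"") -> bytes:
    """Build a box with a 32-bit size field (hypothetical test helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("latin-1")) + payload


if __name__ == "__main__":
    # First file: metadata for the whole presentation; second file: a later part.
    first = make_box("ftyp", b"isom" + b"\x00" * 4) + make_box("moov")
    second = make_box("moof") + make_box("mdat", b"\x00" * 4)
    for index, box_type in virtual_top_level_boxes([first, second]):
        print(index, box_type)
```

With this view, a check such as "exactly one MovieBox" applies to the whole sequence yielded by `virtual_top_level_boxes`, not to each file individually.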
"presentation metadata wrapper" (1 occurrence)
In "6.1.2 Object Structure", it says:
"The sequence of objects in the file shall contain exactly one presentation metadata wrapper (the MovieBox)."
We suggest replacing it with:
"The sequence of objects in the file shall contain exactly one MovieBox."
The sentences using this term can easily be removed or the term replaced by "media information".
Just edited the above comments for readability.
This is improved in the 7th edition, but more work is needed. A 'pull request' (edited file) would be appreciated.
The term "presentation" is used throughout the spec (~180 times) with different meanings. This is confusing. We propose to clarify it when possible and to replace it in other cases.
When used standalone, it usually means "rendering" or "a set of related media" as in the introduction:
It is currently defined as follows:
This definition is outdated. We suggest replacing the definition with the simple:
We also suggest rephrasing the introduction, which uses 'presentation' too many times.
There is no formal term in the definition section, but the semantics of the 'tfra' box (8.8.10.3) say:
We suggest moving this text into the definition clause as a term definition.
We find also that:
which seems consistent with our understanding and the definition above. We suggest moving that text into the definition clause as a note. But in Annex A, A.4, we find:
which confuses "time stamp" with "time" and, more importantly, "decoding time stamp" with "presentation time".
We suggest fixing that sentence as follows, by replacing:
with:
Sometimes the term "movie presentation time" is used. We suggest removing "movie" (or always using it) as the "presentation time" is indeed in a "movie time".
The terms "earliest presentation time" and "end presentation time" are also used, but they don't seem ambiguous as they do consider the movie timeline (i.e. with edit list).
The different sections about RTP use the term "presentation time stamp" with a different meaning:
This is wrong and should be fixed.
We suggest defining the term "composition order" (or "output order") as above and using it consistently. Similarly, we suggest defining "decoding order" and using it consistently (versus "decode order").
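To illustrate why "decoding order" and "composition order" need separate, consistent definitions, here is a small Python sketch. It assumes the usual relation between the decoding time table and the composition offsets, CT(n) = DT(n) + CTTS(n), with all values in media timescale units. The function names and the sample values are made up for this example, and mapping composition times to presentation times on the movie timeline (edit lists) is deliberately left out.

```python
from typing import List, Tuple


def decode_and_composition_times(
    durations: List[int], offsets: List[int]
) -> List[Tuple[int, int]]:
    """Return (DT, CT) per sample, with samples listed in decoding order.

    DT(n+1) = DT(n) + duration(n) and CT(n) = DT(n) + offset(n).
    """
    times = []
    dt = 0
    for duration, offset in zip(durations, offsets):
        times.append((dt, dt + offset))
        dt += duration
    return times


def composition_order(times: List[Tuple[int, int]]) -> List[int]:
    """Return sample indices (in decoding order) sorted by composition time."""
    return sorted(range(len(times)), key=lambda i: times[i][1])


if __name__ == "__main__":
    # Hypothetical I, P, B, B stream stored in decoding order.
    times = decode_and_composition_times([10, 10, 10, 10], [10, 30, 0, 0])
    print(times)
    print(composition_order(times))
```

In this example the second sample in decoding order (the P frame) comes last in composition order, which is exactly the case where "decoding time stamp" and "presentation time" must not be used interchangeably.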
Similarly, in 11.2, it says:
But in 6.1.2, the COR mixes this term with the notion of segment: