google / ExoPlayer

This project is deprecated and stale. The latest ExoPlayer code is available in https://github.com/androidx/media
https://developer.android.com/media/media3/exoplayer
Apache License 2.0

Seeking in M4V file #4197

Closed petersamokhin closed 6 years ago

petersamokhin commented 6 years ago

Hello. I have a big M4V file made by concatenating the init part and many chunks of M4V video from a source MPEG-DASH broadcast. The file is fine: I can play it in any player on my Mac, read its metadata with ffmpeg, etc. ExoPlayer can play this video too.

But the duration is recognized incorrectly, and I can't seek in the video.

I thought that if I changed the necessary part of the file, everything would be OK.

I found that ExoPlayer gets the first sidx (container?) from the file (the file is handled by FragmentedMp4Extractor), then reads bytes [20..23] from the start of this sidx, and then uses this long value as the duration of the video in milliseconds.

I changed this value to my own, and now the displayed duration is correct when I watch the video. But I still can't seek.

After clicking any seek button, or swiping to any position on the seekbar, the video stops playing.

Is it ExoPlayer's bug or my headache? :)

  1. If ExoPlayer can correctly play the video, why can't it seek?

  2. What does ExoPlayer base seeking on? Why doesn't the correct duration help? (The value returned by SimpleExoPlayer.getDuration equals the value read from the sidx container in the file.)

  3. How can I change this file to make it work in ExoPlayer?

Thanks!

AquilesCanta commented 6 years ago

This issue is being closed because it does not adhere to our issue template, and/or because it omits information requested in the issue template that is required for us to investigate the problem efficiently. The issue template can be found here.

If you’re able to provide complete information as requested in the issue template, please do so below and we’ll re-open the issue. Thanks!

ojw28 commented 6 years ago

ExoPlayer uses the sidx box for seeking in FMP4. The sidx box doesn't just contain the duration. It provides a time->byte mapping for every segment in the file, which tells the player the byte offset from which it should request data when a seek is performed to a specified time.

For DASH streams, the manifest may provide an equivalent index (by listing each of the segments along with their durations, so the player knows from which segment it should request data).

The problem with what you're doing is that you're throwing away the latter, and not providing a complete sidx box to replace it. For the file to be seekable you'd need to construct a proper sidx box that provides a time->byte mapping for every segment.
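
As a rough illustration of what a time->byte mapping gives the player, here is a minimal Java sketch. The class and method names are made up for this example and are not ExoPlayer's API; it assumes one entry per segment, sorted by strictly increasing start time.

```java
import java.util.Arrays;

/** Hypothetical in-memory index: one entry per segment, sorted by start time. */
final class SegmentIndexSketch {
    private final long[] timesUs;  // start time of each segment, in microseconds
    private final long[] offsets;  // byte offset of each segment in the file

    SegmentIndexSketch(long[] timesUs, long[] offsets) {
        if (timesUs.length == 0 || timesUs.length != offsets.length) {
            throw new IllegalArgumentException("need matching, non-empty arrays");
        }
        this.timesUs = timesUs.clone();
        this.offsets = offsets.clone();
    }

    /** Index of the last segment that starts at or before timeUs (clamped to 0). */
    int segmentIndexFor(long timeUs) {
        int i = Arrays.binarySearch(timesUs, timeUs);
        if (i < 0) {
            // Not an exact hit: binarySearch returned -(insertionPoint) - 1,
            // and the segment we want is the one just before the insertion point.
            i = -i - 2;
        }
        return Math.max(i, 0);
    }

    /** Byte offset from which data should be requested for a seek to timeUs. */
    long byteOffsetFor(long timeUs) {
        return offsets[segmentIndexFor(timeUs)];
    }
}
```

With entries at 0 s, 1 s and 2 s, a seek to 1.5 s resolves to the byte offset of the second segment, so the player can request data from there instead of scanning the file.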

If ExoPlayer can correctly play the video, why can't it seek?

Being able to play the video isn't sufficient information to enable seeking. We need a time->byte mapping as explained above. For some formats we're starting to support a type of seeking that uses directed binary search into the stream, for when an index isn't provided, but we have no plans to support this for FMP4.

Is it ExoPlayer's bug or my headache? :)

Your headache, I'm afraid ;).

petersamokhin commented 6 years ago

@ojw28 I need to save and play large video parts (250-300 MB and more), but converting the video with ffmpeg or anything else takes too much time.

But all the players I tried on my Mac (including plain Google Chrome) can play and seek these videos normally. Android's MediaPlayer and a simple VideoView also can't seek these videos.

You mentioned a time->byte mapping. How can I add this mapping to the file?

I have an .mpd like this:


Source of file.mpd

And my result file is a simple concatenation of the init part and a lot of audio-$time parts without any modifications. So maybe I can put information from SegmentTimeline or another part of the mpd into the .m4v files? Or is that not what the time->byte mapping means?

ojw28 commented 6 years ago

How can I add this mapping to the file?

By building a proper sidx box at the start of the file. Alternatively by creating a DASH manifest that lists the downloaded segments, and treating it as DASH content. The second of these options is probably the easier of the two.

petersamokhin commented 6 years ago

@ojw28 I can't affect the DASH manifest :( Can you please tell me,

why (if ALL files must contain a proper sidx container) the duration of the entire video is displayed as the value from the first sidx box?

And what is a proper sidx box for ExoPlayer?

I'm asking this because many other players can play and seek this video — which means that my video's sidx boxes are proper for them and I need to adapt my files only for ExoPlayer.

ojw28 commented 6 years ago

As already explained above, the DASH manifest is clearly providing indexing functionality by listing all of the segments and their durations. You're throwing this information away, and not replacing it with anything else, when you concatenate the segments. A sidx box that would allow seeking in ExoPlayer would contain equivalent information (i.e. would have an entry for each segment, its time/duration and its byte offset in the concatenated file). If you look at the definition for a sidx box this should make sense.
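
To make "an entry for each segment, its time/duration and its byte offset" concrete, here is a minimal Java sketch that derives those entries for a concatenated file. All names are hypothetical. It assumes the per-segment durations are taken from the manifest (the d attributes) and the sizes are the byte lengths of the downloaded segment files, with the media segments following directly after the init part.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: derives one index entry per segment of a concatenated file. */
final class ConcatIndexSketch {

    /** One time->byte entry; illustrative only, not an ExoPlayer type. */
    static final class Entry {
        final long startTime;   // in timescale units, relative to the first segment
        final long duration;    // in timescale units
        final long byteOffset;  // offset of the segment's first byte in the concatenated file
        final long size;        // segment size in bytes

        Entry(long startTime, long duration, long byteOffset, long size) {
            this.startTime = startTime;
            this.duration = duration;
            this.byteOffset = byteOffset;
            this.size = size;
        }
    }

    /** Walks the segments in order, accumulating time and byte position. */
    static List<Entry> buildEntries(long initSize, long[] durations, long[] sizes) {
        if (durations.length != sizes.length) {
            throw new IllegalArgumentException("one duration and one size per segment");
        }
        List<Entry> entries = new ArrayList<>();
        long time = 0;
        long offset = initSize; // assumption: media segments start right after the init part
        for (int i = 0; i < sizes.length; i++) {
            entries.add(new Entry(time, durations[i], offset, sizes[i]));
            time += durations[i];
            offset += sizes[i];
        }
        return entries;
    }
}
```

The running sums are the point: once the segments are concatenated, neither the start time nor the byte offset of segment i can be recovered without knowing the durations and sizes of all segments before it.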

I'm asking this because many other players can play and seek this video — which means that my video's sidx boxes are proper for them and I need to adapt my files only for ExoPlayer.

There are less efficient ways to seek in media, such as directed binary search. It's quite likely that some other players implement this and so can seek (albeit not efficiently) in the concatenated file. This is not something we plan on supporting for FMP4.

petersamokhin commented 6 years ago

@ojw28 sidx box specifications:

aligned(8) class SegmentIndexBox extends FullBox("sidx", version, 0) {
    unsigned int(32) reference_ID;
    unsigned int(32) timescale;
    if (version == 0) {
        unsigned int(32) earliest_presentation_time;
        unsigned int(32) first_offset;
    } else {
        unsigned int(64) earliest_presentation_time;
        unsigned int(64) first_offset;
    }
    unsigned int(16) reserved = 0;
    unsigned int(16) reference_count;
    for(i=1; i <= reference_count; i++) {
        bit (1) reference_type;
        unsigned int(31) referenced_size;
        unsigned int(32) subsegment_duration;
        bit(1) starts_with_SAP;
        unsigned int(3) SAP_type;
        unsigned int(28) SAP_delta_time;
    }
} 

ExoPlayer's FragmentedMp4Extractor.parseSidx is this code, translated to Java.

What do I need to change or add in my files? earliest_presentation_time, subsegment_duration, or something else? Where do these values come from?

Is t="213067" the time for the time->byte mapping (which you mentioned) for each segment? If yes, where do I need to put it? If not, what do I need to get from the manifest?

<SegmentTimeline>
    <S t="213067" d="1000"/>
    <S t="214067" d="1000"/>
    <S t="215067" d="1000"/>
    <S t="216067" d="1000"/>
    <S t="217067" d="1000"/>
    <S t="218067" d="1000"/>
    <S t="219067" d="1000"/>
    <S t="220067" d="1000"/>
    <S t="221067" d="1000"/>
    <S t="222067" d="1000"/>
</SegmentTimeline>

petersamokhin commented 6 years ago

@ojw28 and I still think the error is in ExoPlayer's logic...

I studied your code (especially FragmentedMp4Extractor.parseSidx), and now I can say:

  1. earliest_presentation_time == t="213067" from the DASH manifest (and it's correct in my files). We could correct for the offset, because the broadcast won't always be saved from its start, but I don't think this is the cause of all the problems. Moreover, I made this correction in all files (before concatenating), and it didn't help.
  2. Each of my files contains 1 second of video. Its sidx container contains only one reference with referenceDuration = 1001, and earliest_presentation_time takes this into account. (All these files are then simply concatenated.)
  3. I tried changing referenceDuration from 1001 to 1001000, and that didn't help either.
  4. ExoPlayer reads these values (earliest_presentation_time, durations and others) correctly, so maybe the error is not in the sidx box or in the parsing?
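
For reference, sidx durations are expressed in timescale units, and the parser quoted later in this thread converts them to microseconds with Util.scaleLargeTimestamp. The Java sketch below is a simplified stand-in for that conversion, not ExoPlayer's implementation, and the timescale of 1000 used in the note after it is an assumption (the thread does not state the actual value).

```java
/** Hypothetical, simplified version of the timescale conversion. */
final class TimescaleSketch {
    static final long MICROS_PER_SECOND = 1_000_000L;

    /** Converts a value in timescale units to microseconds (no overflow guard). */
    static long toMicros(long units, long timescale) {
        return units * MICROS_PER_SECOND / timescale;
    }
}
```

Under that assumed timescale, a referenceDuration of 1001 already corresponds to 1,001,000 microseconds (about one second), while 1001000 would correspond to 1001 seconds per segment. Either way, changing the duration of a single reference adds no byte offsets for the other segments.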

Based on this, I think the sidx boxes of my files are correct. What did you mean by time->byte mapping? Which parameters of the sidx boxes do I need to change?

ojw28 commented 6 years ago

We expect a single sidx at the start of the file that contains references to every segment (i.e. reference_count should equal the number of segments).
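
A minimal Java sketch of serializing such a box, following the version 0 layout of the SegmentIndexBox definition quoted earlier in the thread. The class and method names are hypothetical. It assumes every value fits in 32 bits, that each referenced size is the full byte length of one concatenated segment, and that every segment starts with a stream access point (the SAP fields written here are an assumption, not something the thread confirms).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical builder for a single version 0 sidx box covering all segments. */
final class SidxBuilderSketch {
    private static final int HEADER_BYTES = 32;    // everything before the reference loop
    private static final int REFERENCE_BYTES = 12; // one entry of the reference loop

    static byte[] buildSidx(int referenceId, long timescale, long earliestPresentationTime,
            long firstOffset, int[] sizes, long[] durations) {
        if (sizes.length != durations.length || sizes.length > 0xFFFF) {
            throw new IllegalArgumentException("need matching arrays of at most 65535 entries");
        }
        int boxSize = HEADER_BYTES + REFERENCE_BYTES * sizes.length;
        ByteBuffer box = ByteBuffer.allocate(boxSize); // big-endian by default, as MP4 expects
        box.putInt(boxSize);                           // box size, including this header
        box.put("sidx".getBytes(StandardCharsets.US_ASCII));
        box.putInt(0);                                 // version 0, flags 0
        box.putInt(referenceId);                       // reference_ID
        box.putInt((int) timescale);
        box.putInt((int) earliestPresentationTime);
        box.putInt((int) firstOffset);
        box.putShort((short) 0);                       // reserved
        box.putShort((short) sizes.length);            // reference_count = number of segments
        for (int i = 0; i < sizes.length; i++) {
            box.putInt(sizes[i] & 0x7FFFFFFF);         // reference_type = 0, then referenced_size
            box.putInt((int) durations[i]);            // subsegment_duration
            box.putInt(0x90000000);                    // assumption: starts_with_SAP = 1, SAP_type = 1
        }
        return box.array();
    }
}
```

For roughly 3500 one-second segments, the resulting box would take 32 + 12 × 3500 = 42,032 bytes. In the box definition, first_offset is measured from the first byte after the sidx box, so it would be 0 if the box is written immediately before the first media segment.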

Think of a book index as an analogous structure. A book index is in one place, indexes the whole book, and tells you where everything is. Would that index be useful if instead it were scattered throughout the whole book, with a single entry in each place?

ojw28 commented 6 years ago

If you want to look at an example of what we expect from an FMP4 file, try inspecting this stream.

petersamokhin commented 6 years ago

@ojw28 thanks for the response! Your example stream contains no sidx boxes :(

You said:

single sidx at the start of the file that contains references to every segment

And your parser:

for (int i = 0; i < referenceCount; i++) {
    int firstInt = atom.readInt();

    int type = 0x80000000 & firstInt;
    if (type != 0) {
        throw new ParserException("Unhandled indirect reference");
    }
    long referenceDuration = atom.readUnsignedInt();

    sizes[i] = 0x7FFFFFFF & firstInt;
    offsets[i] = offset;

    // Calculate time and duration values such that any rounding errors are consistent. i.e. That
    // timesUs[i] + durationsUs[i] == timesUs[i + 1].
    timesUs[i] = timeUs;
    time += referenceDuration;
    timeUs = Util.scaleLargeTimestamp(time, C.MICROS_PER_SECOND, timescale);
    durationsUs[i] = timeUs - timesUs[i];

    // Skip starts_with_SAP, SAP_type and SAP_delta_time (4 bytes).
    atom.skipBytes(4);
    offset += sizes[i];
}

firstInt == referenced_size as far as I understand?
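
Going by the box definition quoted above, the first 32-bit word of each reference packs two fields: the top bit is reference_type and the low 31 bits are referenced_size. A small Java sketch of that split, with hypothetical names:

```java
/** Hypothetical helpers for the first 32-bit word of a sidx reference. */
final class SidxReferenceSketch {

    /** Top bit: 0 means the reference points at media, 1 at another sidx box. */
    static int referenceType(int firstInt) {
        return firstInt >>> 31;
    }

    /** Low 31 bits: referenced_size, in bytes. */
    static int referencedSize(int firstInt) {
        return firstInt & 0x7FFFFFFF;
    }
}
```

So firstInt equals referenced_size only when reference_type is 0, which is the only case the quoted parser accepts. The number of entries is a separate field, reference_count.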

For example, I have ~3500 .m4v files (each ~66-67 kB in size and 1001 ms in duration). If I simply concatenate all these files, the resulting file is playable but not seekable in ExoPlayer. Each file contains one sidx box (byte range [28..72], 44 bytes, if I'm not mistaken). Do I need to add info about all the other files to the sidx box of the first (non-init) file? referenced_size (firstInt) in the first file should be 3500, each reference should have referenceDuration == 1001, and size (from the sizes[i] array) should be ~65-66 kB (is that right? Or the size without headers and the rest?). And then should I simply concatenate the init file, this first file (modified as described above), and the other files without modifying them?

ojw28 commented 6 years ago

Your example stream contains no sidx boxes :(

It contains a single sidx box that references all of the segments, which is what we need for FMP4 to be seekable. Anything involving multiple sidx boxes distributed throughout the content is not what we need.