shaka-project / shaka-player

JavaScript player library / DASH & HLS client / MSE-EME player
Apache License 2.0

Ignore unparseable streams in HLS if there is an alternative available #1665

Closed Lastique closed 1 year ago

Lastique commented 5 years ago

Have you read the FAQ and checked for duplicate open issues?: Yes.

What version of Shaka Player are you using?: Reproduces on 2.5.0-beta2 and 2.4.5.

Can you reproduce the issue with our latest release version?: Yes.

Can you reproduce the issue with the latest code from master?: I didn't try.

Are you using the demo app or your own custom app?: Custom app.

If custom app, can you reproduce the issue using our demo app?: I can't try as the stream is not accessible publicly.

What browser and OS are you using?: Kubuntu 18.10, Chrome 70.0.3538.102, Firefox 63.0.

What are the manifest and license server URIs?: No license server. The test page, manifest and media segments are attached.

What did you do? Open the test page and press play.

What did you expect to happen? The video should start playing.

What actually happened? The player fails to parse init segments with this error:

    Unable to find timescale in init segment!
        shaka.hls.HlsParser.prototype.getStartTimeFromMp4Segment  hls_parser.js:1488:4
        shaka.hls.HlsParser.prototype.getStartTime_/<             hls_parser.js:1431:13

    load() failed: Shaka Error MANIFEST.HLS_COULD_NOT_PARSE_SEGMENT_STARTTIME ()
    Object { severity: 2, category: 4, code: 4030, data: [], handled: false,
             message: "Shaka Error MANIFEST.HLS_COULD_NOT_PARSE_SEGMENT_STARTTIME ()" }
    stack:
        shaka.util.Error@https://cdnjs.cloudflare.com/ajax/libs/shaka-player/2.5.0-beta2/shaka-player.compiled.debug.js:97:784
        shaka.hls.HlsParser.prototype.getStartTimeFromMp4Segment@https://cdnjs.cloudflare.com/ajax/libs/shaka-player/2.5.0-beta2/shaka-player.compiled.debug.js:620:450
        shaka.hls.HlsParser.prototype.getStartTime_/<@https://cdnjs.cloudflare.com/ajax/libs/shaka-player/2.5.0-beta2/shaka-player.compiled.debug.js:619:26

    Error code 4030  index-shaka.html:66:3

The stream is generated by ffmpeg 4.1.

I'm not familiar with the shaka-player code base, but is it possible that https://github.com/google/shaka-player/blob/1831ce95e272790c288516f6319f07b70d768a9a/lib/hls/hls_parser.js#L1473-L1485 is missing code to skip the 24-bit flags field? See https://github.com/FFmpeg/FFmpeg/blob/752659327d4ac73640781376d214a26765f971f4/libavformat/movenc.c#L2720.

dash.tar.gz

joeyparrish commented 5 years ago

@Lastique, in the ffmpeg code you linked to:

    ffio_wfourcc(pb, "mdhd");
    avio_w8(pb, version);
    avio_wb24(pb, 0); /* flags */

ffmpeg writes the four-byte box type "mdhd", followed by the 1-byte version and 3-byte flags.

In our MDHD parsing code, we're using an abstraction that takes care of type, version, and flags before calling the callback (source permalink):

  let type = reader.readUint32();
  // ...
      let versionAndFlags = reader.readUint32();
      version = versionAndFlags >>> 24;
      flags = versionAndFlags & 0xFFFFFF;
  // ...
    let box = {
      parser: this,
      // ...
      version: version,
      flags: flags,
      reader: payloadReader,
      // ...
    };

So I don't think flags could be the problem.
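For context, here is a minimal sketch (not Shaka's actual code; the function name is mine) of where the timescale lives in an `mdhd` payload once the box abstraction has already consumed the size, type, version, and flags. The offset depends only on the box version, since version 1 uses 64-bit creation/modification times:

```javascript
// Minimal sketch: locate the timescale in an mdhd box payload, assuming
// the 4-byte size, 4-byte type, and 1+3-byte version/flags have already
// been consumed (as Shaka's box-parsing abstraction does).
function readMdhdTimescale(payload, version) {
  const view = new DataView(
      payload.buffer, payload.byteOffset, payload.byteLength);
  // Version 0: 32-bit creation + modification times precede the timescale.
  // Version 1: those fields are 64-bit each.
  const offset = version === 1 ? 8 + 8 : 4 + 4;
  return view.getUint32(offset);  // big-endian, per ISO BMFF
}

// Example: a version-0 payload with timescale 90000 at offset 8.
const payload = new Uint8Array(20);
new DataView(payload.buffer).setUint32(8, 90000);
console.log(readMdhdTimescale(payload, 0));  // 90000
```

So if the parser walks to the `mdhd` box and still can't read a timescale, the box itself was never found, which points at the segment rather than the flags handling.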

I can reproduce the issue with the sample you attached (thanks!), but so far I don't see anything wrong in the code. Could it be that the error message ("Unable to find timescale in init segment!") is accurate? Could your init segment be malformed?

joeyparrish commented 5 years ago

I tested your init segments in this online inspection tool: http://thumb.co.il/

I found that init-stream1.m4s and init-stream2.m4s are both corrupt.

Can you tell us how these were generated? What tools did you use?

joeyparrish commented 5 years ago

Sorry, I see now you already mentioned that ffmpeg generated these.

joeyparrish commented 5 years ago

Looking more closely, I see what the problem is now:

$ file bug-assets/1665/media/init-stream*
bug-assets/1665/media/init-stream0.m4s: ISO Media
bug-assets/1665/media/init-stream1.m4s: WebM
bug-assets/1665/media/init-stream2.m4s: WebM
bug-assets/1665/media/init-stream3.m4s: ISO Media
bug-assets/1665/media/init-stream4.m4s: WebM
bug-assets/1665/media/init-stream5.m4s: WebM
bug-assets/1665/media/init-stream6.m4s: WebM

You can't put WebM in HLS. You also can't pretend that WebM is MP4. :-)
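For anyone hitting a similar problem, a quick way to check this yourself (a hand-rolled sketch, not part of Shaka) is to look at the same magic bytes the `file` tool uses: WebM begins with the EBML header `0x1A45DFA3`, while an MP4 init segment carries an `ftyp` box type at byte offset 4:

```javascript
// Sketch: distinguish WebM from MP4 segments by magic bytes, regardless
// of the (possibly misleading) file extension.
function sniffContainer(bytes) {
  // WebM/Matroska files start with the 4-byte EBML header.
  if (bytes[0] === 0x1a && bytes[1] === 0x45 &&
      bytes[2] === 0xdf && bytes[3] === 0xa3) {
    return 'webm';
  }
  // ISO BMFF files start with a box whose 4-char type follows the size.
  const type = String.fromCharCode(bytes[4], bytes[5], bytes[6], bytes[7]);
  if (type === 'ftyp' || type === 'styp' || type === 'moov') {
    return 'mp4';
  }
  return 'unknown';
}

console.log(sniffContainer(
    new Uint8Array([0x1a, 0x45, 0xdf, 0xa3, 0, 0, 0, 0])));  // "webm"
```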

Lastique commented 5 years ago

I think it may be because some of the representations are webm segments.

Lastique commented 5 years ago

Is it possible to ignore the webm segments in the player and use mp4 streams? I was assuming the player would ignore the streams it can't play.

BTW, the webm segments are probably present because ffmpeg generates both DASH and HLS manifests for a common set of media segments, and DASH can contain both stream types. Apparently, ffmpeg does not filter the WebM streams out of the HLS manifest.

joeyparrish commented 5 years ago

The player ignores any streams the browser can't play, but that's a separate step that occurs after parsing the manifest. So that code isn't even in play. The problem with your WebM in HLS occurs at the level of parsing the content, because of two things:

  1. We can't extract timestamps from WebM in HLS (yet)
  2. There is nothing in your HLS playlist to signal that it's WebM and not MP4

The fact that your WebM segments all say .m4s on the end is misleading the HLS parser into thinking they are MP4s. There isn't any MIME type in HLS to tell us otherwise.

This is not content we're going to be able to support. If you can, I would recommend that you file a bug on ffmpeg to ask them to either:

  1. Use appropriate file names for WebM
  2. or filter out WebM from the HLS output (since WebM in HLS isn't much of a thing)
  3. or both

Does this help?

Lastique commented 5 years ago

I've already submitted an ffmpeg patch to support different file name extensions (.webm and .mp4) for different segment types, but I haven't tested shaka-player with this new behavior yet. I will test it in a day or two and report back.

joeyparrish commented 5 years ago

Great, thanks!

With the file names corrected, I expect you will still run into HLS_COULD_NOT_PARSE_SEGMENT_START_TIME, but for a different reason:

    if (mimeType == 'video/mp4' || mimeType == 'audio/mp4') {
      return this.getStartTimeFromMp4Segment_(
          responses[0].data, responses[1].data);
    } else if (mimeType == 'audio/mpeg') {
      // There is no standard way to embed a timestamp in an mp3 file, so the
      // start time is presumably 0.
      return 0;
    } else if (mimeType == 'video/mp2t') {
      return this.getStartTimeFromTsSegment_(responses[0].data);
    } else if (mimeType == 'application/mp4' ||
               mimeType.startsWith('text/')) {
      return this.getStartTimeFromTextSegment_(
          mimeType, codecs, responses[0].data);
    } else {
      // TODO: Parse WebM?
      // TODO: Parse raw AAC?
      throw new shaka.util.Error(
          shaka.util.Error.Severity.CRITICAL,
          shaka.util.Error.Category.MANIFEST,
          shaka.util.Error.Code.HLS_COULD_NOT_PARSE_SEGMENT_START_TIME);
    }

We don't have a WebM timestamp parser yet, and until we solve #1558, we will continue trying to parse the first segment of each playlist. Ideally, we should be able to make certain simplifying assumptions about alignment that would allow us to get away with fetching and parsing only certain segments. We might be able to assume timestamps in WebM segments are the same as corresponding MP4 segments. But that's a separate issue tracked in #1558.

Lastique commented 5 years ago

Yes, I can confirm that with different file extensions I get HLS_COULD_NOT_PARSE_SEGMENT_START_TIME error on loading the manifest. I've attached the new media contents in case you need it for testing. I'll see if I can filter out the webm streams from HLS manifest.

Still, I think the player should just ignore the streams it can't play, including those that cannot be interpreted before testing what the browser supports. From the user standpoint, this is really the same thing.

media.tar.gz

Lastique commented 5 years ago

We might be able to assume timestamps in WebM segments are the same as corresponding MP4 segments.

I think that would be a wrong assumption to make, if I understood you correctly. You can see in my media examples that different streams use different time bases, and different segments of different media streams also have different durations.

Lastique commented 5 years ago

I've patched ffmpeg so that it only saves mp4 streams in the HLS manifest, and now shaka-player is able to play the stream. Thanks for your help!

I'll keep the bug open, though, since I still think it would be useful if shaka-player ignored the streams it can't play.

joeyparrish commented 5 years ago

@joeyparrish wrote:

We might be able to assume timestamps in WebM segments are the same as corresponding MP4 segments.

@Lastique wrote:

I think that would be a wrong assumption to make, if I understood you correctly. You can see in my media examples that different streams use different time bases, and different segments of different media streams also have different duration.

We can't make that assumption for live content, but it is a valid assumption for VOD. For VOD, we may assume that all playlists are aligned, and we only need to get the timestamp of the first segment from one playlist. Quoting https://tools.ietf.org/html/rfc8216#section-6.2.4 :

If the Playlist contains an EXT-X-PLAYLIST-TYPE tag with the value of VOD, the first segment of every Media Playlist in every Variant Stream MUST start at the same media timestamp.

If ffmpeg generates VOD content where this is not true, then it is not holding to the spec. We can't expend a lot of resources supporting non-compliant content. However, if it's live, then that seems to be fine WRT the spec.


@Lastique wrote:

Still, I think the player should just ignore the streams it can't play, including those that cannot be interpreted before testing what the browser supports. From the user standpoint, this is really the same thing.

I think ignoring these streams is not as simple as it seems. For example, if the only audio stream is raw AAC, and we can't play it, that should still be an error, as opposed to a silent video. If there were multiple alternative codecs available, ignoring some and playing the others would make sense.

If some fundamental part of the content can't be played (no audio or no video), then an error is more appropriate. Otherwise, a developer might miss our lack of support during integration testing and be surprised by customer reports later. I would rather let developers choose another project early than be surprised and angry about this one after launch.

So if we can ignore some streams and still play others, I think you're right. We should. So I'm going to leave this open as an enhancement.

That said, our HLS parser is a bit complicated at the moment, and implementing this behavior will not be easy until after we refactor. So I can't promise that it will be done soon. For now, this will go into our backlog.

Lastique commented 5 years ago

If ffmpeg generates VOD content where this is not true, then it is not holding to the spec.

I don't see EXT-X-PLAYLIST-TYPE in the HLS manifests produced by the ffmpeg dash writer. There is a separate hls writer that does make use of this tag, but I'm not using it currently, and I believe the tag is optional there as well (i.e. by default it will not be present either).

In practice, it is often difficult to guarantee that multiple streams start with the same timestamp, unless you're willing to introduce a synchronization error. When you're working with encoded content, video has to start with a keyframe, and keyframes are typically sparse and not guaranteed to have the same timestamp as any of the corresponding audio frames. To have audio and video synchronized and starting at the same timestamp, you typically have to re-master and re-encode the content, which is undesirable and in some contexts impractical (e.g. in realtime media processing).

So, while the spec requires VOD streams to have the same initial timestamps, I would be reluctant to rely on this, and would expect that either the guarantee doesn't hold or EXT-X-PLAYLIST-TYPE=VOD is not used often. In other words, you have to have a backup plan for when the stream is not VOD and does not start with the same timestamps, and at that point you might as well apply the same logic to VOD.

joeyparrish commented 5 years ago

I don't see EXT-X-PLAYLIST-TYPE in the HLS manifests produced by ffmpeg dash writer.

If it's not present, then it's a live stream. It is required for VOD or EVENT type streams.

In practice, it is often difficult to guarantee that multiple streams start with the same timestamp, unless you're willing to introduce a synchronization error. When you're working with encoded content, video has to start with a keyframe, which are typically sparse and not guaranteed to have the same timestamp as any of the corresponding audio frames.

So, while the spec requires VOD streams to have the same initial timestamps, I would be reluctant to rely on this...

We have to rely on the specs. Interoperating with arbitrary encoders and packagers and browsers is infeasible otherwise. Coding against the non-compliant behavior of dozens of things outside of your control is a losing proposition.

But take another look at what this requirement means. For VOD, every segment must be present anyway. VOD and EVENT playlists may not ever remove segments once they are added. So the first segment in each media playlist is the first segment that ever existed for this content. To require that they all begin with the same timestamp is not a burdensome requirement. A reasonable person might even assume that they are all 0. (Except that last time I checked, Apple's own tools for HLS start content timestamps arbitrarily at 10 seconds for no obvious reason and with no configuration to change it. :disappointed: )

In other words, you have to have a backup plan of what to do when the stream is not VOD and does not start with the same timestamps, and at that point you might as well apply the same logic to VOD.

We do have a backup plan for what to do when the stream is not VOD. In fact, we already treat VOD and live exactly the same. But to do that, we wind up fetching segments and extracting timestamps for each media playlist. This is very high latency, and it delays startup. Our plan is to start relying on the VOD requirement in the spec to optimize VOD playback and reduce latency. Does that make more sense now?

Lastique commented 5 years ago

Yes, I understand your point, and in fact I agree with you in that following the spec is the right thing. I also understand the will to reduce latency and avoid having to download segments that you actually won't play. I'm just pointing out that the requirement of the initial timestamps being the same for all streams can be rather expensive on the content producer side, if at all feasible in some cases. It has nothing to do with segment deletion but rather with the incentive to avoid content transcoding when possible.

To illustrate this, imagine you want to publish a video as a VOD stream on your web server. Oftentimes, video files contain streams (video or audio) that start at different timestamps. This is normal and even explicitly supported in some formats, like matroska, for example. It is often utilized to achieve better synchronization between the streams. So, suppose the audio stream starts 100 ms after the video stream in the file. Basically, you have 3 options:

Of course, there is a fourth option, which is to not mark the stream as VOD in the first place. And it looks like the most reasonable one compared to the other three, except that the browser may not be working optimally, not knowing that the content is actually static and not live. I'm not sure how significant that difference would be, besides having to download the initial segments of all substreams.

We do have a backup plan for what to do when the stream is not VOD. In fact, we already treat VOD and live exactly the same.

Ok, good to know. My suggestion, then: when you implement that optimization, leave an escape hatch so that users can still use the "live" behavior for VOD streams. Even if not enabled by default, this may be useful for cases like the one I described above.

Anyway, thank you for taking the time to write such detailed comments.

joeyparrish commented 5 years ago

suppose the audio stream starts 100 ms after the video stream in the file

This is a non-issue for the purposes of building a segment index and fetching. We are not responsible for audio/video sync in any way. We're just fetching segments, and fetching the right ones when you seek to a new location. So 100ms is fine, and we'll still fetch the right segments. In fact, DASH says that the accumulated drift between the ideal in the manifest and the actual in the segments can be up to 1/2 segment, so the Player is already built to tolerate as much as that.

So there is no need for an "escape hatch" for VOD, because this has nothing at all to do with audio sync.

Also, this is all getting way off-topic for this issue. If you would like to continue to debate our plans for optimizing VOD, please continue this on #1558.

joeyparrish commented 2 years ago

Not sure that this is relevant now that we have stopped parsing segments for timestamps. Lowering priority and removing from the "HLS Improvements" project.

avelad commented 1 year ago

Is anyone interested in seeing this supported?

shaka-bot commented 1 year ago

Closing due to inactivity. If this is still an issue for you or if you have further questions, the OP can ask shaka-bot to reopen it by including @shaka-bot reopen in a comment.