sile / hls_m3u8

HLS(RFC8216) m3u8 parser/generator
Apache License 2.0
49 stars 16 forks source link

Parse playlists #73

Open sagudev opened 2 years ago

sagudev commented 2 years ago

In the wild m3u(8) files exist also as playlists. The major difference between playlist and HLS is #EXT-X-TARGETDURATION as playlist does not contain them. Here is example of playlist file from Wikipedia:

#EXTM3U

#EXTINF:123, Sample artist - Sample title
C:\Documents and Settings\I\My Music\Sample.mp3

#EXTINF:321,Example Artist - Example title
C:\Documents and Settings\I\My Music\Greatest Hits\Example.ogg

Currently the only change that is needed this to work is adding #EXT-X-TARGETDURATION:500000000 on second line of file. And the file can be parsed and regenerated. So I propose setting #EXT-X-TARGETDURATION as optional, if it exist we can assume it is HLS file, else it is playlist file.

For fine polishing when playlist is determined, all HLS M3U extensions would be disabled, so they cannot be accidentally parsed or generated.

sile commented 2 years ago

Thank you for sharing your idea.

(I'm not sure but) it sounds that it's not so difficult to support wild m3u(8) files. However, as the name suggests, this crate is for HLS m3u8 files. To keep the implementation/design simple and reduce maintenance costs, I think it's better not to support wild m3u(8) in this crate. (Besides, there are crates for m3u and m3u8. I have not used those though)

BTW, @Luro02 is the current leading developer of this crate. So if he likes the direction, I'm willing to adopt that.

Luro02 commented 2 years ago

It has been such a long time since I went through the m3u8 spec. We had a number of issues in the past that were caused by me misinterpreting the spec.

The EXT-X-TARGETDURATION tag is REQUIRED.

https://datatracker.ietf.org/doc/html/rfc8216#section-4.3.3.1

Seems to suggest that this field must always be present?


Trying to support any possible malformed input, seems to be a bad idea, because there are infinitely many ways to create broken files and people should not rely on it.

I think having a way to completely disable verification seems to be a useful feature to have, because sometimes you have malformed files and it is not that relevant if the file is valid.

The current crate design makes it difficult to implement disabling the verification completely.

For example like you said the target duration field is currently not an Option. If one were to make it optional, every user of this crate would have to add unwrap to every use of the field, which looks kinda bad and introduces unnecessary panic handling for something that could never happen.

The solution for this is to take advantage of the way this file-format is designed (streaming line based file format, I would say?) and introduce some kind of iterator based parser.

Here is an example of how this could look like (this is a markdown parser): https://docs.rs/pulldown-cmark

The benefit of this approach is that it does not need allocations and can be implemented in such a way that one can easily disable verifying everything or just a few things.

The foundation for this parsing is already present here: https://github.com/sile/hls_m3u8/blob/master/src/line.rs

Sadly I am currently very busy and have no time to implement this myself.

sile commented 2 years ago

Thanks for your comment!

The solution for this is to ... introduce some kind of iterator based parser.

Sounds interesting. Unfortunately, I also don't have time to try to implement it.