google / ExoPlayer

This project is deprecated and stale. The latest ExoPlayer code is available in https://github.com/androidx/media
https://developer.android.com/media/media3/exoplayer
Apache License 2.0

How do I implement my own accurate SeekMap/Seeker for Mp3Extractor? #7693

Closed ryanheise closed 4 years ago

ryanheise commented 4 years ago

Searched documentation and issues

In another issue (https://github.com/google/ExoPlayer/issues/328#issuecomment-168532148) @ojw28 described one possible solution for accurate MP3 seeking:

Currently, ExoPlayer implements seeking with XING headers by interpolating between the nearest two integer-percentage positions, which is non-exact if the stream bitrate varies significantly between those points. An alternative might be to request data from the integer-percentage position that precedes the requested position and then drop media up to the exact position requested, however that would be inefficient for very long pieces of media where a single percentage of the stream might be quite large, so it's unclear whether that's actually a good idea. We could consider doing that, but it's a separate discussion. Please open a new issue if you think it's a discussion that's worth having.
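For readers unfamiliar with the interpolation described in the quote, here is a minimal sketch of the idea. It is not ExoPlayer's actual code: the class and method names are hypothetical, and it assumes the usual Xing convention of a 100-entry table where entry i holds the byte position at i percent of the duration, scaled to 0..255.

```java
final class XingTocSeek {
  /**
   * Maps a seek time to an approximate byte offset by linear interpolation
   * between the two nearest integer-percentage TOC entries.
   *
   * @param toc 100 entries, each 0..255 (assumed Xing-style scaling)
   * @param timeUs requested seek time in microseconds
   * @param durationUs stream duration in microseconds
   * @param dataSize size of the audio data in bytes
   */
  static long positionForTime(int[] toc, long timeUs, long durationUs, long dataSize) {
    double percent = timeUs * 100.0 / durationUs;
    double scaled;
    if (percent <= 0) {
      scaled = 0;
    } else if (percent >= 100) {
      scaled = 256;
    } else {
      int prev = (int) percent;
      double prevScaled = toc[prev];
      // Past the last entry, interpolate towards the end of the data.
      double nextScaled = prev == 99 ? 256 : toc[prev + 1];
      scaled = prevScaled + (percent - prev) * (nextScaled - prevScaled);
    }
    return Math.round(scaled / 256 * dataSize);
  }
}
```

With only 100 entries, any bitrate variation inside a single percent of the stream is invisible to this lookup, which is where the inaccuracy comes from.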

Question

@ojw28's solution above looks interesting, but for my use case I will be dealing with arbitrary MP3 podcasts which are typically longish, so sample-accurate seeking this way could end up being very slow. Transcoding on device is also too slow (particularly the encoding part).

Instead, I would like to create and then cache my own seek map by scanning the entire file. It would at least be faster than transcoding. May I ask for some pointers on how I would hook this into ExoPlayer? Thanks.
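To make "scanning the entire file" concrete, a rough sketch of such a scan follows. It is an illustration under stated assumptions, not ExoPlayer code: the names are hypothetical, it only recognises MPEG-1 Layer III headers, it assumes the sample rate does not change mid-stream, and it resynchronises byte by byte on anything it does not recognise (so ID3 tags or other data could in principle produce false syncs).

```java
import java.util.ArrayList;
import java.util.List;

final class Mp3FrameIndexer {
  // MPEG-1 Layer III tables; index 0 ("free" bitrate) is not supported here.
  private static final int[] BITRATES_KBPS =
      {0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320};
  private static final int[] SAMPLE_RATES_HZ = {44100, 48000, 32000};
  private static final int SAMPLES_PER_FRAME = 1152;

  /** Returns the frame length in bytes, or -1 if no supported header starts at offset. */
  static int frameLength(byte[] data, int offset) {
    if (offset + 4 > data.length) {
      return -1;
    }
    int b0 = data[offset] & 0xFF;
    int b1 = data[offset + 1] & 0xFF;
    int b2 = data[offset + 2] & 0xFF;
    // Sync bits, version = MPEG-1, layer = III; the protection bit may be either value.
    if (b0 != 0xFF || (b1 & 0xFE) != 0xFA) {
      return -1;
    }
    int bitrateIndex = b2 >> 4;
    int sampleRateIndex = (b2 >> 2) & 0x3;
    if (bitrateIndex == 0 || bitrateIndex == 15 || sampleRateIndex == 3) {
      return -1;
    }
    int padding = (b2 >> 1) & 0x1;
    return 144000 * BITRATES_KBPS[bitrateIndex] / SAMPLE_RATES_HZ[sampleRateIndex] + padding;
  }

  /** Returns one {timeUs, bytePosition} pair per frame found in data. */
  static List<long[]> buildIndex(byte[] data) {
    List<long[]> index = new ArrayList<>();
    long samples = 0;
    int sampleRate = 0;
    int pos = 0;
    while (pos + 4 <= data.length) {
      int length = frameLength(data, pos);
      if (length < 0) {
        pos++; // Not a frame header: resynchronise one byte further on.
        continue;
      }
      if (sampleRate == 0) {
        // Assumption: the first frame's sample rate holds for the whole stream.
        sampleRate = SAMPLE_RATES_HZ[((data[pos + 2] & 0xFF) >> 2) & 0x3];
      }
      index.add(new long[] {samples * 1_000_000L / sampleRate, pos});
      samples += SAMPLES_PER_FRAME;
      pos += length;
    }
    return index;
  }
}
```

A real implementation would read from a stream rather than a byte array and would probably keep only every Nth frame to bound the index size.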

tonihei commented 4 years ago

We recently added a flag FLAG_ENABLE_INDEX_SEEKING to the Mp3Extractor (on the dev-v2 branch). This already builds the accurate seek map while the stream is read, but we wouldn't recommend it for longer files. @kim-vde may be able to provide further details or guidance.

ryanheise commented 4 years ago

Awesome! This is such incredibly lucky timing. I will indeed be dealing with long files, but that's fine. For my use case, it is much more acceptable to do this once in pre-processing (while the podcast is being downloaded) and then cache the result, rather than having a long delay on every seek during playback.

So I guess that first I'll need to hack IndexSeeker or Mp3Extractor to write this out to a file, and provide a way to load it back in from a file. Then second, I'll need to figure out how to get the stream to be fully read without actually playing the audio (I'm looking at Mp3ExtractorTest for ideas).
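The write-out and load-back part could look roughly like the sketch below. It is independent of ExoPlayer: the class name, the on-disk layout (an entry count followed by time/position pairs) and the method names are all assumptions made for illustration, and wiring the loaded data into IndexSeeker or Mp3Extractor is left open.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class SeekIndexCache {
  private final long[] timesUs;   // Sorted ascending.
  private final long[] positions; // Byte offset for each time.

  SeekIndexCache(long[] timesUs, long[] positions) {
    if (timesUs.length != positions.length) {
      throw new IllegalArgumentException("timesUs and positions must have equal length");
    }
    this.timesUs = timesUs.clone();
    this.positions = positions.clone();
  }

  /** Writes the entry count followed by (timeUs, position) pairs. */
  void writeTo(OutputStream out) {
    try {
      DataOutputStream data = new DataOutputStream(out);
      data.writeInt(timesUs.length);
      for (int i = 0; i < timesUs.length; i++) {
        data.writeLong(timesUs[i]);
        data.writeLong(positions[i]);
      }
      data.flush();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Reads back an index previously written by writeTo. */
  static SeekIndexCache readFrom(InputStream in) {
    try {
      DataInputStream data = new DataInputStream(in);
      int count = data.readInt();
      long[] timesUs = new long[count];
      long[] positions = new long[count];
      for (int i = 0; i < count; i++) {
        timesUs[i] = data.readLong();
        positions[i] = data.readLong();
      }
      return new SeekIndexCache(timesUs, positions);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Returns the byte position of the last entry at or before timeUs. */
  long positionForTimeUs(long timeUs) {
    if (timesUs.length == 0) {
      return 0;
    }
    int i = Arrays.binarySearch(timesUs, timeUs);
    if (i < 0) {
      i = -i - 2; // Insertion point minus one: the preceding entry.
    }
    return positions[Math.max(i, 0)];
  }
}
```

In practice the streams would wrap a cache file stored next to the downloaded podcast, for example via FileOutputStream and FileInputStream.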

In #328 it was mentioned that an MP4 container can store an exact index, so I was thinking about maybe cheaply wrapping the downloaded MP3 in an MP4 container along with this generated seek map, so that I wouldn't need to add any special code to ExoPlayer to read back in this seek map. But I suspect that the MP4 container doesn't actually store this index and instead it's the AAC audio format that stores it, in which case this idea wouldn't work.

kim-vde commented 4 years ago

Is the bitrate constant? If yes, the problem becomes much simpler and you can use the FLAG_ENABLE_CONSTANT_BITRATE_SEEKING flag.

Then second, I'll need to figure out how to get the stream to be fully read without actually playing the audio (I'm looking at Mp3ExtractorTest for ideas).

To read the stream without playing it, you can take inspiration from DownloadHelper#MediaPreparer, which reads the stream from a MediaSource until the media is prepared. A simpler solution would be to use an actual player but this is more costly.

In #328 it was mentioned that an MP4 container can store an exact index, so I was thinking about maybe cheaply wrapping the downloaded MP3 in an MP4 container along with this generated seek map, so that I wouldn't need to add any special code to ExoPlayer to read back in this seek map. But I suspect that the MP4 container doesn't actually store this index and instead it's the AAC audio format that stores it, in which case this idea wouldn't work.

You could store the seek map in an MP3-supported metadata format (MLLT frame, VBRI header or Xing header). Transcoding your file to an MP4 would also work, as the stbl atom contains a time-to-sample mapping, but it is not trivial.

ryanheise commented 4 years ago

Thanks for the helpful answers!

Is the bitrate constant? If yes, the problem becomes much simpler and you can use the FLAG_ENABLE_CONSTANT_BITRATE_SEEKING flag.

The user will choose an arbitrary podcast outside of my control, so I'd need it to be detected.
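One possible way to detect this, sketched here as a heuristic rather than anything ExoPlayer provides: given the byte positions of consecutive frames from a full scan, treat the file as constant bitrate when all frame sizes agree to within one byte (the padding byte of Layer III frames). The class and method names are hypothetical.

```java
final class CbrHeuristic {
  /**
   * Returns true if consecutive frame sizes differ by at most one byte.
   * framePositions holds the byte offset of each frame, in file order.
   * Returns false when there are fewer than three positions, since two
   * frame sizes are needed for any comparison.
   */
  static boolean looksConstantBitrate(long[] framePositions) {
    if (framePositions.length < 3) {
      return false;
    }
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;
    for (int i = 1; i < framePositions.length; i++) {
      long size = framePositions[i] - framePositions[i - 1];
      min = Math.min(min, size);
      max = Math.max(max, size);
    }
    // Padding adds at most one byte per frame in Layer III.
    return max - min <= 1;
  }
}
```

Checking only the first few hundred frames would be cheaper but could misclassify a file whose bitrate changes later on.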

Then second, I'll need to figure out how to get the stream to be fully read without actually playing the audio (I'm looking at Mp3ExtractorTest for ideas).

To read the stream without playing it, you can take inspiration from DownloadHelper#MediaPreparer, which reads the stream from a MediaSource until the media is prepared. A simpler solution would be to use an actual player but this is more costly.

Thanks, I'll look into this. If I am only using this strategy for MP3 files, would there be any issue with doing something like Mp3ExtractorTest.mp3SampleWithIndexSeeker?

In #328 it was mentioned that an MP4 container can store an exact index, so I was thinking about maybe cheaply wrapping the downloaded MP3 in an MP4 container along with this generated seek map, so that I wouldn't need to add any special code to ExoPlayer to read back in this seek map. But I suspect that the MP4 container doesn't actually store this index and instead it's the AAC audio format that stores it, in which case this idea wouldn't work.

You could store the seek map in an MP3-supported metadata format (MLLT frame, VBRI header or Xing header).

#328 mentions that the Xing header can only store 100 seek map entries, which would not be accurate enough on long audio files. The others weren't mentioned, but after looking into MLLT it seems it doesn't have this limitation. Interesting, that may work!

Transcoding your file to an MP4 would also work, as the stbl atom contains a time-to-sample mapping, but it is not trivial.

I agree, I have been using this approach until now, and it does work, but it's far too slow.

kim-vde commented 4 years ago

If I am only using this strategy for MP3 files, would there be any issue with doing something like Mp3ExtractorTest.mp3SampleWithIndexSeeker

This should also work. I would have a look at ProgressiveMediaPeriod#load for lower-level logic similar to what the tests use.