radek-k / FFMediaToolkit

FFMediaToolkit is a cross-platform video decoder/encoder library for .NET that uses FFmpeg native libraries. It supports video frames extraction, reading stream metadata and creating videos from bitmaps in any format supported by FFmpeg.
MIT License

Improved AV Reading/Writing #57

Closed IsaMorphic closed 3 years ago

IsaMorphic commented 3 years ago

This pull request aims to get FFMediaToolkit a few steps closer to a solution for #33.

A couple of months ago @kskalski started an implementation of audio stream reading (#48). However, they left a few notes about key features missing from that implementation. The most prominent of these were proper handling of interleaved audio/video data and write support for audio streams.

After some brief deliberation and research into the FFmpeg API, I am ready to begin implementing both of these features.

One other feature I will consider adding is support for multiple audio streams. I believe it is pretty limiting to allow only one video stream, one audio stream, or one of each, but never more than one of either kind.

I will post updates to this thread as I make progress.

IsaMorphic commented 3 years ago

Just committed some more work on the implementation. Audio and video streams now share a base "MediaStream" class which allows for less copying and pasting of code between the two.
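
To illustrate the idea of a shared base class, here is a minimal sketch in Python rather than the library's C#. Only the name "MediaStream" comes from the comment above; `try_read_next`, `decode`, `frames_read`, and the two subclasses are hypothetical and are not taken from the pull request.

```python
from abc import ABC, abstractmethod


class MediaStream(ABC):
    """Shared bookkeeping for any stream; subclasses only supply decoding."""

    def __init__(self, packets):
        self._packets = iter(packets)  # stand-in for packets from a demuxer
        self.frames_read = 0

    def try_read_next(self):
        """Return the next decoded item, or None when the stream is finished."""
        packet = next(self._packets, None)
        if packet is None:
            return None
        self.frames_read += 1
        return self.decode(packet)

    @abstractmethod
    def decode(self, packet):
        """Turn one raw packet into the stream-specific output type."""


class VideoStream(MediaStream):
    def decode(self, packet):
        return ("image", packet)  # placeholder for real frame decoding


class AudioStream(MediaStream):
    def decode(self, packet):
        return ("samples", packet)  # placeholder for real sample decoding
```

The read loop and counters live in one place, so the audio and video classes differ only in how they decode a packet.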

I have tested the new changes by creating an application that converts a given "test.mp4" that contains one video and one audio stream into a PNG sequence accompanied by a WAV file with all the audio.

It does so by reading from both data streams simultaneously, essentially reading video frames until some audio packets are buffered, and then consuming the buffered data exhaustively. It repeats this process until both streams have no more packets.
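
The loop described above can be sketched roughly as follows. This is a Python simulation, not the test application's actual C# code: the container is modeled as a plain list of `(kind, payload)` tuples, and `read_interleaved` and `next_from` are hypothetical names chosen for illustration.

```python
from collections import deque


def read_interleaved(packets):
    """Return payloads in the order a consumer following the loop would see them."""
    source = iter(packets)
    queues = {"video": deque(), "audio": deque()}
    exhausted = False

    def next_from(kind):
        nonlocal exhausted
        # Pull packets from the container until one of the wanted kind shows up,
        # parking packets of the other kind in their own queue.
        while not queues[kind] and not exhausted:
            try:
                found_kind, payload = next(source)
            except StopIteration:
                exhausted = True
                break
            queues[found_kind].append(payload)
        return queues[kind].popleft() if queues[kind] else None

    out = []
    while True:
        frame = next_from("video")
        if frame is not None:
            out.append(("video", frame))
        # Consume every audio packet that was buffered while looking for video.
        while queues["audio"]:
            out.append(("audio", queues["audio"].popleft()))
        if frame is None:
            # No video left: read any remaining audio, then stop.
            while (chunk := next_from("audio")) is not None:
                out.append(("audio", chunk))
            break
    return out
```

For example, the input `v0, a0, a1, v1, v2, a2` is delivered as `v0, v1, a0, a1, v2, a2`: audio packets met while searching for the next video frame are held back and then drained in one go.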

Next steps include adapting the MediaFile class to allow for accessing multiple audio and video streams, and of course writing an arbitrary number of audio and video streams to an output file.

IsaMorphic commented 3 years ago

Hello, thanks for checking the code out! I will resolve all of these conversations once I have completed the requested changes. Cheers!

IsaMorphic commented 3 years ago

I have just finished writing a decent multi-stream audio/video writing implementation for this library. There are some goofy quirks that arise when trying to transcode audio streams with different frame sizes, but frankly that is beyond the scope of this library, and it's nothing that a clever user wouldn't be able to abstract out for themselves.

I also realized that audio streams do not always give/receive their data in the form of floats, so I added the proper conversion code so that this ugliness is hidden from the user of the library.
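
As an illustration of the kind of conversion involved, here is a minimal Python sketch for one common case: packed signed 16-bit little-endian PCM (the layout FFmpeg calls `AV_SAMPLE_FMT_S16`) to and from floats in the range [-1.0, 1.0). The function names are hypothetical, the scaling by 32768 is one common convention, and this is not the conversion code from the pull request, which may well rely on FFmpeg's own resampling facilities instead.

```python
import struct


def s16_to_float(raw: bytes) -> list[float]:
    """Decode little-endian signed 16-bit samples into floats in [-1.0, 1.0)."""
    count = len(raw) // 2  # ignore a trailing odd byte, if any
    ints = struct.unpack(f"<{count}h", raw[: count * 2])
    return [s / 32768.0 for s in ints]


def float_to_s16(samples) -> bytes:
    """Encode floats as little-endian signed 16-bit samples, clipping overflow."""
    ints = [max(-32768, min(32767, round(x * 32768.0))) for x in samples]
    return struct.pack(f"<{len(ints)}h", *ints)
```

With helpers like these behind the stream classes, calling code can work with floats only, whatever sample format the codec actually uses.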

All in all, the code now fits my needs and is ready to be merged into the main repository whenever a maintainer sees fit.
Cheers!