TeamPyOgg / PyOgg

Simple OGG Vorbis, Opus and FLAC bindings for Python
The Unlicense

FLAC: Word size not necessarily correct #79

Open mattgwwalker opened 3 years ago

mattgwwalker commented 3 years ago

In FlacFile and FlacFileStream, bytes_per_sample is fixed at 2 (16 bits). This is not the only possibility. Indeed, FLAC files appear to have possible word sizes of 8, 12, 16, 20, or 24 bits.

Given that FLAC files can have word sizes that are not a whole number of bytes, AudioFile may need to store bits per sample rather than bytes_per_sample.
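As a rough sketch of what storing bits instead of bytes could look like (the `SampleFormat` class and its property names are hypothetical and not part of PyOgg), the byte count can be derived from the bit count when it is needed:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleFormat:
    """Hypothetical holder for a word size expressed in bits."""

    bits_per_sample: int

    @property
    def is_whole_bytes(self) -> bool:
        # True for 8, 16 and 24 bits; False for 12 and 20 bits.
        return self.bits_per_sample % 8 == 0

    @property
    def min_container_bytes(self) -> int:
        # Smallest whole number of bytes that can hold one sample
        # (ceiling division by 8).
        return (self.bits_per_sample + 7) // 8
```

With this shape, a 16-bit file still reports 2 bytes, while a 12-bit file can be told apart from it even though both need a 2-byte container.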

mattgwwalker commented 3 years ago

Audacity can generate only 16- or 24-bit word sizes when saving FLAC files. That is: 2- or 3-byte word sizes.

FFmpeg can generate only 8-, 16-, and 24-bit word sizes when saving FLAC files. That is: 1-, 2-, or 3-byte word sizes.

I could not find a way to download or generate FLAC files with word sizes that are not a whole number of bytes. As a consequence, I cannot create tests for 12-bit or 20-bit FLAC files.

PyOgg would most likely correctly load such files, but without examples that cannot be tested. Accessing the instance's buffer would probably give valid data. However, converting it to a NumPy array would most likely be a very inefficient process, and without a way to obtain examples, the processing pipeline would be untested. I think it would be best to raise an exception in the as_array() method if it detects non-whole-byte word sizes.
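A minimal sketch of such a guard, assuming the word size is available in bits. The exception class and the helper function are hypothetical names for illustration, not existing PyOgg API; as_array() could call something like this before building the array:

```python
class UnsupportedWordSizeError(ValueError):
    """Hypothetical exception for word sizes that are not whole bytes."""


def require_whole_byte_word_size(bits_per_sample: int) -> int:
    """Return the word size in bytes, or raise if it is not a whole number.

    Intended to run at the top of an as_array()-style method, so that
    loading the file still works and only the array conversion is refused.
    """
    if bits_per_sample <= 0:
        raise ValueError("bits_per_sample must be positive")
    whole_bytes, leftover_bits = divmod(bits_per_sample, 8)
    if leftover_bits != 0:
        raise UnsupportedWordSizeError(
            f"cannot convert {bits_per_sample}-bit samples to an array: "
            "word size is not a whole number of bytes"
        )
    return whole_bytes
```

Raising only at conversion time keeps the raw buffer accessible for 12-bit and 20-bit files while avoiding an untested code path.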

Further, NumPy does not have a 24-bit int format. If as_array() is called on such files, the samples will have to be converted to 32-bit ints to be useful in Python. Potential solutions exist on Stack Overflow.
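One possible conversion is sketched below. It assumes the buffer holds packed, interleaved, little-endian signed 24-bit samples, which has not been verified against PyOgg's actual buffer layout; the function name is hypothetical.

```python
import numpy as np


def int24_to_int32(buffer: bytes, channels: int = 1) -> np.ndarray:
    """Unpack packed little-endian signed 24-bit samples into int32.

    Assumes 3 bytes per sample, least-significant byte first, with
    channels interleaved. Returns an array of shape (frames, channels).
    """
    raw = np.frombuffer(buffer, dtype=np.uint8)
    if raw.size % (3 * channels) != 0:
        raise ValueError("buffer length is not a whole number of 24-bit frames")

    # One row per sample, widened so the shifts below cannot overflow.
    triples = raw.reshape(-1, 3).astype(np.int32)

    # Assemble the three bytes into a single unsigned 24-bit value.
    values = triples[:, 0] | (triples[:, 1] << 8) | (triples[:, 2] << 16)

    # Sign-extend: values with bit 23 set represent negative samples.
    values = (values ^ 0x800000) - 0x800000

    return values.reshape(-1, channels)
```

This makes a full copy at four bytes per sample, which is the price of getting a dtype NumPy can work with directly.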

One possibility to generate the desired examples would be to add a FLAC encoder into PyOgg that was able to generate all word sizes including 12-bit and 20-bit FLAC files.

Conclusion: Until the desired examples are obtained, PyOgg should be made to read such FLAC files, but raise an exception when asked to convert the buffer to a NumPy array.