beetbox / audioread

cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python
MIT License

added format check for method read_data in rawread #70

Open Charliechen1 opened 5 years ago

Charliechen1 commented 5 years ago

Python's audioop.lin2lin raises an error if the length of the data is not divisible by old_width, and it's not convenient to check the length of each audio file up front, especially when large batches of audio files are used in machine learning tasks. Therefore, I have added a patch that handles input audio data whose length is not a multiple of the sample width. Thank you for considering my suggestion, and thanks for this project. :+1:

Charliechen1 commented 5 years ago

For your reference: I printed the data and got: b'\xff\xff\xff\xff\xfe\xff\xfc\xff\xfc\xff\xfc' I figured out that it's due to a broken download. Would it therefore be better to raise a warning in this case?

sampsyo commented 5 years ago

Hmm; perhaps! But on the other hand, another reasonable (silent) fix might be to round down instead of up—that is, to drop the last (partial) sample if it exists. Would that make sense to you?
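
A minimal sketch of the round-down idea, for illustration only. `drop_partial_sample` is a hypothetical helper, not part of audioread, and the 2-byte sample width is an assumption about the buffer quoted above (the thread does not state the width):

```python
def drop_partial_sample(data: bytes, width: int) -> bytes:
    """Trim trailing bytes that do not form a complete sample of `width` bytes.

    Hypothetical helper for illustration; not part of audioread.
    """
    if width <= 0:
        raise ValueError("width must be a positive number of bytes")
    remainder = len(data) % width
    # Slice only when needed: data[:-0] would return an empty bytes object.
    return data[:len(data) - remainder] if remainder else data


# The 11-byte buffer quoted earlier in this thread, treated here as
# 16-bit (2-byte) samples. The width is an assumption.
broken = b'\xff\xff\xff\xff\xfe\xff\xfc\xff\xfc\xff\xfc'
trimmed = drop_partial_sample(broken, 2)
print(len(broken), len(trimmed))  # prints: 11 10
```

The trimmed buffer has a length divisible by the sample width, so it should be safe to hand to a width conversion such as audioop.lin2lin without triggering the length complaint.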

Charliechen1 commented 5 years ago

It should work.

sampsyo commented 5 years ago

OK, great! Want to give it a try and see if it works on the file you have?

Charliechen1 commented 5 years ago

Sure~