Open schittli opened 2 years ago
Ha, good question!
cp is a bytestring, so each index holds just one byte, even though for most audio an individual sample is more than one byte long. In other words, each sample in 16-bit audio is two bytes, and you have to read it from two indexes, like cp[2:4], and then unpack it into a numeric value with the struct module.
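A minimal sketch of that idea, using only the standard library. The sample values and variable names here are made up for illustration and are not taken from pydub; the data is assumed to be signed 16-bit little-endian, as in a typical WAV file.

```python
import struct

# Three signed 16-bit little-endian samples packed into one bytestring.
samples = [1000, -2000, 32767]
cp = struct.pack("<3h", *samples)

sample_width = 2  # bytes per sample for 16-bit audio

# len() counts bytes, so the sample count is the length divided by the width.
byte_count = len(cp)
sample_count = len(cp) // sample_width

# The second sample occupies bytes 2 and 3, i.e. the slice cp[2:4].
(second_sample,) = struct.unpack("<h", cp[2:4])

print(byte_count, sample_count, second_sample)
```

Here the six bytes hold three samples, and the slice cp[2:4] decodes back to the second value.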
Hello @jiaaro
Thank you very much for your answer! It's great when suddenly everything becomes clear.
It is very interesting that you have chosen a bytestring as the basic structure in pydub, because I have noticed that the pydub source code is always very lean and easy to understand where it processes the audio data, once one knows that there is a bytestring 😃
Could it be that the advantage of the bytestring is that the source code is simpler, because Python provides great functions for processing this data structure?
And that the disadvantage is that the CPU has to work more, because it has to use indexed byte accesses all the time instead of being able to work with one stream of ints per audio channel?
I ask because I'm wondering if it's worth writing code to change the structure to e.g. an int array if the audio signal has to go through many complex calculation steps afterward.
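For reference, one way such a conversion could look with the standard library's array module. This is only a sketch under stated assumptions: the helper names bytes_to_samples and samples_to_bytes are hypothetical, the data is assumed to be signed 16-bit little-endian, and whether it actually runs faster than working on the bytestring depends on the workload and would need measuring.

```python
import array
import sys


def bytes_to_samples(raw):
    """Decode little-endian 16-bit PCM bytes into an array of ints (hypothetical helper)."""
    samples = array.array("h")  # "h" is a signed short
    assert samples.itemsize == 2, "expected a 2-byte signed short"
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # the raw data is assumed little-endian
    return samples


def samples_to_bytes(samples):
    """Encode an iterable of ints back into little-endian 16-bit PCM bytes."""
    out = array.array("h", samples)
    if sys.byteorder == "big":
        out.byteswap()
    return out.tobytes()


raw = bytes([0xE8, 0x03, 0x18, 0xFC])  # the samples 1000 and -1000
samples = bytes_to_samples(raw)

# Do the calculation steps on plain ints, then convert back once at the end.
halved = array.array("h", (s // 2 for s in samples))
result = samples_to_bytes(halved)

print(list(samples), list(halved))
```

The idea is to pay the conversion cost once on the way in and once on the way out, and to keep all intermediate steps on the int array.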
Thanks a lot, kind regards, Thomas
Regarding the data type being a bytestring, I am having problems with the following code:
rate, wav_data = wavfile.read(str(wavpath))
audio = AudioSegment(data=wav_data, frame_rate=rate, sample_width=2, channels=1)
(I know there's a from_file method of AudioSegment, but this is part of legacy code, so I load the WAV using wavfile.)
audio.raw_data turns out to be a numpy.ndarray (array of int16). This seems to cause some problems down the line in certain operations.
For example, I create white noise, whose raw_data turns out to be of type bytes:
wnoise = WhiteNoise(sample_rate=audio.frame_rate).to_audio_segment(duration=audio.duration_seconds*1000, volume=-40)
When I try to overlay it on the above audio, I get an error about length differences. I debugged and checked that the lengths are the same right up to the call to audioop.add. But, probably due to the difference in sample types, audioop thinks they are different sizes.
au_wn = audio.overlay(wnoise)
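A possible explanation, not verified against pydub's source: len() of an int16 array counts samples, while len() of a bytes object counts bytes, so the same audio can report two different lengths depending on its container. The sketch below uses the standard library's array.array as a stand-in for the NumPy array (an assumption made to keep the example self-contained), with made-up sample values.

```python
import array
import struct

# Stand-in for the int16 array returned by the WAV reader: 4 samples.
wav_samples = array.array("h", [10, -10, 20, -20])

# Stand-in for the raw_data of a generated segment that is also 4 samples long.
noise_bytes = struct.pack("<4h", 1, 2, 3, 4)

print(len(wav_samples))  # counts samples
print(len(noise_bytes))  # counts bytes

# Converting the array to bytes first makes the two lengths comparable.
wav_bytes = wav_samples.tobytes()
print(len(wav_bytes) == len(noise_bytes))
```

If this is the cause, passing wav_data.tobytes() as data when constructing the AudioSegment may avoid the mismatch.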
Another problem I saw is that seg._data and seg[0:]._data have different lengths in my test, a difference of 4 bytes. Might be something to be aware of.
Hello
Thank you very much for sharing your great work! It surprises me how smartly problems can be solved in Python.
I don't know if it's a bug or a question of understanding, and I ask because it looks strange:
The context
pyaudioop.py calculates the number of samples like this:
The function doesn't document it, but it looks like size defines how many bytes a sample uses (usually 1, 2, or 4, I guess), because pydub passes sample_width to it.
The question...
What confuses me is that Python seems to have a very flexible / smart int and that it doesn't matter if one stores one, two or more bytes in an int.
Therefore I would expect that _sample_count is simply the result of len(cp).
Why is the array length divided by the number of bytes of a sample in _sample_count?
Thanks a lot for any light 😃 , kind regards, Thomas
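To illustrate the question, here is a small sketch of what indexing a bytestring gives you. The sample_count function is a hypothetical stand-in written for this example, not pydub's actual code, and the data is assumed to be signed 16-bit little-endian.

```python
# Two signed 16-bit little-endian samples: 1000 and -2.
cp = (1000).to_bytes(2, "little", signed=True) + (-2).to_bytes(2, "little", signed=True)
size = 2  # bytes per sample


def sample_count(cp, size):
    # Hypothetical stand-in: number of samples = number of bytes / bytes per sample.
    return len(cp) // size


# Each index of a bytes object is a single byte (0..255), not a whole sample.
print(list(cp))

# A sample only appears once its bytes are combined again.
first = int.from_bytes(cp[0:2], "little", signed=True)
print(len(cp), sample_count(cp, size), first)
```

Python's int can hold any value, but the elements of a bytestring are always single bytes, so len(cp) is the byte count rather than the sample count.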