larsimmisch / pyalsaaudio

ALSA wrappers for Python
http://larsimmisch.github.io/pyalsaaudio/

Potential pull request for pre-allocated buffer read? #152

Open ckemere opened 4 months ago

ckemere commented 4 months ago

I use read() in a tight loop where I write microphone data to a file using python-soundfile. I've found it helpful to create a version of the read function that takes a preallocated numpy buffer, so I can write code like this:

import alsaaudio
import numpy as np
import soundfile

# Open the ALSA capture device
adevice = alsaaudio.PCM(alsaaudio.PCM_CAPTURE, alsaaudio.PCM_NORMAL, device=device)

adevice.setchannels(num_channels)
adevice.setrate(fs)
scale_factor = 4  # system fixed number of buffers
if dtype == 'int16':
    adevice.setformat(alsaaudio.PCM_FORMAT_S16_LE)
    scale_factor = scale_factor * 2
else:
    raise ValueError("dtypes other than 'int16' not currently supported.")

adevice.setperiodsize(buffer_size)
# Preallocate the capture buffer once: one row per frame, one column per channel
in_buf = np.zeros((buffer_size * scale_factor, num_channels), dtype=dtype, order='C')

# read_into() is the proposed method; it is not part of the current API
with soundfile.SoundFile(soundfilename, 'w', fs, num_channels, 'PCM_16') as sf:
    while True:
        nsamp = adevice.read_into(in_buf)
        sf.write(in_buf[:nsamp, :])
        if nsamp < buffer_size:
            print('ALSA read buffer underrun.')

The two caveats about the proposed read_into() are (1) a dependency on numpy and (2) I only have working Python 3 code.

larsimmisch commented 4 months ago

Only Python 3 would be fine, but a default compile-time dependency on numpy would be problematic.

Do you see a way to disentangle this? Would it be possible to make this optional at compile-time?

ossilator commented 4 months ago

i wonder whether a hard dep on numpy would be really a problem. what actual uses outside the scientific community (which uses numpy throughout) does pyaa have?

conversely, i somewhat doubt that buffer pre-alloc actually buys you a lot in this case, as even the tightest loop is (sample-)rate-limited.

anyway, one could do it without numpy while being numpy-compatible by using the buffer protocol directly. that would make the usage somewhat uglier, though.
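A minimal sketch of the buffer-protocol idea mentioned above, using only the standard library. `read_into` here is a hypothetical helper, not part of pyalsaaudio, and `io.BytesIO` stands in for the capture device. Any object that exports a writable, C-contiguous buffer (a `bytearray`, an `array.array`, or a numpy array) could be passed the same way.

```python
import io
import struct

def read_into(stream, buf):
    """Fill a preallocated writable buffer from stream; return bytes read.

    Hypothetical helper: `stream` is any object with a readinto() method,
    used here as a stand-in for a capture device.
    """
    view = memoryview(buf).cast('B')  # flat byte view of the buffer, no copy
    return stream.readinto(view)

# Fake "device" that delivers four 16-bit samples in native byte order.
device = io.BytesIO(struct.pack('=4h', 1, -2, 3, -4))

buf = bytearray(16)              # allocated once, reusable for every read
nbytes = read_into(device, buf)  # only the first nbytes bytes are overwritten

# Reinterpret the filled part as native-order 16-bit samples without copying.
samples = memoryview(buf)[:nbytes].cast('h')
```

The caller keeps ownership of the buffer, so nothing is allocated per read; the price is that the caller has to track how much of the buffer is valid after each call.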

ckemere commented 4 months ago

Thanks for commenting back so quickly! I agree that the memory allocation is probably not a huge drain on resources. I think I can also potentially trigger garbage collection on the input buffer. My naive implementation just using read() currently also involves copying the data from the input buffer into a numpy array and reshaping it, along with error handling if it's the wrong size. So I actually end up effectively doubling the memory allocation/garbage collection load. Any thoughts on that?

I looked briefly at the buffer protocol documentation, and haven't fully understood it yet. I'm not really a python expert, and so there are things that I don't fully understand how to do. It seems like I should be able to effectively cast the bytes returned by read() into the memory area of a numpy array, but I can't figure out how to do that...

ckemere commented 4 months ago

Ah, actually, reading more, it seems that np.frombuffer creates a view rather than a copy, so perhaps it's less of an issue.

ossilator commented 4 months ago

yes, my own code uses frombuffer() and reshape()'s it according to sample size and channel count.
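For illustration, a sketch of that frombuffer()-plus-reshape approach. The byte string below stands in for what read() would return; the two-channel, little-endian 16-bit layout and the sample values are assumptions made for the example.

```python
import numpy as np

channels = 2

# Stand-in for the bytes returned by read(): 3 frames of interleaved
# little-endian 16-bit samples (left, right, left, right, ...).
raw = np.array([1, -1, 2, -2, 3, -3], dtype='<i2').tobytes()

# frombuffer() wraps the existing bytes instead of copying them.
flat = np.frombuffer(raw, dtype='<i2')

# One row per frame, one column per channel; reshape returns a view here.
frames = flat.reshape(-1, channels)

left = frames[:, 0]
right = frames[:, 1]
```

Because `raw` is an immutable bytes object, the resulting array is read-only; call `.copy()` on it if the samples need to be modified in place.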

RonaldAJ commented 4 months ago

Yes, it can be done with the struct module:

data = np.array(struct.unpack(conversion_string, rawdata), dtype=self.dtype)

Here data will be the numpy array. The conversion string is:

conversion_string = f"{conversion_dict['endianness']}{noofsamples}{conversion_dict['formatchar']}"

with some example conversion dicts:

alsaaudio.PCM_FORMAT_S8: {'dtype': np.int8, 'endianness': '', 'formatchar': 'b', 'bytewidth': 1},
alsaaudio.PCM_FORMAT_U8: {'dtype': np.uint8, 'endianness': '', 'formatchar': 'B', 'bytewidth': 1},
alsaaudio.PCM_FORMAT_S16_LE: {'dtype': np.int16, 'endianness': '<', 'formatchar': 'h', 'bytewidth': 2},
alsaaudio.PCM_FORMAT_S16_BE: {'dtype': np.int16, 'endianness': '>', 'formatchar': 'h', 'bytewidth': 2},
alsaaudio.PCM_FORMAT_U16_LE: {'dtype': np.uint16, 'endianness': '<', 'formatchar': 'H', 'bytewidth': 2},
alsaaudio.PCM_FORMAT_U16_BE: {'dtype': np.uint16, 'endianness': '>', 'formatchar': 'H', 'bytewidth': 2},
alsaaudio.PCM_FORMAT_S32_LE: {'dtype': np.int32, 'endianness': '<', 'formatchar': 'l', 'bytewidth': 4},
alsaaudio.PCM_FORMAT_S32_BE: {'dtype': np.int32, 'endianness': '>', 'formatchar': 'l', 'bytewidth': 4},
alsaaudio.PCM_FORMAT_U32_LE: {'dtype': np.uint32, 'endianness': '<', 'formatchar': 'L', 'bytewidth': 4},
alsaaudio.PCM_FORMAT_U32_BE: {'dtype': np.uint32, 'endianness': '>', 'formatchar': 'L', 'bytewidth': 4},

and noofsamples is the number of samples in rawdata, which is a bytes string.

Not sure how many copies are produced in the process. But memory usage can be kept reasonably low by keeping the number of samples low.

ckemere commented 4 months ago

np.array() will copy, though, right?

RonaldAJ commented 4 months ago

> np.array() will copy, though, right?

I think so, but you can perhaps use frombuffer there. I would have to look into it to be sure.

I think all these are for a single microphone channel.
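A sketch of how the struct approach might extend to several channels, assuming the device delivers interleaved frames (sample 0 of every channel, then sample 1 of every channel, and so on). `split_channels` is a hypothetical helper, and the packed test data stands in for the bytes returned by read().

```python
import struct

def split_channels(rawdata, channels, formatchar='h', endianness='<'):
    """Unpack interleaved PCM bytes into one tuple of samples per channel."""
    bytewidth = struct.calcsize(endianness + formatchar)
    noofsamples = len(rawdata) // bytewidth  # total count across all channels
    samples = struct.unpack(f"{endianness}{noofsamples}{formatchar}", rawdata)
    # Every channels-th sample, starting at offset ch, belongs to channel ch.
    return [samples[ch::channels] for ch in range(channels)]

# Three stereo frames of little-endian 16-bit samples.
rawdata = struct.pack('<6h', 10, -10, 20, -20, 30, -30)
left, right = split_channels(rawdata, channels=2)
```

Note that struct.unpack() builds a tuple of Python integers, so this route copies every sample at least once; it trades speed for not needing numpy.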