SonyCSLParis / pesto

Self-supervised learning for fast pitch estimation
GNU Lesser General Public License v3.0

split inputs into chunks to avoid CUDA OOM #9

Closed: aRI0U closed this issue 9 months ago

aRI0U commented 9 months ago

All CQT frames are currently processed in parallel. With large batches or long audio files, this can cause a CUDA out-of-memory (OOM) error. In that case, the inputs should be split into chunks automatically, keeping as much parallelism as possible while preventing OOM.

On a 1080, the limit is about 9 minutes of 16 kHz audio processed in parallel.
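
To make the proposal concrete, here is a minimal sketch of the chunking idea. It is not the actual pesto implementation: `iter_chunks`, `process_in_chunks`, `model`, and `chunk_size` are hypothetical names chosen for illustration, and the frames are plain Python sequences so the example runs without a GPU.

```python
from typing import Callable, Iterator, List, Sequence


def iter_chunks(frames: Sequence, chunk_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices holding at most `chunk_size` frames each."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(frames), chunk_size):
        yield frames[start:start + chunk_size]


def process_in_chunks(
    frames: Sequence,
    model: Callable[[Sequence], Sequence],
    chunk_size: int,
) -> List:
    """Apply `model` chunk by chunk and collect the outputs in order.

    `model` is a hypothetical callable that maps a batch of frames to one
    output per frame. Only one chunk is passed to it at a time, so peak
    memory scales with `chunk_size` rather than with the total number of
    frames.
    """
    outputs: List = []
    for chunk in iter_chunks(frames, chunk_size):
        outputs.extend(model(chunk))
    return outputs
```

With PyTorch tensors, the same loop would slice along the frame dimension and join the per-chunk outputs with `torch.cat`. The chunk size would have to be tuned to the available GPU memory, for instance staying below the roughly 9 minutes of 16 kHz audio reported above for a 1080.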