justinsalamon / scaper

A library for soundscape synthesis and augmentation
BSD 3-Clause "New" or "Revised" License

Unable to generate long-duration soundscapes #68

Open ohadbarak opened 4 years ago

ohadbarak commented 4 years ago

Hello, when trying to generate a long-duration soundscape (1 hour), I get the error printed at the end of this message.

It seems that a .wav file is created for each background and foreground event, and each of these files is as large as the final output file (in my case, 330MB for a 1-hour .wav file). As I understand it, this is an artefact of using SoX: Scaper apparently has to pad every foreground event to the duration of the full soundscape before mixing them all together.

Since many events are generated, I eventually run out of disk space (the temporary .wav files go to /tmp by default) and the process crashes. The directory for temporary files can be changed with the TMPDIR environment variable, but that is not a practical solution if no location on the system has enough free space.
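
To make the disk-space problem concrete, here is a rough back-of-the-envelope sketch. It assumes, as described above, that every event produces one temporary file as large as the final soundscape. The function names and the event count of 100 are hypothetical; only the 330MB figure comes from this report.

```python
import shutil
import tempfile


def estimate_temp_bytes(n_events, bytes_per_file):
    """Worst-case temp usage if every event is padded to the full length."""
    return n_events * bytes_per_file


def fits_in_temp_dir(n_events, bytes_per_file, tmp_dir=None):
    """Compare the estimate with the free space in the temp directory."""
    tmp_dir = tmp_dir or tempfile.gettempdir()
    free = shutil.disk_usage(tmp_dir).free
    return estimate_temp_bytes(n_events, bytes_per_file) <= free


# 330 MB per file is the size reported above; 100 events is a made-up count.
needed = estimate_temp_bytes(n_events=100, bytes_per_file=330 * 10**6)
print(f"{needed / 1e9:.1f} GB")  # 33.0 GB
```

Under that assumption the temporary footprint grows linearly with the number of events, so even a moderately dense 1-hour soundscape can need tens of gigabytes.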

There is probably no quick solution other than generating short soundscapes and then concatenating them externally.

Thank you, Ohad


The error reads as follows:

Traceback (most recent call last):
  File "py/gen-monophonic.py", line 115, in main
    (len(sys.argv), sys.argv)
  File "py/gen-monophonic.py", line 109, in main
    txt_path=text_file)
  File "/usr/lib/python2.7/site-packages/scaper/core.py", line 1707, in generate
    disable_sox_warnings=disable_sox_warnings)
  File "/usr/lib/python2.7/site-packages/scaper/core.py", line 1575, in _generate_audio
    tmpfiles_internal[-1].name)
  File "/usr/lib/python2.7/site-packages/scaper/audio.py", line 102, in get_integrated_lufs
    loudness_stats = r128stats(concat_file.name)
  File "/usr/lib/python2.7/site-packages/scaper/audio.py", line 52, in r128stats
    filepath, e.__str__()))
scaper.scaper_exceptions.ScaperError: Unable to obtain LUFS data for /tmp/tmpMYbHQg.wav, error message:
Unable to find LUFS summary, stats string:
ffmpeg version git-2018-11-01-6a034ad Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 4.8.5 (GCC) 20150623 (Red Hat 4.8.5-11)
  configuration: --prefix=/usr/local --extra-cflags=-I/usr/local/include --extra-ldflags=-L/usr/local/lib --bindir=/usr/local/bin --extra-libs=-ldl --enable-gpl --enable-nonfree --enable-libfdk_aac --enable-libmp3lame --enable-libvpx --enable-libfreetype --enable-libspeex
  libavutil      56. 21.100 / 56. 21.100
  libavcodec     58. 34.100 / 58. 34.100
  libavformat    58. 19.102 / 58. 19.102
  libavdevice    58.  4.107 / 58.  4.107
  libavfilter     7. 39.100 /  7. 39.100
  libswscale      5.  2.100 /  5.  2.100
  libswresample   3.  2.100 /  3.  2.100
  libpostproc    55.  2.100 / 55.  2.100
/tmp/tmpMYbHQg.wav: Invalid data found when processing input

justinsalamon commented 4 years ago

Correct. In the future I might move away from SoX and do all the audio editing in memory instead, which should be faster and more space-efficient.

For now you could change the TMPDIR environment variable in your system (which will determine where all the temp audio data gets saved by scaper while processing) to a location with more space (e.g. an external drive) - not ideal, but might work as a temporary solution.
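
If changing the variable system-wide is inconvenient, it can also be set from inside the Python script before any soundscape is generated. The sketch below assumes that scaper creates its temporary files through Python's standard `tempfile` module (the `/tmp/tmpMYbHQg.wav` name in the traceback suggests this, but it is an assumption). The helper name `use_temp_dir` and the example path are hypothetical.

```python
import os
import tempfile


def use_temp_dir(path):
    """Point Python's tempfile module at `path` for the rest of the process."""
    os.makedirs(path, exist_ok=True)
    os.environ["TMPDIR"] = path
    # tempfile caches its directory on first use; clearing the cache makes
    # the next lookup re-read TMPDIR.
    tempfile.tempdir = None
    return tempfile.gettempdir()


# Hypothetical usage, with a made-up mount point for a larger drive:
# print(use_temp_dir("/mnt/bigdisk/scaper_tmp"))
```

Calling the helper before the first call to `generate` matters, because temporary files that are already open stay where they were created.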

Another option would be to split the process into, e.g., 6 soundscapes of 10 minutes each and concatenate the audio and annotations post hoc. This shouldn't be too hard to do: the audio can just be concatenated directly (using SoX or plain Python), and the annotations just need a fixed time offset added to each event (0 min, 10 min, 20 min, ...) before concatenating them, e.g. via pandas. It's a bit of extra work, but if changing the location of your temp dir doesn't work for you for any reason, it's your best bet right now.
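
A minimal sketch of that post-hoc step, using only the Python standard library (pandas would work just as well for the annotation part). It assumes all chunks are PCM .wav files with the same channel count, sample width and sample rate, and that each chunk's annotations are available as `(onset, offset, label)` tuples in seconds. The function names are hypothetical and not part of scaper.

```python
import wave


def concat_wavs(paths, out_path):
    """Append the frames of each PCM WAV file, in order, into out_path."""
    if not paths:
        raise ValueError("need at least one input file")
    fmt = None
    with wave.open(out_path, "wb") as out:
        for path in paths:
            with wave.open(path, "rb") as chunk:
                this_fmt = (chunk.getnchannels(), chunk.getsampwidth(),
                            chunk.getframerate())
                if fmt is None:
                    # The first file decides the output format.
                    fmt = this_fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif this_fmt != fmt:
                    raise ValueError("format mismatch in " + path)
                out.writeframes(chunk.readframes(chunk.getnframes()))


def concat_annotations(chunks, chunk_duration):
    """Shift chunk i by i * chunk_duration seconds and merge the events."""
    merged = []
    for i, events in enumerate(chunks):
        shift = i * chunk_duration
        merged.extend((onset + shift, offset + shift, label)
                      for onset, offset, label in events)
    return merged


# Two 10-minute chunks: events in the second one move forward by 600 s.
chunks = [[(1.0, 2.0, "dog")], [(0.5, 3.0, "car")]]
print(concat_annotations(chunks, 600.0))
# [(1.0, 2.0, 'dog'), (600.5, 603.0, 'car')]
```

One design consequence worth noting: events cannot span a chunk boundary with this approach, so very long background or foreground events would have to be handled separately.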