openai / jukebox

Code for the paper "Jukebox: A Generative Model for Music"
https://openai.com/blog/jukebox/

OSError: sndfile library not found #91

Open danicuki opened 4 years ago

danicuki commented 4 years ago

Trying to run in a Docker container:

$ docker run -i -t continuumio/miniconda /bin/bash

After installing, I get this error:

(jukebox) root@182b585df72d:/jukebox# python jukebox/sample.py --model=5b_lyrics --name=sample_5b --levels=3 --sample_length_in_seconds=20 \
> --total_sample_length_in_seconds=180 --sr=44100 --n_samples=6 --hop_fraction=0.5,0.5,0.125
Traceback (most recent call last):
  File "jukebox/sample.py", line 7, in <module>
    from jukebox.utils.audio_utils import save_wav, load_audio
  File "/jukebox/jukebox/utils/audio_utils.py", line 4, in <module>
    import soundfile
  File "/opt/conda/envs/jukebox/lib/python3.7/site-packages/soundfile.py", line 142, in <module>
    raise OSError('sndfile library not found')
OSError: sndfile library not found

johndpope commented 4 years ago

That's the wrong command. It's loading this image: https://hub.docker.com/r/continuumio/miniconda/dockerfile

Try starting with this Dockerfile specific to jukebox: https://github.com/btrude/jukebox-docker

This line should import soundfile: https://github.com/johndpope/jukebox-docker/blob/master/Dockerfile#L198

N.B.: check nvidia-smi on your host for your CUDA version. It should match the FROM statement in the Dockerfile, so you may need to change the tag (for example between cuda:10.2 and cuda:10.0). Side note: NVIDIA also has cudagl Docker images, but they are not applicable here. https://hub.docker.com/r/nvidia/cuda/

FROM nvidia/cuda:10.1-devel-ubuntu18.04
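
To compare the host's CUDA version with the image tag, something like the following could be used. It is only a sketch: the function names are made up for illustration, and it assumes the nvidia-smi header contains a "CUDA Version: X.Y" field, as recent drivers print. That field reports the highest CUDA version the installed driver supports.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_cuda_version(smi_output: str) -> Optional[str]:
    """Extract the 'CUDA Version' field from nvidia-smi output, or None."""
    match = re.search(r"CUDA Version:\s*([0-9]+\.[0-9]+)", smi_output)
    return match.group(1) if match else None


def host_cuda_version() -> Optional[str]:
    """Run nvidia-smi on the host; return None if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return parse_cuda_version(result.stdout)
```

Calling host_cuda_version() on the Docker host (not inside the container) gives a string such as "10.2" to compare against the nvidia/cuda tag in the FROM line, or None when no driver is found.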

danicuki commented 4 years ago

Thanks for the help. Now I've got this error:

Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
root@af4016dfb45c:/opt/jukebox# exit

How do I run my docker host with NVIDIA on a Mac?

btrude commented 4 years ago

You can't; it is currently only supported on Linux.

danicuki commented 4 years ago

Thanks for the feedback!

Is there any way to make this project not locked in to NVIDIA dependencies?

perlman-izzy commented 4 years ago

I'm a noob, but I was able to get rid of that error with conda install -c conda-forge libsndfile. I think that's supposed to be covered by one of the install steps, though, so it could be a red flag that you didn't install the libraries properly. That's what happened to me.
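
A quick way to check whether the environment can see libsndfile at all, before or after installing it, might look like this. It is a minimal sketch using only the standard library; the helper names are hypothetical, and it assumes the soundfile package locates the shared library in a way similar to ctypes.util.find_library.

```python
import ctypes
import ctypes.util
from typing import Optional


def find_sndfile() -> Optional[str]:
    """Return the name or path ctypes resolves for libsndfile, or None."""
    return ctypes.util.find_library("sndfile")


def sndfile_loadable() -> bool:
    """True if libsndfile can be located and actually loaded."""
    name = find_sndfile()
    if name is None:
        return False
    try:
        ctypes.CDLL(name)  # loading can still fail, e.g. on a broken install
    except OSError:
        return False
    return True
```

If sndfile_loadable() returns False inside the jukebox environment, the system library is most likely missing and needs to be installed through the package manager.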

perlman-izzy commented 4 years ago

I can get the program to run for like 2 minutes and then I get the error below. Anybody have any suggestions? Running on a vast.ai server.

py", line 581, in _load

deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)

RuntimeError: unexpected EOF, expected 20664312 more bytes. The file might be corrupted.

terminate called after throwing an instance of 'c10::Error'

what(): owning_ptr == NullType::singleton() || owning_ptr->refcount_.load() > 0 ASSERT FAILED at /opt/conda/conda-bld/pytorch_1556653114079/work/c10/util/intrusive_ptr.h:350, please report a bug to PyTorch. intrusive_ptr: Can only intrusive_ptr::reclaim() owning pointers that were created using intrusive_ptr::release(). (reclaim at /opt/conda/conda-bld/pytorch_1556653114079/work/c10/util/intrusive_ptr.h:350)

frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7fc5a1c8ddc5 in /opt/conda/envs/jukebox/lib/python3.7/site-packages/torch/lib/libc10.so)

frame #1: THStorage_free + 0xca (0x7fc5a29d120a in /opt/conda/envs/jukebox/lib/python3.7/site-packages/torch/lib/libcaffe2.so)

frame #2: <unknown function> + 0x14872d (0x7fc5d0cb272d in /opt/conda/envs/jukebox/lib/python3.7/site-packages/torch/lib/libtorch_python.so)

frame #26: __libc_start_main + 0xf0 (0x7fc5df535830 in /lib/x86_64-linux-gnu/libc.so.6)

Aborted (core dumped)

btrude commented 4 years ago

Most likely one of the audio files you transferred to vast was corrupted or failed to transfer fully, or you started training before all the files in the directory had finished transferring. My vast servers have been egregiously slow this weekend, so I was on with support yesterday and they told me to just spin up a bunch of servers, determine which one doesn't have extremely slow network speeds, and then destroy all the others. (I have just built a PC specifically for ML at home, so my vast days are thankfully behind me now, and I'm definitely rethinking my recommendation given their slowness.)

I'll also note that if you are looking to train with your own music, it is mostly pointless to do anything other than finetune the 1b model with your own genre/artist tag replacing the existing one(s). Even with a local GPU with 24 GB of VRAM I do not have enough memory to finetune or train from scratch at the depth of the 5b models, and training the small priors/vqvae results in significantly worse quality than just finetuning the 1b. I was able to get uncanny results finetuning with 1.5 hours of my own music on 1x Tesla M40 for only 8 hours (but I am currently continuing that training, so I would expect better results with even more training, properly annealing the learning rate, etc.).
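
One way to rule out a truncated or corrupted transfer is to compare file sizes and checksums between the local copy and the copy on the server. The following is a rough sketch, not part of jukebox; the function names are hypothetical, and it assumes the same script can be run on both machines and the two results compared.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large audio files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root):
    """Map each file's path relative to root to its (size, sha256) pair."""
    root = Path(root)
    return {
        str(p.relative_to(root)): (p.stat().st_size, sha256_of(p))
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def differences(local, remote):
    """Relative paths that are missing on one side or differ in content."""
    keys = set(local) | set(remote)
    return sorted(k for k in keys if local.get(k) != remote.get(k))
```

Any path returned by differences() was either not transferred or does not match the original, so it is worth re-uploading before starting training again.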

Jekyll233 commented 1 year ago

Trying to run in a Docker container:

$ docker run -i -t continuumio/miniconda /bin/bash

After installing, I get this error:

(jukebox) root@182b585df72d:/jukebox# python jukebox/sample.py --model=5b_lyrics --name=sample_5b --levels=3 --sample_length_in_seconds=20 \
> --total_sample_length_in_seconds=180 --sr=44100 --n_samples=6 --hop_fraction=0.5,0.5,0.125
Traceback (most recent call last):
  File "jukebox/sample.py", line 7, in <module>
    from jukebox.utils.audio_utils import save_wav, load_audio
  File "/jukebox/jukebox/utils/audio_utils.py", line 4, in <module>
    import soundfile
  File "/opt/conda/envs/jukebox/lib/python3.7/site-packages/soundfile.py", line 142, in <module>
    raise OSError('sndfile library not found')
OSError: sndfile library not found

How was it solved?

btrude commented 1 year ago

apt-get update && apt-get install -y libsndfile1