everydaydigital opened this issue 3 years ago
Hello there, did you manage to get it running? If yes, how? I am getting the same issue.
Hey, I haven't been able to progress past this error yet - but will make sure to update here if I do.
Good to hear that it's not just me with this issue though, so there's still a chance that someone with a bit more experience will be able to help us out.
Please use Colab. RTX 3070, 3080 and 3090 cards aren't designed to run this; on Windows especially this is an issue.
Hey Randy, I have been using Colab to train VQ-VAE but I'm still getting the same error. Any idea what the issue might be?
Please use Colab. RTX 3070, 3080 and 3090 cards aren't designed to run this; on Windows especially this is an issue.
Thanks for the reply. Are you able to explain in more detail how this specific issue would be caused by an RTX 30xx series GPU?
This error appears to show that the master port is being passed an invalid value instead of a port number. Do you think that maybe this could be coming from a change within the GPU architecture?
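For reference, the traceback further down ends in master_port = int(master_port) inside torch's env:// rendezvous, with the literal '\\' (a lone backslash) as the offending value, so it is an invalid string rather than a null. A minimal sketch of that failure mode, using only the Python standard library and assuming MASTER_PORT somehow ends up holding a stray backslash:

# Sketch only: torch's env:// rendezvous reads MASTER_PORT from the environment
# and calls int() on it, so any non-numeric value reproduces this exact error.
import os

os.environ["MASTER_PORT"] = "\\"   # hypothetical: a lone backslash instead of a port number
master_port = os.environ["MASTER_PORT"]
port = int(master_port)            # ValueError: invalid literal for int() with base 10: '\\'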
I've managed to get this running on another Windows PC with an RTX 2070 with 6GB RAM, and while it doesn't make it all the way to the final upsampled stage (it displays a clear error that it's run out of RAM), it still manages to run through a few files and generate some new audio output along the way.
I can accept that more RAM is required for a successful run, but for my particular usage and interest it would be great to lock down the source of this issue.
If every GitHub issue could be solved with 'just use Colab', open source projects like this one would never progress very far.
By sharing our bug reports with the community, we all get a better all-round understanding, and ideally the issue can be resolved together!
Cheers
Hi there, I'm trying to get OpenAI Jukebox running on Windows 10 with an NVIDIA RTX 3070 8GB GPU (after finally getting it installed/compiled by placing wget into my system32 folder, using the nightly build of torch, and using Git Bash to install TensorboardX).
It has yet to run successfully though, and now there's a weird error when running the script:
Traceback (most recent call last):
  File "jukebox/sample.py", line 279, in <module>
    fire.Fire(run)
  File "D:\OpenAIJukebox\envs\lib\site-packages\fire\core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "D:\OpenAIJukebox\envs\lib\site-packages\fire\core.py", line 366, in _Fire
    component, remaining_args)
  File "D:\OpenAIJukebox\envs\lib\site-packages\fire\core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "jukebox/sample.py", line 271, in run
    rank, local_rank, device = setup_dist_from_mpi(port=port)
  File "d:\openaijukebox\jukebox\jukebox\utils\dist_utils.py", line 46, in setup_dist_from_mpi
    return _setup_dist_from_mpi(master_addr, backend, port, n_attempts, verbose)
  File "d:\openaijukebox\jukebox\jukebox\utils\dist_utils.py", line 86, in _setup_dist_from_mpi
    dist.init_process_group(backend=backend, init_method=f"env://")
  File "d:\openaijukebox\jukebox\jukebox\utils\dist_adapter.py", line 61, in init_process_group
    return _init_process_group(backend, init_method)
  File "d:\openaijukebox\jukebox\jukebox\utils\dist_adapter.py", line 86, in _init_process_group
    return dist.init_process_group(backend, init_method)
  File "D:\OpenAIJukebox\envs\lib\site-packages\torch\distributed\distributed_c10d.py", line 500, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "D:\OpenAIJukebox\envs\lib\site-packages\torch\distributed\rendezvous.py", line 186, in _env_rendezvous_handler
    master_port = int(master_port)
ValueError: invalid literal for int() with base 10: '\\'
My clean install process is:
Open Miniconda terminal
conda config --add pkgs_dirs D:.pkgs
D:
cd D:\OpenAIJukebox
conda create --prefix ./envs python=3.7.5 -y
conda activate ./envs
conda install mpi4py=3.0.3 -y
conda update -n base -c defaults conda -y
pip3 install numpy
pip3 install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html
git clone https://github.com/openai/jukebox.git
cd jukebox
pip3 install -r requirements.txt
pip3 install -e .
conda install av=7.0.01 -c conda-forge -y
Close Miniconda
Open a new Git Bash terminal from D:\OpenAIJukebox
conda activate ./envs
cd ./jukebox
pip install ./tensorboardX
Close Git Bash terminal
Open Miniconda terminal
D:
cd D:\OpenAIJukebox\jukebox
conda activate ./envs
python jukebox/sample.py --model=1b_lyrics --name=sample_5b_prompted --levels=3 --mode=primed \
--audio_file=D:\OpenAIJukebox\jukebox\prompts\home.wav,D:\OpenAIJukebox\jukebox\prompts\Parallax.wav,D:\OpenAIJukebox\jukebox\prompts\Skankin.wav,D:\OpenAIJukebox\jukebox\prompts\Vitals.wav --prompt_length_in_seconds=6 \
--sample_length_in_seconds=10 --total_sample_length_in_seconds=60 --sr=44100 --n_samples=1 --hop_fraction=0.5,0.5,0.125
RTX 30XX cards use the Ampere architecture, which requires CUDA 11.x, so I've had to modify some of the original install instructions to build with the latest version of torch and the cudatoolkit, and I'm wondering if that might be part of the issue? I have read other reports of people getting this working on 3090s though, so I'm not really sure where I might have gone wrong.
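One generic way to sanity-check that side of things (standard torch introspection only, nothing Jukebox-specific) is to confirm the nightly build actually reports a CUDA 11.x runtime and can see the card:

# Quick check that the installed torch build supports the Ampere GPU.
import torch

print(torch.__version__)                  # should be a cu110 nightly build
print(torch.version.cuda)                 # expect 11.x for RTX 30xx support
print(torch.cuda.is_available())          # True if the driver/toolkit pairing works
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))          # e.g. the RTX 3070
    print(torch.cuda.get_device_capability(0))    # (8, 6) on Ampere consumer cards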
Any advice appreciated - I'd really love to be able to get this running on my system or at least understand where the issue is coming from. Cheers,
It works on Windows 8.1, right?
I have the same issue. I am using an RTX 3060 12GB on Ubuntu 22.04. Any update please? :(((
Have you found the solution?
For me the issue was that I copied the recommended command line arguments from the README, which contain newlines escaped with backslashes. Removing these when using an actual command line fixed the issue.
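If that's the cause here as well, the sampling command from earlier in this thread would need to be pasted as a single line with the README's continuation backslashes removed, something like:

python jukebox/sample.py --model=1b_lyrics --name=sample_5b_prompted --levels=3 --mode=primed --audio_file=D:\OpenAIJukebox\jukebox\prompts\home.wav,D:\OpenAIJukebox\jukebox\prompts\Parallax.wav,D:\OpenAIJukebox\jukebox\prompts\Skankin.wav,D:\OpenAIJukebox\jukebox\prompts\Vitals.wav --prompt_length_in_seconds=6 --sample_length_in_seconds=10 --total_sample_length_in_seconds=60 --sr=44100 --n_samples=1 --hop_fraction=0.5,0.5,0.125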