Open XaryLee opened 7 months ago
I get "RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR" when running your fork on an RTX 4070 Ti with CUDA 12.2.0. What could be going on?
@lov3allmy Hi, are you using Docker to run the code?
@XaryLee I haven't changed the code, so the versions are the ones specified in the Dockerfile. I just launched the container by running these commands:
cd ./docker
docker build -t sheetsage .
ROOT=https://raw.githubusercontent.com/chrisdonahue/sheetsage/main; wget $ROOT/prepare.sh && wget $ROOT/sheetsage.sh && chmod +x *.sh && ./prepare.sh
@lov3allmy Could you please provide more information about the error? The surrounding context might help. Also, I haven't tested prepare.sh with the updated Dockerfile, so I'm not sure the script runs cleanly in the new environment. You could try running sheetsage.sh directly.
I think I missed an error that occurred while running the "docker build -t sheetsage ." command. I'll figure it out and try to run the script, then I'll report back here on whether it worked.
Processing now starts, but CUDA reports that it is out of video memory, even though my graphics card has 12 GB. Unfortunately, I'm a backend developer, I don't know ML, and I've only used Python for school work. Are there any tricks that would let me solve this problem?
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
0%| | 0/8 [00:02<?, ?it/s]
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/sheetsage/sheetsage/infer.py", line 837, in <module>
    lead_sheet, segment_beats, segment_beats_times = sheetsage(
  File "/sheetsage/sheetsage/infer.py", line 680, in sheetsage
    chunks_features = _extract_features(
  File "/sheetsage/sheetsage/infer.py", line 367, in _extract_features
    fr, feats = extractor(audio_path, offset=offset, duration=duration)
  File "/sheetsage/sheetsage/representations/jukebox.py", line 234, in __call__
    activations = self.lm_activations(
  File "/sheetsage/sheetsage/representations/jukebox.py", line 202, in lm_activations
    x_cond, y_cond, _ = self.lm.get_cond(None, self.lm.get_y(labels[-1], 0))
  File "/usr/local/lib/python3.10/dist-packages/jukebox/prior/prior.py", line 241, in get_cond
    y_cond, y_pos = self.y_emb(y) if self.y_cond else (None, None)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/jukebox/prior/conditioners.py", line 153, in forward
    pos_emb = self.total_length_emb(total_length) + self.absolute_pos_emb(start, end) + self.relative_pos_emb(start/total_length, end/total_length)
RuntimeError: CUDA out of memory. Tried to allocate 1.17 GiB (GPU 0; 11.71 GiB total capacity; 9.49 GiB already allocated; 47.25 MiB free; 9.61 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
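The numbers in that error message already hint at whether the allocator setting it suggests is likely to help. Here is a rough sketch that parses the message and compares the figures. The helper name `parse_oom` and the regular expressions are illustrative assumptions, not part of PyTorch or sheetsage, and may not match OOM messages from other PyTorch versions.

```python
import re

# Hypothetical helper: pull the memory figures out of a PyTorch CUDA OOM
# message and convert them all to GiB.
def parse_oom(message: str) -> dict:
    to_gib = {"MiB": 1 / 1024, "GiB": 1.0}
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    stats = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            stats[name] = float(match.group(1)) * to_gib[match.group(2)]
    return stats

# The message from the traceback above.
msg = (
    "CUDA out of memory. Tried to allocate 1.17 GiB (GPU 0; 11.71 GiB total "
    "capacity; 9.49 GiB already allocated; 47.25 MiB free; 9.61 GiB reserved "
    "in total by PyTorch)"
)
stats = parse_oom(msg)

# Memory PyTorch has reserved but is not currently using.
slack = stats["reserved"] - stats["allocated"]
# Upper bound on what the request could draw on without freeing anything.
available = slack + stats["free"]

print(f"reserved but unused: {slack:.2f} GiB")
print(f"request: {stats['requested']:.2f} GiB, available at most: {available:.2f} GiB")
```

Reserved memory (9.61 GiB) is only slightly above allocated memory (9.49 GiB), so the "reserved >> allocated" condition from the message does not hold here. That suggests tuning max_split_size_mb through PYTORCH_CUDA_ALLOC_CONF is unlikely to recover the missing 1.17 GiB on its own. The more probable explanation is that this step needs more memory than is left on the card.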
Summary
This pull request updates the project's Dockerfile, associated shell scripts, and some parts of the codebase to ensure compatibility with the latest versions of the libraries used and to add support for recent GPU architectures.
Changes
Testing
Benefits
Notes
Thank you for considering this pull request. I'm looking forward to your feedback and any further suggestions for improvement.