openai / jukebox

Code for the paper "Jukebox: A Generative Model for Music"
https://openai.com/blog/jukebox/

Training Issue - AssertionError: Midpoint 42164118 of item beyond total length 38873664 #59

Open whatdamath opened 4 years ago

whatdamath commented 4 years ago

I'm trying to train a new model using the provided instructions, but no matter how many wav files I put in or what length they are, I always get the error below:

Traceback (most recent call last):
  File "jukebox/train.py", line 336, in <module>
    fire.Fire(run)
  File "/home/anton/anaconda3/envs/jukebox/lib/python3.7/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/home/anton/anaconda3/envs/jukebox/lib/python3.7/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/home/anton/anaconda3/envs/jukebox/lib/python3.7/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "jukebox/train.py", line 294, in run
    data_processor = DataProcessor(hps)
  File "/home/anton/Documents/deep/jukebox/jukebox/data/data_processor.py", line 28, in __init__
    hps.bandwidth = calculate_bandwidth(self.dataset, hps, duration=duration)
  File "/home/anton/Documents/deep/jukebox/jukebox/utils/audio_utils.py", line 28, in calculate_bandwidth
    x = dataset[idx]
  File "/home/anton/Documents/deep/jukebox/jukebox/data/files_dataset.py", line 96, in __getitem__
    return self.get_item(item)
  File "/home/anton/Documents/deep/jukebox/jukebox/data/files_dataset.py", line 89, in get_item
    index, offset = self.get_index_offset(item)
  File "/home/anton/Documents/deep/jukebox/jukebox/data/files_dataset.py", line 60, in get_index_offset
    assert 0 <= midpoint < self.cumsum[-1], f'Midpoint {midpoint} of item beyond total length {self.cumsum[-1]}'
AssertionError: Midpoint 42164118 of item beyond total length 38873664

I'm running this on an 11GB 1080 Ti and had a few CUDA errors before, but with pretrained samples it eventually worked. Here, though, the midpoint seems to be calculated wrongly, and despite reading through the code I can't figure out what's wrong.

apeguero1 commented 4 years ago

Hmm... it seems like calculate_bandwidth is trying to access an item whose index is beyond the length of FilesAudioDataset. Maybe try adding a condition to the while loop to prevent that? Something like "while n_seen < n_samples and idx < len(dataset):" right above line 28 in audio_utils.py.
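The guard suggested above can be illustrated with a minimal, self-contained sketch. Everything here is an assumption made for illustration: ToyDataset is a hypothetical stand-in (not Jukebox's FilesAudioDataset), count_samples is a hypothetical simplification of the loop in calculate_bandwidth, and the stride of 16 is an assumed step size. It only shows why a strided loop without a bounds check can run past the end of a short dataset and trip the midpoint assertion.

```python
class ToyDataset:
    """Hypothetical stand-in for a chunked audio dataset (not the real FilesAudioDataset)."""

    def __init__(self, total_length, sample_length):
        self.total_length = total_length    # total number of audio samples
        self.sample_length = sample_length  # samples per dataset item

    def __len__(self):
        return self.total_length // self.sample_length

    def __getitem__(self, item):
        # Same style of check as the assertion in the traceback.
        midpoint = item * self.sample_length + self.sample_length // 2
        assert 0 <= midpoint < self.total_length, (
            f'Midpoint {midpoint} of item beyond total length {self.total_length}')
        return [0.0] * self.sample_length


def count_samples(dataset, n_samples, stride=16, guard=True):
    """Visit every `stride`-th item until n_samples audio samples have been seen.

    With guard=True the loop also stops once idx runs past the last item.
    """
    n_seen, idx = 0, 0
    while n_seen < n_samples and (not guard or idx < len(dataset)):
        x = dataset[idx]
        n_seen += len(x)
        idx += stride
    return n_seen
```

With a short dataset and a large n_samples, the guarded loop simply stops early and returns fewer samples than requested, whereas the unguarded loop raises the AssertionError. Note that stopping early means any statistic computed in such a loop is based on less audio than requested.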

whatdamath commented 4 years ago

Ah perfect, this worked. I also had to lower sample_length to 131072, though; it looks like 11GB is not enough to run this. Thanks!

apeguero1 commented 4 years ago

Nice! No problem.

perlman-izzy commented 4 years ago

Hi, pretty much a noob here: does anybody know how to fix:

  File "jukebox/sample.py", line 279, in <module>
    fire.Fire(run)
  File "/usr/local/lib/python3.7/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/usr/local/lib/python3.7/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/usr/local/lib/python3.7/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "jukebox/sample.py", line 276, in run
    save_samples(model, device, hps, sample_hps)
  File "jukebox/sample.py", line 258, in save_samples
    assert sample_hps.audio_file is not None
AssertionError