hugofloresgarcia / music-trees

Supplementary code for the experiments described in the 2021 ISMIR submission: Leveraging Hierarchical Structures for Few Shot Musical Instrument Recognition.
MIT License

"Trials did not complete", incomplete_trials #4

Open loretoparisi opened 2 years ago

loretoparisi commented 2 years ago

I was finally able to run trees.generate 🚀

! python -m music_trees.generate \
                --dataset mdb \
                --name mdb-aug \
                --example_length 1.0 \
                --augment true \
                --hop_length 0.5 \
                --sample_rate 16000

  Global seed set to 42
/home/ec2-user/SageMaker/medleydb/medleydb/__init__.py:64: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  INST_TAXONOMY = yaml.load(fhandle)
/home/ec2-user/SageMaker/medleydb/medleydb/__init__.py:72: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  MIXING_COEFFICIENTS = yaml.load(fhandle)
Global seed set to 42
2021-11-22:18:47:14,891 INFO     [seed.py:54] Global seed set to 42
0it [00:00, ?it/s]

I then tried:

%cd /home/ec2-user/SageMaker/music-trees
%set_env CUDA_VISIBLE_DEVICES=0
! python music_trees/search.py --name height-v1

but I get this error:

/home/ec2-user/SageMaker/music-trees
env: CUDA_VISIBLE_DEVICES=0
Global seed set to 42
/home/ec2-user/SageMaker/medleydb/medleydb/__init__.py:64: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  INST_TAXONOMY = yaml.load(fhandle)
/home/ec2-user/SageMaker/medleydb/medleydb/__init__.py:72: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  MIXING_COEFFICIENTS = yaml.load(fhandle)
Global seed set to 42
2021-11-22:18:52:10,882 INFO     [seed.py:54] Global seed set to 42
2021-11-22:18:52:11,34 INFO     [tune.py:748] Initializing Ray automatically.For cluster usage or custom Ray initialization, call `ray.init(...)` before `tune.run`.
2021-11-22 18:52:12,530 WARNING experiment.py:272 -- No name detected on trainable. Using DEFAULT.
2021-11-22 18:52:12,531 INFO registry.py:70 -- Detected unknown callable for trainable. Converting to class.
(pid=24191) Global seed set to 42
(pid=24191) /home/ec2-user/SageMaker/medleydb/medleydb/__init__.py:64: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
(pid=24191)   INST_TAXONOMY = yaml.load(fhandle)
(pid=24191) /home/ec2-user/SageMaker/medleydb/medleydb/__init__.py:72: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
(pid=24191)   MIXING_COEFFICIENTS = yaml.load(fhandle)
(pid=24191) Global seed set to 42
(pid=24191) 2021-11-22:18:52:16,277 INFO     [seed.py:54] Global seed set to 42
== Status ==
Current time: 2021-11-22 18:52:12 (running for 00:00:00.15)
Memory usage on this node: 4.4/15.4 GiB
Using AsyncHyperBand: num_stopped=0
Bracket: Iter 60000.000: None
Resources requested: 1.0/4 CPUs, 1.0/1 GPUs, 0.0/6.93 GiB heap, 0.0/3.47 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /home/ec2-user/SageMaker/music-trees/runs/height-v1-11.22.2021/height-v1
Number of trials: 5/5 (4 PENDING, 1 RUNNING)
+---------------------+----------+---------------------+----------+
| Trial name          | status   | loc                 |   height |
|---------------------+----------+---------------------+----------|
| DEFAULT_4daba_00000 | RUNNING  | xxxxxxxxx:24191 |        4 |
| DEFAULT_4daba_00001 | PENDING  |                     |        2 |
| DEFAULT_4daba_00002 | PENDING  |                     |        0 |
| DEFAULT_4daba_00003 | PENDING  |                     |        3 |
| DEFAULT_4daba_00004 | PENDING  |                     |        1 |
+---------------------+----------+---------------------+----------+
(ImplicitFunc pid=24191) PARSING KNOWN ARGS
(ImplicitFunc pid=24191) root
(ImplicitFunc pid=24191) ├── aerophones
(ImplicitFunc pid=24191) │   ├── free aerophones
(ImplicitFunc pid=24191) │   │   └── interruptive free aerophones
(ImplicitFunc pid=24191) │   │
....
.....
(pid=24191) /home/ec2-user/SageMaker/music-trees/music_trees/utils/data.py:140: YAMLLoadWarning:
(pid=24191) 
(pid=24191) calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
(pid=24191) 
(pid=24191) GPU available: True, used: True
(pid=24191) 2021-11-22:18:52:16,591 INFO     [distributed.py:54] GPU available: True, used: True
(pid=24191) TPU available: None, using: 0 TPU cores
(pid=24191) 2021-11-22:18:52:16,591 INFO     [distributed.py:54] TPU available: None, using: 0 TPU cores
(pid=24191) Using native 16bit precision.
(pid=24191) 2021-11-22:18:52:16,591 INFO     [accelerator_connector.py:331] Using native 16bit precision.
(pid=24191) 2021-11-22:18:52:16,594 INFO     [data.py:125] loading files
0it [00:00, ?it/s]1) 
0it [00:00, ?it/s]1) 
...
...
(pid=24191) 2021-11-22:18:52:16,615 INFO     [data.py:131] done
(pid=24191) 2021-11-22 18:52:16,616 ERROR function_runner.py:268 -- Runner Thread raised error.
(pid=24191) Traceback (most recent call last):
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 262, in run
(pid=24191)     self._entrypoint()
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 331, in entrypoint
(pid=24191)     self._status_reporter.get_checkpoint())
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/util/tracing/tracing_helper.py", line 451, in _resume_span
(pid=24191)     return method(self, *_args, **_kwargs)
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 599, in _trainable_func
(pid=24191)     output = fn()
(pid=24191)   File "music_trees/search.py", line 110, in run_trial
(pid=24191)     return mt.train.train(hparams, use_ray=True)
(pid=24191)   File "/home/ec2-user/SageMaker/music-trees/music_trees/train.py", line 113, in train
(pid=24191)     trainer.fit(task, datamodule=datamodule)
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 471, in fit
(pid=24191)     self.call_setup_hook(model)
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1070, in call_setup_hook
(pid=24191)     self.datamodule.setup(stage_name)
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/core/datamodule.py", line 92, in wrapped_fn
(pid=24191)     return fn(*args, **kwargs)
(pid=24191)   File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 313, in setup
(pid=24191)     **self.tr_kwargs)
(pid=24191)   File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 90, in __init__
(pid=24191)     self.files = self._load_files(name, partition)
(pid=24191)   File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 144, in _load_files
(pid=24191)     assert len(new_records) > 0
(pid=24191) AssertionError
(pid=24191) Exception in thread Thread-2:
(pid=24191) Traceback (most recent call last):
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
(pid=24191)     self.run()
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 281, in run
(pid=24191)     raise e
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 262, in run
(pid=24191)     self._entrypoint()
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 331, in entrypoint
(pid=24191)     self._status_reporter.get_checkpoint())
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/util/tracing/tracing_helper.py", line 451, in _resume_span
(pid=24191)     return method(self, *_args, **_kwargs)
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 599, in _trainable_func
(pid=24191)     output = fn()
(pid=24191)   File "music_trees/search.py", line 110, in run_trial
(pid=24191)     return mt.train.train(hparams, use_ray=True)
(pid=24191)   File "/home/ec2-user/SageMaker/music-trees/music_trees/train.py", line 113, in train
(pid=24191)     trainer.fit(task, datamodule=datamodule)
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 471, in fit
(pid=24191)     self.call_setup_hook(model)
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1070, in call_setup_hook
(pid=24191)     self.datamodule.setup(stage_name)
(ImplicitFunc pid=24191) │       │   └── indirectly struck idiophones_
...
(pid=24191)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/core/datamodule.py", line 92, in wrapped_fn
(pid=24191)     return fn(*args, **kwargs)
(pid=24191)   File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 313, in setup
(pid=24191)     **self.tr_kwargs)
(pid=24191)   File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 90, in __init__
(pid=24191)     self.files = self._load_files(name, partition)
(pid=24191)   File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 144, in _load_files
(pid=24191)     assert len(new_records) > 0
(pid=24191) AssertionError
(pid=24191) 
2021-11-22 18:52:16,636 ERROR trial_runner.py:924 -- Trial DEFAULT_4daba_00000: Error processing event.
Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 890, in _process_trial
    results = self.trial_executor.fetch_result(trial)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 788, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/worker.py", line 1625, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(TuneError): ray::ImplicitFunc.train_buffered() (pid=24191, ip=172.16.71.215, repr=<types.ImplicitFunc object at 0x7f9528381f98>)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/trainable.py", line 224, in train_buffered
    result = self.train()
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/trainable.py", line 283, in train
    result = self.step()
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 381, in step
    self._report_thread_runner_error(block=True)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 529, in _report_thread_runner_error
    ("Trial raised an exception. Traceback:\n{}".format(err_tb_str)
ray.tune.error.TuneError: Trial raised an exception. Traceback:
ray::ImplicitFunc.train_buffered() (pid=24191, ip=172.16.71.215, repr=<types.ImplicitFunc object at 0x7f9528381f98>)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 262, in run
    self._entrypoint()
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 331, in entrypoint
    self._status_reporter.get_checkpoint())
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 599, in _trainable_func
    output = fn()
  File "music_trees/search.py", line 110, in run_trial
    return mt.train.train(hparams, use_ray=True)
  File "/home/ec2-user/SageMaker/music-trees/music_trees/train.py", line 113, in train
    trainer.fit(task, datamodule=datamodule)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 471, in fit
    self.call_setup_hook(model)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1070, in call_setup_hook
    self.datamodule.setup(stage_name)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/core/datamodule.py", line 92, in wrapped_fn
    return fn(*args, **kwargs)
  File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 313, in setup
    **self.tr_kwargs)
  File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 90, in __init__
    self.files = self._load_files(name, partition)
  File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 144, in _load_files
    assert len(new_records) > 0
AssertionError
Result for DEFAULT_4daba_00000:
  date: 2021-11-22_18-52-16
  experiment_id: 61b50089859345f8883ff565d7daa318
  hostname: ip-xxxxxxx
  node_ip: xxxxxxx
  pid: 24191
  timestamp: 1637607136
  trial_id: 4daba_00000

(pid=24192) Global seed set to 42
(pid=24192) /home/ec2-user/SageMaker/medleydb/medleydb/__init__.py:64: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
(pid=24192)   INST_TAXONOMY = yaml.load(fhandle)
(pid=24192) /home/ec2-user/SageMaker/medleydb/medleydb/__init__.py:72: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
(pid=24192)   MIXING_COEFFICIENTS = yaml.load(fhandle)
(pid=24192) Global seed set to 42
(pid=24192) 2021-11-22:18:52:20,187 INFO     [seed.py:54] Global seed set to 42
== Status ==
Current time: 2021-11-22 18:52:17 (running for 00:00:05.19)
Memory usage on this node: 4.5/15.4 GiB
Using AsyncHyperBand: num_stopped=0
Bracket: Iter 60000.000: None
Resources requested: 1.0/4 CPUs, 1.0/1 GPUs, 0.0/6.93 GiB heap, 0.0/3.47 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /home/ec2-user/SageMaker/music-trees/runs/height-v1-11.22.2021/height-v1
Number of trials: 5/5 (1 ERROR, 3 PENDING, 1 RUNNING)
...

The full output is too long to read and debug easily; the last part was:

0it [00:00, ?it/s]1) 
(pid=24421) 2021-11-22:18:52:32,688 INFO     [data.py:131] done
(pid=24421) 2021-11-22 18:52:32,688 ERROR function_runner.py:268 -- Runner Thread raised error.
(pid=24421) Traceback (most recent call last):
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 262, in run
(pid=24421)     self._entrypoint()
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 331, in entrypoint
(pid=24421)     self._status_reporter.get_checkpoint())
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/util/tracing/tracing_helper.py", line 451, in _resume_span
(pid=24421)     return method(self, *_args, **_kwargs)
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 599, in _trainable_func
(pid=24421)     output = fn()
(pid=24421)   File "music_trees/search.py", line 110, in run_trial
(pid=24421)     return mt.train.train(hparams, use_ray=True)
(pid=24421)   File "/home/ec2-user/SageMaker/music-trees/music_trees/train.py", line 113, in train
(pid=24421)     trainer.fit(task, datamodule=datamodule)
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 471, in fit
(pid=24421)     self.call_setup_hook(model)
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1070, in call_setup_hook
(pid=24421)     self.datamodule.setup(stage_name)
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/core/datamodule.py", line 92, in wrapped_fn
(pid=24421)     return fn(*args, **kwargs)
(pid=24421)   File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 313, in setup
(pid=24421)     **self.tr_kwargs)
(pid=24421)   File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 90, in __init__
(pid=24421)     self.files = self._load_files(name, partition)
(pid=24421)   File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 144, in _load_files
(pid=24421)     assert len(new_records) > 0
(pid=24421) AssertionError
(pid=24421) Exception in thread Thread-2:
(pid=24421) Traceback (most recent call last):
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
(pid=24421)     self.run()
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 281, in run
(pid=24421)     raise e
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 262, in run
(pid=24421)     self._entrypoint()
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 331, in entrypoint
(pid=24421)     self._status_reporter.get_checkpoint())
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/util/tracing/tracing_helper.py", line 451, in _resume_span
(pid=24421)     return method(self, *_args, **_kwargs)
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/function_runner.py", line 599, in _trainable_func
(pid=24421)     output = fn()
(pid=24421)   File "music_trees/search.py", line 110, in run_trial
(pid=24421)     return mt.train.train(hparams, use_ray=True)
(pid=24421)   File "/home/ec2-user/SageMaker/music-trees/music_trees/train.py", line 113, in train
(pid=24421)     trainer.fit(task, datamodule=datamodule)
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 471, in fit
(pid=24421)     self.call_setup_hook(model)
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1070, in call_setup_hook
(pid=24421)     self.datamodule.setup(stage_name)
(pid=24421)   File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/pytorch_lightning/core/datamodule.py", line 92, in wrapped_fn
(pid=24421)     return fn(*args, **kwargs)
(pid=24421)   File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 313, in setup
(pid=24421)     **self.tr_kwargs)
(pid=24421)   File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 90, in __init__
(pid=24421)     self.files = self._load_files(name, partition)
(pid=24421)   File "/home/ec2-user/SageMaker/music-trees/music_trees/data.py", line 144, in _load_files
(pid=24421)     assert len(new_records) > 0
(pid=24421) AssertionError
(pid=24421) 
Traceback (most recent call last):
  File "music_trees/search.py", line 161, in <module>
    run_experiment(exp)
  File "music_trees/search.py", line 134, in run_experiment
    progress_reporter=reporter)
  File "/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/ray/tune/tune.py", line 624, in run
    raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [DEFAULT_4daba_00000, DEFAULT_4daba_00001, DEFAULT_4daba_00002, DEFAULT_4daba_00003, DEFAULT_4daba_00004])

Any idea?

Thank you!

loretoparisi commented 2 years ago

[UPDATE] I think I'm facing this ray issue.

loretoparisi commented 2 years ago

Hello @hugofloresgarcia, according to the author of Ray, who kindly helped me debug and read the logs emitted by Ray, the error behind this issue is actually related to this line:

It seems like there is something wrong with the records in files.items() (a few lines above), such that the loop does not add any entries to the new_records list, ultimately raising the AssertionError.
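To illustrate the failure pattern described above, here is a minimal sketch of a filtering loop that can end up empty. It is not the repository's code: `collect_records`, `min_count`, and the shape of `files` are hypothetical stand-ins. The only point is that raising an error with a message, instead of a bare `assert`, would show in the log why nothing was loaded.

```python
from typing import Dict, List


def collect_records(files: Dict[str, List[dict]], min_count: int = 1) -> List[dict]:
    """Hypothetical stand-in for the loop over files.items(); not the repo's code."""
    new_records: List[dict] = []
    for label, records in files.items():
        # Assumed filter: keep a class only if it has enough records.
        if len(records) >= min_count:
            new_records.extend(records)

    # A descriptive error replaces the bare `assert len(new_records) > 0`.
    if not new_records:
        raise ValueError(
            f"no records loaded: {len(files)} classes found, none passed the filter"
        )
    return new_records
```

With an empty `files` mapping this raises a `ValueError` that reports zero classes, which would point at missing input data rather than at the filter itself.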

Thanks!

hugofloresgarcia commented 2 years ago

Hi Loreto,

Apologies for the late response. I'm currently out of town and won't be able to look into the issue until tomorrow (Dec 2). Apologies if this causes any inconvenience!

hugofloresgarcia commented 2 years ago

Giving it a quick look, it looks like the data loading code cannot find the .json files associated with preprocessed data entries. See this line to look at where the data is supposed to be getting loaded (but isn't): https://github.com/hugofloresgarcia/music-trees/blob/e6c547b2d33a901b29917287c368a94d84f42ef0/music_trees/data.py#L126
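One quick way to check this is to count the .json files under the directory where the generated dataset is expected to live. The sketch below is only an assumption about how to inspect the output: `count_json_files` is a hypothetical helper, and the root path you pass in depends on where `music_trees.generate` wrote its data on your machine.

```python
from collections import Counter
from pathlib import Path


def count_json_files(root: str) -> Counter:
    """Count .json files under root, keyed by the name of their parent directory."""
    counts: Counter = Counter()
    # rglob walks the directory tree recursively.
    for path in Path(root).rglob("*.json"):
        counts[path.parent.name] += 1
    return counts
```

Calling `count_json_files(...)` on the dataset directory and getting an empty counter would mean the generate step produced no entries. That would be consistent with the `0it [00:00, ?it/s]` line at the end of the generate log above, although that line alone does not prove it.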

hugofloresgarcia commented 2 years ago

Instead of trying to debug with the quite verbose ray output, try running a single training script first:

python music_trees/train.py --model_name hprotonet --height 4 --d_root 128 --loss_alpha 1 --name "flat (BCE)" --dataset mdb-aug --learning_rate 0.03 --loss_weight_fn cross-entropy

This should make the debug output a lot more readable.