archinetai / audio-diffusion-pytorch-trainer

Trainer for audio-diffusion-pytorch
MIT License
128 stars 22 forks

Error executing job with overrides: ['exp=base_test'] #11

Open manman25 opened 2 years ago

manman25 commented 2 years ago

Hello! I tried to run "python train.py exp=base_test" in Eshell, but it failed with the errors shown below:

image

jbmaxwell commented 1 year ago

Yeah, I'm having the same issue. I can see that datamodule isn't in config.yaml, so I guess it's just broken... Not a huge deal to write a new training script.

manman25 commented 1 year ago

> Yeah, I'm having the same issue. I can see that datamodule isn't in config.yaml, so I guess it's just broken... Not a huge deal to write a new training script.

Thanks. So you think I need to write a new training script that includes the datamodule?

gauravkuppa commented 1 year ago

I am facing this issue right now too! Can you please provide some pointers on what needs to be changed? The config yaml file or train.py?

flavioschneider commented 1 year ago

I updated base_test.yaml; it should work now. Make sure to have the latest versions installed (pip install audio-diffusion-pytorch audio-encoders-pytorch -U).

jbmaxwell commented 1 year ago

Okay, running now. Thanks!

I'm curious: since I'm not familiar with Hydra or OmegaConf (yet), I'm not really clear what I'm training on when running base_test.yaml. Is this training on my files at DIR_DATA in my .env, or is it something from YouTube? (I'm just looking at the contents of base_test.datamodule.)

Any clarification appreciated.

flavioschneider commented 1 year ago

It downloads a single song from YouTube as a test; you can see the URL in the base_test.yaml config.

jbmaxwell commented 1 year ago

Okay, I assumed it must download something, but does it only use that song during this test training? Mostly I'm confused about the step of adding my data path to .env if it's just going to train on one YouTube song.

What would I change to use my own data (i.e., the path I specified in .env)? I see that base_small.yaml references WAVDataset... is that what I'd use to point to DIR_DATA?

UPDATE: I've been looking at train.py, but I see that I can probably get a bit further with the notebook...

manman25 commented 1 year ago

> I updated base_test.yaml; it should work now. Make sure to have the latest versions installed (pip install audio-diffusion-pytorch audio-encoders-pytorch -U).

image

Hello. I have tried to run this script again, but our server is in China, where we cannot browse or download anything from YouTube. (In the picture you can see that the shell reports a "DownloadError".) How can I change the data path to another URL to download the song from (e.g., the path I specified in .env or in base_test.yaml)?

mbuckler commented 1 year ago

> I updated base_test.yaml; it should work now. Make sure to have the latest versions installed (pip install audio-diffusion-pytorch audio-encoders-pytorch -U).

Thank you for the attempted fix! Unfortunately, the current commit (d5f6870624ae38e25704cb3df3f148507cf8cf51) still hits this error. I've put the full error below in text form so that other people with the same issue can find it by searching:

(venv7) quoththeraver@ip-26-0-139-220:~/workspace/audio-diffusion-pytorch-trainer$ python train.py exp=base_test
[2023-01-12 20:35:59,628][main.utils][INFO] - Disabling python warnings!
Global seed set to 12345
[2023-01-12 20:35:59,637][__main__][INFO] - Instantiating datamodule .
Error executing job with overrides: ['exp=base_test']
Traceback (most recent call last):
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 644, in _locate
    obj = getattr(obj, part)
AttributeError: module 'main' has no attribute 'module_base'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 650, in _locate
    obj = import_module(mod)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/main/module_base.py", line 3, in <module>
    import librosa
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/librosa/__init__.py", line 209, in <module>
    from . import core
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/librosa/core/__init__.py", line 5, in <module>
    from .convert import *  # pylint: disable=wildcard-import
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/librosa/core/convert.py", line 7, in <module>
    from . import notation
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/librosa/core/notation.py", line 8, in <module>
    from ..util.exceptions import ParameterError
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/librosa/util/__init__.py", line 77, in <module>
    from .utils import *  # pylint: disable=wildcard-import
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/librosa/util/utils.py", line 9, in <module>
    import numba
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/numba/__init__.py", line 42, in <module>
    from numba.np.ufunc import (vectorize, guvectorize, threading_layer,
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/numba/np/ufunc/__init__.py", line 3, in <module>
    from numba.np.ufunc.decorators import Vectorize, GUVectorize, vectorize, guvectorize
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/numba/np/ufunc/decorators.py", line 3, in <module>
    from numba.np.ufunc import _internal
SystemError: initialization of _internal failed without raising an exception

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 134, in _resolve_target
    target = _locate(target)
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 658, in _locate
    raise ImportError(
ImportError: Error loading 'main.module_base.Datamodule': SystemError('initialization of _internal failed without raising an exception')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "train.py", line 110, in <module>
    main()
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/main.py", line 90, in decorated_main
    _run_hydra(
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
    _run_app(
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 457, in _run_app
    run_and_report(
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 222, in run_and_report
    raise ex
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 219, in run_and_report
    return func()
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
    lambda: hydra.run(
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "train.py", line 25, in main
    datamodule = hydra.utils.instantiate(config.datamodule, _convert_="partial")
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 226, in instantiate
    return instantiate_node(
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 333, in instantiate_node
    _target_ = _resolve_target(node.get(_Keys.TARGET), full_key)
  File "/admin/home-quoththeraver/workspace/audio-diffusion-pytorch-trainer/venv7/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 139, in _resolve_target
    raise InstantiationException(msg) from e
hydra.errors.InstantiationException: Error locating target 'main.module_base.Datamodule', set env var HYDRA_FULL_ERROR=1 to see chained exception.
full_key: datamodule
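A note on reading the trace: the last message ("Error locating target") is only the outermost wrapper. Following the "direct cause" links backwards leads to the SystemError raised while importing numba, which is the failure that actually needs fixing. Below is a minimal sketch of that idea. It does not use Hydra; `InstantiationError`, `locate` and `instantiate` are hypothetical stand-ins that only reproduce the shape of the chain in the log. It follows `__cause__` only, because the `__context__` link ("During handling of the above exception...") would lead to the harmless AttributeError instead.

```python
class InstantiationError(Exception):
    """Hypothetical stand-in for hydra.errors.InstantiationException."""


def root_cause(exc):
    """Follow explicit `raise ... from ...` links to the innermost exception."""
    while exc.__cause__ is not None:
        exc = exc.__cause__
    return exc


def locate():
    # Mimics the log: the attribute lookup fails first, then the import
    # fallback fails for a different reason and is re-raised as ImportError.
    try:
        raise AttributeError("module 'main' has no attribute 'module_base'")
    except AttributeError:
        try:
            raise SystemError(
                "initialization of _internal failed without raising an exception"
            )
        except SystemError as import_failure:
            raise ImportError(
                "Error loading 'main.module_base.Datamodule'"
            ) from import_failure


def instantiate():
    try:
        locate()
    except ImportError as e:
        raise InstantiationError(
            "Error locating target 'main.module_base.Datamodule'"
        ) from e


try:
    instantiate()
except InstantiationError as err:
    cause = root_cause(err)
    print(type(cause).__name__, "-", cause)
```

In a real environment, a quicker check of the same thing is to run "python -c 'import numba'" inside the virtual environment and see whether that import fails on its own.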

Specifically, this happens when following the install and run instructions in the readme (with a fresh virtual environment). Interestingly, I was able to successfully run the experiment on my local machine which already had plenty of packages installed, so there is clearly something wrong with the requirements. When I install the packages below (from my local machine's full pip freeze) I am able to run the experiment on the remote machine.

requirements_FULL.txt

I'd submit a PR with this updated requirements.txt, but obviously that would be pretty messy, since not all of these packages are required. In case the issue was simply versioning, I made a version of the repo's requirements.txt with explicit versions based on what I had in my local environment (see below), but unfortunately that also failed, which means that some packages are missing.

requirements with versions (fails).txt

Because I am now unblocked, I think I'll end my debugging here, but I wanted to share this info with you guys so that future people are unblocked, and maybe someone will go through each package to see where the problem is :P
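For anyone who wants to pick this up, one way to narrow the search is to list the packages that appear in the working full freeze but not in the repo's requirements file. The snippet below is only a sketch using the standard library. The two requirement lists are made-up placeholders, not the contents of the attached files, and `package_names` is a hypothetical helper that handles simple "name==version" style lines only (not URLs or editable installs).

```python
import re


def package_names(requirements_text):
    """Return normalized package names from simple requirements-style text."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            # Treat runs of '-', '_' and '.' as equivalent and ignore case
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


# Placeholder data for illustration only
full_freeze = """
package-a==1.0
Package_B==2.3
package-c==0.9
package-d==4.1
"""
repo_requirements = """
package-a
package_b>=2.0  # same package as Package_B above
"""

candidates = sorted(package_names(full_freeze) - package_names(repo_requirements))
print(candidates)
```

Each name in `candidates` is then a package to try adding (or pinning) one at a time in a fresh virtual environment.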

EmilianPostolache commented 1 year ago

When I hit this error, the cause was that datamodule in the Hydra config was a child of exp and not a root key. As a result, it had to be accessed as config.exp.datamodule rather than config.datamodule, and the same applied to every key under exp. The problem was that # @package _global_ was missing at the beginning of the YAML configuration! So if you delete that line by mistake, this error happens.
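To illustrate the difference, here is a small simulation with plain dictionaries. It does not call Hydra; `place_group_config` is a hypothetical helper that only mimics where the keys of a file from the exp config group end up, and the "seed" key is an assumed example of something already in the root config.

```python
def place_group_config(root, group, content, global_package):
    """Mimic where a config-group file's keys land in the composed config.

    With '# @package _global_' the keys are merged at the root;
    without it they are nested under the group name.
    """
    composed = dict(root)
    if global_package:
        composed.update(content)
    else:
        composed[group] = dict(content)
    return composed


root = {"seed": 12345}  # assumed root-level key
exp_file = {"datamodule": {"_target_": "main.module_base.Datamodule"}}

with_header = place_group_config(root, "exp", exp_file, global_package=True)
without_header = place_group_config(root, "exp", exp_file, global_package=False)

print("datamodule" in with_header)     # True: config.datamodule works
print("datamodule" in without_header)  # False: config.datamodule is missing
print(without_header["exp"]["datamodule"]["_target_"])  # only reachable via exp
```

With the header in place, train.py can read config.datamodule as it expects; without it, the same content sits under config.exp.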