Closed ServusJon closed 1 year ago
I added my source and my target file to "bin/train/outputs/BHPFriedmanBE04" and ran the following:
(nam) jonathanarnold@MBPvonJonathan NAM conda environment % python bin/train/main.py \
bin/train/inputs/config_data.json \
bin/train/inputs/config_model.json \
bin/train/inputs/config_learning.json \
bin/train/outputs/BHPFriedmanBE04
Traceback (most recent call last):
  File "/Users/jonathanarnold/Desktop/_dev/NAM conda environment/bin/train/main.py", line 19, in <module>
    from nam.data import ConcatDataset, ParametricDataset, Split, init_dataset
ModuleNotFoundError: No module named 'nam'
What am I missing?
If you do a "pip install -e ." and re-attempt, does that work?
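A quick way to confirm the editable install took effect (a minimal check, run inside the activated conda environment; the exact path printed depends on where the repo was cloned):

import nam

# If the editable install ("pip install -e .") was picked up, this import succeeds
# and the printed path points into the cloned repository instead of raising
# ModuleNotFoundError as in the traceback above.
print(nam.__file__)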
That helped. But I am stuck here @2-dor

(nam) jonathanarnold@MBPvonJonathan NAM conda environment % python bin/train/main.py \
bin/train/inputs/config_data.json \
bin/train/inputs/config_model.json \
bin/train/inputs/config_learning.json \
bin/train/outputs/BHPFriedmanBE04
Traceback (most recent call last):
  File "/Users/jonathanarnold/Desktop/_dev/NAM conda environment/bin/train/main.py", line 191, in <module>
    main(parser.parse_args())
  File "/Users/jonathanarnold/Desktop/_dev/NAM conda environment/bin/train/main.py", line 125, in main
    with open(args.data_config_path, "r") as fp:
FileNotFoundError: [Errno 2] No such file or directory: 'bin/train/inputs/config_data.json'
I don't know, to be honest. You could try running the "jupyter notebooks" as pointed out in the Facebook group:
That helped a lot! But now I am not sure what to do regarding the 4 different audio files. I only have the "output.wav" and the "v1_1_1.wav".
What does your config look like? @2-dor
You can just run the "easy" version - you only need the "v1_1_1.wav" and "output.wav" files for that.
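For reference, and going only by the run() signature that appears in the traceback further down (nam/train/colab.py), the "easy" path boils down to something like the sketch below; where the two WAV files have to live is an assumption (presumably the directory the notebook runs from), not something confirmed here.

# Sketch of the "easy" training entry point, based on the run() call visible in
# the traceback below; this is not the full notebook.
from nam.train.colab import run

# Assumes v1_1_1.wav (the reamp/source signal) and output.wav (the recorded amp
# output) are available where the notebook expects them.
run(
    epochs=100,
    architecture="standard",  # the notebook cell also lists "lite" and "feather"
)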
I don't have the same folder structure as you. :(
Oh - have you cloned the current Git repository?
Had the wrong branch checked out haha:
MisconfigurationException                Traceback (most recent call last)
Cell In[2], line 2
      1 get_ipython().run_line_magic('tensorboard', '--logdir /content/lightning_logs')
----> 2 run(
      3     epochs=100,
      4     architecture="standard"  # standard, lite, feather
      5 )

File ~/opt/anaconda3/envs/nam/lib/python3.10/site-packages/nam/train/colab.py:363, in run(epochs, delay, architecture, lr, lr_decay, seed)
    360 train_dataloader = DataLoader(dataset_train, **learning_config["train_dataloader"])
    361 val_dataloader = DataLoader(dataset_validation, **learning_config["val_dataloader"])
--> 363 trainer = pl.Trainer(
    364     callbacks=[
    365         pl.callbacks.model_checkpoint.ModelCheckpoint(
    366             filename="checkpoint_best_{epoch:04d}_{step}_{ESR:.4f}_{MSE:.3e}",
    367             save_top_k=3,
    368             monitor="val_loss",
    369             every_n_epochs=1,
    370         ),
    371         pl.callbacks.model_checkpoint.ModelCheckpoint(
    372             filename="checkpoint_last_{epoch:04d}_{step}", every_n_epochs=1
    373         ),
    374     ],
    375     **learning_config["trainer"],
    376 )
    377 trainer.fit(model, train_dataloader, val_dataloader)
    379 # Go to best checkpoint

File ~/opt/anaconda3/envs/nam/lib/python3.10/site-packages/pytorch_lightning/utilities/argparse.py:348, in _defaults_from_env_vars.

File ~/opt/anaconda3/envs/nam/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py:420, in Trainer.__init__(self, logger, enable_checkpointing, callbacks, default_root_dir, gradient_clip_val, gradient_clip_algorithm, num_nodes, num_processes, devices, gpus, auto_select_gpus, tpu_cores, ipus, enable_progress_bar, overfit_batches, track_grad_norm, check_val_every_n_epoch, fast_dev_run, accumulate_grad_batches, max_epochs, min_epochs, max_steps, min_steps, max_time, limit_train_batches, limit_val_batches, limit_test_batches, limit_predict_batches, val_check_interval, log_every_n_steps, accelerator, strategy, sync_batchnorm, precision, enable_model_summary, num_sanity_val_steps, resume_from_checkpoint, profiler, benchmark, deterministic, reload_dataloaders_every_n_epochs, auto_lr_find, replace_sampler_ddp, detect_anomaly, auto_scale_batch_size, plugins, amp_backend, amp_level, move_metrics_to_cpu, multiple_trainloader_mode, inference_mode)
    417 # init connectors
    418 self._data_connector = DataConnector(self, multiple_trainloader_mode)
--> 420 self._accelerator_connector = AcceleratorConnector(
    421     num_processes=num_processes,
    422     devices=devices,
    423     tpu_cores=tpu_cores,
    424     ipus=ipus,
    425     accelerator=accelerator,
    426     strategy=strategy,
    427     gpus=gpus,
    428     num_nodes=num_nodes,
    429     sync_batchnorm=sync_batchnorm,
    430     benchmark=benchmark,
    431     replace_sampler_ddp=replace_sampler_ddp,
    432     deterministic=deterministic,
    433     auto_select_gpus=auto_select_gpus,
    434     precision=precision,
    435     amp_type=amp_backend,
    436     amp_level=amp_level,
    437     plugins=plugins,
    438 )
    439 self._logger_connector = LoggerConnector(self)
    440 self._callback_connector = CallbackConnector(self)

File ~/opt/anaconda3/envs/nam/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:202, in AcceleratorConnector.__init__(self, devices, num_nodes, accelerator, strategy, plugins, precision, amp_type, amp_level, sync_batchnorm, benchmark, replace_sampler_ddp, deterministic, auto_select_gpus, num_processes, tpu_cores, ipus, gpus)
    200     self._accelerator_flag = self._choose_auto_accelerator()
    201 elif self._accelerator_flag == "gpu":
--> 202     self._accelerator_flag = self._choose_gpu_accelerator_backend()
    204 self._set_parallel_devices_and_init_accelerator()
    206 # 3. Instantiate ClusterEnvironment

File ~/opt/anaconda3/envs/nam/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:537, in AcceleratorConnector._choose_gpu_accelerator_backend()
    534 if CUDAAccelerator.is_available():
    535     return "cuda"
--> 537 raise MisconfigurationException("No supported gpu backend found!")

MisconfigurationException: No supported gpu backend found!
I am on a Mac M1. Not sure how I can change to CPU.
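(A note on switching to CPU: the traceback above shows the Trainer being built from learning_config["trainer"], so whatever produces that dict has to end up requesting the CPU accelerator instead of "gpu" on a machine without CUDA. A minimal illustration with placeholder kwargs, not the project's actual config:)

import pytorch_lightning as pl

# Illustration only: on a machine with no CUDA backend (e.g. an Apple Silicon Mac),
# the kwargs that end up in pl.Trainer(**learning_config["trainer"]) need to ask
# for the CPU accelerator; "gpu" is what triggers "No supported gpu backend found!".
trainer = pl.Trainer(accelerator="cpu", max_epochs=100)  # kwargs here are placeholders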
Try this "conda install pytorch torchvision torchaudio cpuonly -c pytorch" and re-run
I did. Same error :(
Here is how my command looked:
(nam) jonathanarnold@MBPvonJonathan NAM conda environment % conda install pytorch torchvision torchaudio cpuonly -c pytorch
Collecting package metadata (current_repodata.json): done
Solving environment: done
environment location: /Users/jonathanarnold/opt/anaconda3/envs/nam
added / updated specs:
The following packages will be downloaded:
package | build
---------------------------|-----------------
cpuonly-2.0 | 0 2 KB pytorch
ffmpeg-4.3 | h0a44026_0 10.1 MB pytorch
gnutls-3.6.15 | hed9c0bf_0 974 KB
lame-3.100 | h1de35cc_0 316 KB
libtasn1-4.16.0 | h9ed2024_0 53 KB
nettle-3.7.3 | h230ac6f_1 380 KB
openh264-2.1.1 | h8346a28_0 655 KB
pytorch-mutex-1.0 | cpu 3 KB pytorch
torchaudio-0.13.1 | py310_cpu 5.6 MB pytorch
torchvision-0.14.1 | py310_cpu 6.2 MB pytorch
------------------------------------------------------------
Total: 24.2 MB
The following NEW packages will be INSTALLED:
  cpuonly            pytorch/noarch::cpuonly-2.0-0
  ffmpeg             pytorch/osx-64::ffmpeg-4.3-h0a44026_0
  gmp                pkgs/main/osx-64::gmp-6.2.1-he9d5cce_3
  gnutls             pkgs/main/osx-64::gnutls-3.6.15-hed9c0bf_0
  lame               pkgs/main/osx-64::lame-3.100-h1de35cc_0
  libidn2            pkgs/main/osx-64::libidn2-2.3.2-h9ed2024_0
  libtasn1           pkgs/main/osx-64::libtasn1-4.16.0-h9ed2024_0
  libunistring       pkgs/main/osx-64::libunistring-0.9.10-h9ed2024_0
  nettle             pkgs/main/osx-64::nettle-3.7.3-h230ac6f_1
  openh264           pkgs/main/osx-64::openh264-2.1.1-h8346a28_0
  pytorch-mutex      pytorch/noarch::pytorch-mutex-1.0-cpu
  torchaudio         pytorch/osx-64::torchaudio-0.13.1-py310_cpu
  torchvision        pytorch/osx-64::torchvision-0.14.1-py310_cpu
Proceed ([y]/n)? y
Downloading and Extracting Packages
Preparing transaction: done
Verifying transaction: / WARNING conda.core.path_actions:verify(1094): Unable to create environments file. Path not writable.
environment location: /Users/jonathanarnold/.conda/environments.txt
done
Executing transaction: | WARNING conda.core.envs_manager:register_env(49): Unable to register environment. Path not writable or missing.
environment location: /Users/jonathanarnold/opt/anaconda3/envs/nam
registry file: /Users/jonathanarnold/.conda/environments.txt
done
@2-dor any hints?
Hey - sorry, nothing yet. I'll try to spin up a "greenfields" virtual machine today (no GPU acceleration) to see if I can get CPU training running.
@2-dor any hints?
I haven't been able to get it running, unfortunately. If I do, I'll post instructions. It looks like it takes a bit more shoveling to trigger the training this way, and the "easy" way doesn't seem to take effect.
@sdatkinson
Oh boy...there are a lot of different questions in here!
That helped. But I am stuck here @2-dor
(nam) jonathanarnold@MBPvonJonathan NAM conda environment % python bin/train/main.py bin/train/inputs/config_data.json bin/train/inputs/config_model.json bin/train/inputs/config_learning.json bin/train/outputs/BHPFriedmanBE04
Traceback (most recent call last):
  File "/Users/jonathanarnold/Desktop/_dev/NAM conda environment/bin/train/main.py", line 191, in <module>
    main(parser.parse_args())
  File "/Users/jonathanarnold/Desktop/_dev/NAM conda environment/bin/train/main.py", line 125, in main
    with open(args.data_config_path, "r") as fp:
FileNotFoundError: [Errno 2] No such file or directory: 'bin/train/inputs/config_data.json'
This happens because bin/train/inputs/config_data.json is not a file that exists relative to your current working directory when you run the code. Check the path and make adjustments so that it's the path to a file that you have. You probably want e.g. bin/train/inputs/data/single_pair.json.
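A quick way to see whether those command-line paths actually resolve from the directory the command is launched from (a sketch using the filenames from the command above):

from pathlib import Path

# The paths given to bin/train/main.py are resolved relative to the current
# working directory, so run this from the same directory you launch it from.
for p in [
    "bin/train/inputs/config_data.json",
    "bin/train/inputs/config_model.json",
    "bin/train/inputs/config_learning.json",
]:
    print(Path(p).resolve(), "exists" if Path(p).exists() else "MISSING")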
To the issue in @2-dor's most recent comment on this thread, the likely reason for this is that the "start" of the interval being taken from the audio files is after their end, so the slice selects nothing 😅. This is what is supposed to happen with Python's typical array indexing, but it probably points to the user making a mistake in this case. I can add a check and have it give an error to bring it to your attention so that the failure is a little less cryptic. I'll put this in a separate issue because it won't be findable in the depths of this thread 🙂
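For concreteness, this is plain Python/NumPy slicing behaviour: a start index past the end yields an empty selection rather than an error, which is how an interval whose start lies after the end of the audio silently produces no data.

import numpy as np

samples = np.zeros(48_000)   # stand-in for one second of audio at 48 kHz
segment = samples[100_000:]  # "start" lies past the end of the array...
print(len(segment))          # ...so this prints 0: an empty selection, no error raised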
I think that the rest of the thread is addressed in my comments to #94. If there's something else, then you can open another Issue with what's specifically going wrong.