Open JinmingChe opened 2 years ago
I think the problem is that configs.dataset should be a dictionary and not a string.
If you want to change it in the Python script, I believe you should set configs.dataset.dataset="librispeech"
If you want to do it from the command line, you can do:
python hydra_train.py dataset.dataset="librispeech"
Can you show us the exact command you ran? It could be a command syntax error.
The following is my code:
`@hydra.main(config_path=os.path.join("..", "openspeech", "configs"), config_name="train")
def hydra_main(configs: DictConfig) -> None:
    rank_zero_info(OmegaConf.to_yaml(configs))
    pl.seed_everything(configs.trainer.seed)
    # way 1
    configs['dataset'] = 'librispeech'
    # way 2
    configs.dataset = 'librispeech'`
I tried two ways to set configs.dataset, but both give me the same error.
Exception has occurred: ConfigKeyError
Key 'dataset' is not in struct
    full_key: dataset
    object_type=dict
The above exception was the direct cause of the following exception:
  File "/home/chenjinming/github/openspeech/openspeech_cli/hydra_train.py", line 44, in hydra_main
    configs['dataset'] = 'librispeech'
I also printed the configs struct:
{'augment': {'apply_spec_augment': False, 'apply_noise_augment': False, 'apply_joining_augment': False, 'apply_time_stretch_augment': False, 'freq_mask_para': 27, 'freq_mask_num': 2, 'time_mask_num': 4, 'noise_dataset_dir': 'None', 'noise_level': 0.7, 'time_stretch_min_rate': 0.7, 'time_stretch_max_rate': 1.4}, 'trainer': {'seed': 1, 'accelerator': 'dp', 'accumulate_grad_batches': 1, 'num_workers': 4, 'batch_size': 32, 'check_val_every_n_epoch': 1, 'gradient_clip_val': 5.0, 'logger': 'wandb', 'max_epochs': 20, 'save_checkpoint_n_steps': 10000, 'auto_scale_batch_size': 'binsearch', 'sampler': 'smart', 'name': 'gpu', 'device': 'gpu', 'use_cuda': True, 'auto_select_gpus': True}}
It seems that there is no 'dataset' key. My goal is to add a new key or change the default config settings from Python instead of using the command line.
Did you try configs['dataset']['dataset'] = 'librispeech'?
configs['dataset'] has to be a dictionary with the keys 'dataset', 'dataset_path', 'dataset_download', and 'manifest_file_path'.
It throws an error when you try configs['dataset'] = 'librispeech', because the value is a string and not a dict. Therefore it is removed from the configs and you don't see it when you print them.
❓ Questions & Help
Hello, I am learning how to use openspeech, and I want to set the configs in a Python file so that I can debug easily. The recommended method is to pass parameters on the command line.
Details
I tried to use configs.dataset = 'librispeech' in hydra_train.py instead of python ./hydra_train.py dataset=librispeech, but it gives me the following error:
omegaconf.errors.ConfigAttributeError: Key 'dataset' is not in struct full_key: dataset object_type=dict
I would appreciate any advice on this usage.