castorini / howl

Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice.
Mozilla Public License 2.0
194 stars 28 forks

pydantic. Preparing a Dataset problems #125

Open artem-tok opened 10 months ago

artem-tok commented 10 months ago

Hello, I am trying to prepare a dataset as described in the instructions, but I am running into problems with pydantic. I solved some of them by using `from pydantic_settings import BaseSettings` instead of `from pydantic import BaseSettings` in settings.py and config.py, but I still get an error:

2023-11-09 19:54:55 WARNING setup_logger(30) Removing existing handlers from generate_raw_audio_dataset.py logger
2023-11-09 19:54:55,642 INFO setup_logger(54) Set up logger (generate_raw_audio_dataset.py), output path: None
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/artem/Artem/gaz/wake_w/howl/howl/training/run/generate_raw_audio_dataset.py", line 139, in <module>
    main(
  File "/home/artem/Artem/gaz/wake_w/howl/howl/training/run/generate_raw_audio_dataset.py", line 35, in main
    raw_dataset_generator = RawAudioDatasetGenerator(input_audio_dataset_path, dataset_type, logger)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/artem/Artem/gaz/wake_w/howl/howl/howl/dataset/raw_audio_dataset_generator.py", line 35, in __init__
    self.inference_ctx = InferenceContext(vocab=SETTINGS.training.vocab, token_type=SETTINGS.training.token_type)
                                                ^^^^^^^^^^^^^^^^^
  File "/home/artem/Artem/gaz/wake_w/howl/howl/howl/settings.py", line 139, in training
    self._training = TrainingSettings()
                     ^^^^^^^^^^^^^^^^^^
  File "/home/artem/anaconda3/lib/python3.11/site-packages/pydantic_settings/main.py", line 71, in __init__
    super().__init__(
  File "/home/artem/anaconda3/lib/python3.11/site-packages/pydantic/main.py", line 164, in __init__
    __pydantic_self__.__pydantic_validator__.validate_python(data, self_instance=__pydantic_self__)
pydantic_core._pydantic_core.ValidationError: 1 validation error for TrainingSettings
phone_dictionary
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.4/v/string_type
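
A likely cause, offered as a guess rather than a confirmed diagnosis: the error says `phone_dictionary` received `None` where a string was required. That is what pydantic v2 reports when a field annotated as `str` has a default of `None` and defaults are validated, which `pydantic_settings.BaseSettings` appears to do out of the box. The sketch below reproduces the same `string_type` error with plain pydantic v2 and shows one possible fix, annotating the field as `Optional[str]`. The class names and the exact field declaration are hypothetical stand-ins; howl's real `TrainingSettings` may be declared differently.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class StrictDefaults(BaseModel):
    # Assumption: this mimics pydantic_settings.BaseSettings, which is
    # believed to validate default values as well as supplied ones.
    model_config = ConfigDict(validate_default=True)


class BrokenTrainingSettings(StrictDefaults):
    # Hypothetical stand-in for the failing field: a pydantic v1-style
    # declaration where a `str` field silently defaulted to None.
    phone_dictionary: str = None


class FixedTrainingSettings(StrictDefaults):
    # Possible fix: state explicitly that None is an allowed value.
    phone_dictionary: Optional[str] = None


def try_build(settings_cls):
    """Instantiate settings_cls; return (instance, error_type)."""
    try:
        return settings_cls(), None
    except ValidationError as err:
        # Report only the type of the first validation error.
        return None, err.errors()[0]["type"]


broken, broken_error = try_build(BrokenTrainingSettings)
fixed, fixed_error = try_build(FixedTrainingSettings)

print("broken:", broken_error)
print("fixed:", fixed_error, fixed.phone_dictionary)
```

If this is indeed the cause, other fields in settings.py and config.py that default to `None` would probably need the same `Optional[...]` annotation. Supplying a real value for the setting should also avoid the error.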