castorini / howl

Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice.
Mozilla Public License 2.0
194 stars 28 forks

Unable to generate raw dataset. #121

Open BoazOrel opened 1 year ago

BoazOrel commented 1 year ago

Hi, I'm trying to generate a custom dataset as instructed. After the filtering process, I get this error:

Traceback (most recent call last):
  File "/home/boaz/miniconda3/envs/howlenv/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/boaz/miniconda3/envs/howlenv/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/boaz/howl/training/run/generate_raw_audio_dataset.py", line 139, in <module>
    main(
  File "/home/boaz/howl/training/run/generate_raw_audio_dataset.py", line 56, in main
    raw_dataset_generator.generate_datasets(positive_dataset_path, SampleType.POSITIVE, percentage=positive_pct)
  File "/home/boaz/howl/howl/dataset/raw_audio_dataset_generator.py", line 92, in generate_datasets
    dataset.print_stats(self.logger, word_searcher=word_searcher, compute_length=True)
  File "/home/boaz/howl/howl/data/dataset/dataset.py", line 233, in print_stats
    log_msg = header + " "
TypeError: unsupported operand type(s) for +: 'Logger' and 'str'

I'm using Ubuntu 22.04.1, Python 3.9, PyTorch 1.10, and PyAudio 0.2.11.

Thanks in advance for the help :)

Neukiru commented 1 year ago

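Just replace log_msg = header + ' ' with, for instance, log_msg = "Dataset" + ' ' in howl/howl/data/dataset/dataset.py on line 233. In that call header ends up being a Logger rather than a string (the traceback shows print_stats being called with self.logger as its first argument), which is what raises the TypeError.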
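For anyone who wants to see the failure in isolation, here is a minimal sketch that reproduces the same TypeError outside of howl and shows one possible guard. The helper names build_log_msg and build_log_msg_safe are hypothetical and are not part of howl. Only the expression header + " " is taken from the traceback, and the fallback label "Dataset" is just the example value suggested above.

```python
import logging


def build_log_msg(header):
    # Same expression as the failing line in the traceback.
    return header + " "


def build_log_msg_safe(header="Dataset"):
    # Hypothetical guard (not howl's code): fall back to a fixed label
    # whenever the caller passes something that is not a string.
    if not isinstance(header, str):
        header = "Dataset"
    return header + " "


logger = logging.getLogger("howl_demo")

# Passing a Logger where a string is expected reproduces the reported error.
try:
    build_log_msg(logger)
    error_text = None
except TypeError as exc:
    error_text = str(exc)

print(error_text)

# The guarded variant tolerates the same call and still yields a usable prefix.
safe_msg = build_log_msg_safe(logger)
print(repr(safe_msg))
```

Hard-coding the label, as suggested above, is the quickest workaround. A cleaner fix would probably be to pass the logger by keyword at the call site in raw_audio_dataset_generator.py, but that depends on the actual print_stats signature, which is not shown in this thread.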