tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0

Issue with generate_data on speech recognition model #1838

Open nag0811 opened 4 years ago

nag0811 commented 4 years ago

Description

Hi, I get a FileNotFoundError whenever I run t2t_problem.generate_data(DATA_DIR, TMP_DIR). The dataset downloaded successfully to TMP_DIR. ...

Environment information

Python 3.7.7
tensor2tensor 1.15.7

OS: Windows 10

$ pip freeze | grep tensor
# your output here

$ python -V
# your output here

For bugs: reproduction and error logs

# Steps to reproduce:
...

from tensor2tensor.utils import registry
from tensor2tensor import models

registry.list_models()

from tensor2tensor import problems
import json

from tensor2tensor.utils.trainer_lib import create_hparams

# Print all T2T problems to console
problems.available()

# PROBLEM = 'translate_enfr_wmt32k_rev'
PROBLEM = 'librispeech_clean'
MODEL = 'TRANSFORMER'
HPARAMS = 'transformer_base'

TRAIN_DIR = '~/translator/model_files'
TMP_DIR = 'C:/MachineLearning/Speech_Recognition/temp/'
DATA_DIR = 'C:/MachineLearning/Speech_Recognition/speech_data/'

t2t_problem = problems.problem(PROBLEM)
t2t_problem.generate_data(DATA_DIR, TMP_DIR)

# Error logs:

t2t_problem.generate_data(DATA_DIR, TMP_DIR)
INFO:tensorflow:Not downloading, file already found: C:/MachineLearning/Speech_Recognition/temp/test-clean.tar.gz
INFO:tensorflow:Not downloading, file already found: C:/MachineLearning/Speech_Recognition/temp/test-clean.tar.gz
Traceback (most recent call last):

  File "<ipython-input-11-c1b5d9c269ae>", line 1, in <module>
    t2t_problem.generate_data(DATA_DIR, TMP_DIR)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensor2tensor\data_generators\librispeech.py", line 169, in generate_data
    self.generator(data_dir, tmp_dir, self.TEST_DATASETS), test_paths)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensor2tensor\data_generators\generator_utils.py", line 174, in generate_files
    for case in generator:

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensor2tensor\data_generators\librispeech.py", line 149, in generator
    wav_data = audio_encoder.encode(media_file)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensor2tensor\data_generators\audio_encoder.py", line 62, in encode
    convert_to_wav(s, out_filepath)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensor2tensor\data_generators\audio_encoder.py", line 51, in convert_to_wav
    call(args + [in_path, out_path])

  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 339, in call
    with Popen(*popenargs, **kwargs) as p:

  File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 105, in __init__
    super(SubprocessPopen, self).__init__(*args, **kwargs)

  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 800, in __init__
    restore_signals, start_new_session)

  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 1207, in _execute_child
    startupinfo)

FileNotFoundError: [WinError 2] The system cannot find the file specified
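
The traceback above ends inside subprocess.Popen, called from convert_to_wav. On Windows, WinError 2 at that point usually means the external program being launched could not be found, rather than the downloaded archive or an audio file. The log does not show which program convert_to_wav tries to run, so the tool name below ('sox') is an assumption and should be replaced with whatever the installed audio_encoder.py actually invokes. A minimal sketch for checking whether such a tool is visible on PATH from the same Python session:

```python
import shutil


def find_missing_tools(tool_names):
    """Return the names in tool_names that cannot be resolved on PATH."""
    return [name for name in tool_names if shutil.which(name) is None]


# 'sox' is an assumed tool name; the traceback does not show the actual
# command that convert_to_wav passes to subprocess.call.
missing = find_missing_tools(["sox"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All listed tools were found on PATH")
```

If the tool is reported as missing, installing it and adding its directory to PATH before starting the Python session (here, Spyder) would be the first thing to try. This only checks the PATH lookup and does not confirm that the tool itself works.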

...