llSourcell / tensorflow_speech_recognition_demo

This is the code for 'How to Make a Simple Tensorflow Speech Recognizer' by @Sirajology on YouTube

speech_data fails to properly extract .tar #1

Open xkortex opened 7 years ago

xkortex commented 7 years ago

In the demo, I'm getting

Looking for data spoken_numbers_pcm.tar in data/
Extracting data/spoken_numbers_pcm.tar to data/
Data ready!

and then

FileNotFoundError: [Errno 2] No such file or directory: 'data/spoken_numbers_pcm/'

It successfully creates data/, but spoken_numbers_pcm.tar is never extracted there: all I'm left with in the directory is the plain tar file.

I don't think it's a permissions issue. Here are the permissions of the downloaded file:

-rw-r--r-- 1 mm mm 38M Dec 10 11:20 spoken_numbers_pcm.tar

Setting chmod 666 doesn't help, so I don't think that's it.

I'm pretty sure this block in speech_data.maybe_download() is the point of failure:

if os.path.exists(filepath):
    print('Extracting %s to %s' % ( filepath, work_directory))
    os.system('tar xf '+filepath)
    print('Data ready!')

I'm not sure why it's failing, but I would recommend using the tarfile library for better portability and reliability. Have you looked into the subprocess library at all? I highly recommend it for times when you have to interface with other programs!
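For what it's worth, here is a minimal sketch of what I mean. One possible cause, which I haven't confirmed, is that `tar xf` unpacks into the current working directory rather than into data/. The return value of os.system is also never checked, so "Data ready!" prints even if tar fails. The function names `extract_archive` and `extract_archive_with_tar` are hypothetical and not part of speech_data.py; the parameters mirror the `filepath` and `work_directory` variables from the block above.

```python
import os
import subprocess
import tarfile


def extract_archive(filepath, work_directory):
    """Extract a tar archive into work_directory using the tarfile module.

    Hypothetical helper, not part of speech_data.py.
    """
    if not os.path.exists(filepath):
        raise FileNotFoundError(filepath)
    print('Extracting %s to %s' % (filepath, work_directory))
    with tarfile.open(filepath) as archive:
        # Unpack into work_directory instead of the current working directory.
        # Only do this with archives from a source you trust.
        archive.extractall(path=work_directory)
    print('Data ready!')


def extract_archive_with_tar(filepath, work_directory):
    """Alternative sketch: call the system tar through subprocess.

    check=True raises CalledProcessError when tar exits non-zero,
    so a failed extraction is not silently ignored.
    """
    subprocess.run(['tar', 'xf', filepath, '-C', work_directory], check=True)
```

With either version, calling it with 'data/spoken_numbers_pcm.tar' and 'data/' should put the archive's contents under data/, assuming the archive's top-level folder is spoken_numbers_pcm/. The tarfile version has the advantage of not depending on a tar binary being installed.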

System: Python 3.5.2, Jupyter, Mint 18

Cheers,

icecoldlilly commented 7 years ago

It's fixed in the latest commit. Go check it out :)