rizkiarm / LipNet

Keras implementation of 'LipNet: End-to-End Sentence-level Lipreading'
MIT License

ValueError: output of generator should be a tuple `(x, y, sample_weight)` or `(x, y)`. Found: None #11

Open rad182 opened 7 years ago

rad182 commented 7 years ago

Hi, I'm using the GRID s1 sample videos to train (unseen_speakers) and I'm getting this error. Any idea?

Using all available GPUs.
Using TensorFlow backend.

Enumerating dataset list from disk...
Found 10 videos for training.
Found 10 videos for validation.

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
the_input (InputLayer)       (None, 75, 100, 50, 3)    0
_________________________________________________________________
zero1 (ZeroPadding3D)        (None, 77, 104, 54, 3)    0
_________________________________________________________________
conv1 (Conv3D)               (None, 75, 50, 25, 32)    7232
_________________________________________________________________
batc1 (BatchNormalization)   (None, 75, 50, 25, 32)    128
_________________________________________________________________
actv1 (Activation)           (None, 75, 50, 25, 32)    0
_________________________________________________________________
spatial_dropout3d_1 (Spatial (None, 75, 50, 25, 32)    0
_________________________________________________________________
max1 (MaxPooling3D)          (None, 75, 25, 12, 32)    0
_________________________________________________________________
zero2 (ZeroPadding3D)        (None, 77, 29, 16, 32)    0
_________________________________________________________________
conv2 (Conv3D)               (None, 75, 25, 12, 64)    153664
_________________________________________________________________
batc2 (BatchNormalization)   (None, 75, 25, 12, 64)    256
_________________________________________________________________
actv2 (Activation)           (None, 75, 25, 12, 64)    0
_________________________________________________________________
spatial_dropout3d_2 (Spatial (None, 75, 25, 12, 64)    0
_________________________________________________________________
max2 (MaxPooling3D)          (None, 75, 12, 6, 64)     0
_________________________________________________________________
zero3 (ZeroPadding3D)        (None, 77, 14, 8, 64)     0
_________________________________________________________________
conv3 (Conv3D)               (None, 75, 12, 6, 96)     165984
_________________________________________________________________
batc3 (BatchNormalization)   (None, 75, 12, 6, 96)     384
_________________________________________________________________
actv3 (Activation)           (None, 75, 12, 6, 96)     0
_________________________________________________________________
spatial_dropout3d_3 (Spatial (None, 75, 12, 6, 96)     0
_________________________________________________________________
max3 (MaxPooling3D)          (None, 75, 6, 3, 96)      0
_________________________________________________________________
time_distributed_1 (TimeDist (None, 75, 1728)          0
_________________________________________________________________
bidirectional_1 (Bidirection (None, 75, 512)           3048960
_________________________________________________________________
bidirectional_2 (Bidirection (None, 75, 512)           1181184
_________________________________________________________________
dense1 (Dense)               (None, 75, 28)            14364
_________________________________________________________________
softmax (Activation)         (None, 75, 28)            0
=================================================================
Total params: 4,572,156.0
Trainable params: 4,571,772.0
Non-trainable params: 384.0
_________________________________________________________________
nextVal [<lipnet.helpers.threadsafe.threadsafe_iter instance at 0x1155eecf8>]
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
Epoch 0: Curriculum(train: True, sentence_length: -1, flip_probability: 0.5, jitter_probability: 0.05)
Process Process-1:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.7/site-packages/keras/engine/training.py", line 606, in data_generator_task
Epoch 1/5000
Traceback (most recent call last):
  File "/Users/rad182/Documents/rizkiarm-LipNet/training/unseen_speakers/train.py", line 78, in <module>
    train(run_name, 0, 5000, 3, 100, 50, 75, 32, 10)
  File "/Users/rad182/Documents/rizkiarm-LipNet/training/unseen_speakers/train.py", line 74, in train
    pickle_safe=True)
  File "/usr/local/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/keras/engine/training.py", line 1851, in fit_generator
    str(generator_output))
ValueError: output of generator should be a tuple `(x, y, sample_weight)` or `(x, y)`. Found: None
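
For what it's worth, the ValueError here is only a downstream symptom: with pickle_safe=True, fit_generator pulls batches from a worker process (the Process-1 traceback above, which is where the real exception dies), and once that worker stops, the main loop is left holding None instead of a batch. A minimal sketch of the same symptom, assuming any compiled Keras model of that era:

def broken_generator():
    # Stand-in for lipnet's next_train(): when loading a sample raises inside
    # the worker process, the training loop ends up seeing None rather than an
    # (x, y) tuple, which triggers the ValueError above.
    while True:
        yield None

# model.fit_generator(broken_generator(), steps_per_epoch=1, epochs=1)
# -> ValueError: output of generator should be a tuple `(x, y, sample_weight)`
#    or `(x, y)`. Found: None

So the line to chase is the one inside the worker traceback, not the ValueError itself.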
michiyosony commented 7 years ago

@rad182 I was getting a similar error, but after deleting datasets.cache (and possibly some other change that I'm not aware of?) the stack trace changed to

Process Process-1:
Epoch 0: Curriculum(train: True, sentence_length: -1, flip_probability: 0.5, jitter_probability: 0.05)
Traceback (most recent call last):
  File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/m/tensorflow/lib/python2.7/site-packages/keras/engine/training.py", line 606, in data_generator_task
    generator_output = next(self._generator)
  File "/Users/m/repos/LipNet/lipnet/helpers/threadsafe.py", line 16, in next
    return self.it.next()
  File "/Users/m/repos/LipNet/lipnet/lipreading/generators.py", line 206, in next_train
    ret = self.get_batch(cur_train_index, self.minibatch_size, train=True)
  File "/Users/m/repos/LipNet/lipnet/lipreading/generators.py", line 148, in get_batch
    video = Video().from_frames(path)
  File "/Users/m/repos/LipNet/lipnet/lipreading/videos.py", line 114, in from_frames
    frames_path = sorted([os.path.join(path, x) for x in os.listdir(path)])
OSError: [Errno 20] Not a directory: '/Users/m/repos/LipNet/training/unseen_speakers/datasets/train/s1/bbas3a.mpg'
Traceback (most recent call last):
  File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1596, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/Applications/PyCharm CE.app/Contents/helpers/pydev/pydevd.py", line 1023, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Users/m/repos/LipNet/training/unseen_speakers/train.py", line 76, in <module>
    train(run_name, 0, 5000, 3, 100, 50, 75, 32, 10)
  File "/Users/m/repos/LipNet/training/unseen_speakers/train.py", line 72, in train
    pickle_safe=True)
  File "/Users/m/tensorflow/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/Users/m/tensorflow/lib/python2.7/site-packages/keras/engine/training.py", line 1851, in fit_generator
    str(generator_output))
ValueError: output of generator should be a tuple `(x, y, sample_weight)` or `(x, y)`. Found: None

The Not a directory: '/Users/m/repos/LipNet/training/unseen_speakers/datasets/train/s1/bbas3a.mpg' error seemed to be caused by the code assuming we were loading from frame directories rather than from video files (see lipnet/lipreading/generators.py), but changing that call from from_frames to from_video (and then addressing subsequent issues) didn't get me anywhere.
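
A rough sketch of the distinction (the dispatch below is an assumption for illustration, not code from the repo):

import os
from lipnet.lipreading.videos import Video

def load_sample(path):
    # from_frames expects a directory of pre-cropped frame images and calls
    # os.listdir(path) on it -- handing it an .mpg file is what raises
    # "OSError: [Errno 20] Not a directory" above.
    if os.path.isdir(path):
        return Video().from_frames(path)
    # from_video expects a raw video file; in the real class it also needs a
    # face predictor configured so it can locate the mouth region.
    return Video().from_video(path)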

I did find that preprocessing the videos with scripts/extract_mouth_batch.py seems to work; at least, I was able to produce a weightsxx.h5 file.

deepakgupta1313 commented 7 years ago

@rad182 @michiyosony @rizkiarm I'm hitting a similar issue. Any help is appreciated.

Using all available GPUs.
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally

Enumerating dataset list from disk...
Video /home/deepakgupta1313/Desktop/Deepak/Programs/Github/LipNet/training/unseen_speakers/datasets/train/s1/bbizzn.mpg has incorrect shape (75, 360, 288, 3), must be (75, 100, 50, 3)
Video /home/deepakgupta1313/Desktop/Deepak/Programs/Github/LipNet/training/unseen_speakers/datasets/val/s1/bbizzn.mpg has incorrect shape (75, 360, 288, 3), must be (75, 100, 50, 3)
Found 50 videos for training.
Found 50 videos for validation.

[model summary identical to the one above]
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:910] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: GeForce GTX 1070
major: 6 minor: 1 memoryClockRate (GHz) 1.645
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.25GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0)
Epoch 0: Curriculum(train: True, sentence_length: -1, flip_probability: 0.5, jitter_probability: 0.05)
Process Process-1:
Traceback (most recent call last):
  File "/home/deepakgupta1313/anaconda3/envs/py27/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/deepakgupta1313/anaconda3/envs/py27/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/home/deepakgupta1313/.local/lib/python2.7/site-packages/keras/engine/training.py", line 606, in data_generator_task
    generator_output = next(self._generator)
  File "/home/deepakgupta1313/Desktop/Deepak/Programs/Github/LipNet/lipnet/helpers/threadsafe.py", line 16, in next
    return self.it.next()
  File "/home/deepakgupta1313/Desktop/Deepak/Programs/Github/LipNet/lipnet/lipreading/generators.py", line 205, in next_train
    ret = self.get_batch(cur_train_index, self.minibatch_size, train=True)
  File "/home/deepakgupta1313/Desktop/Deepak/Programs/Github/LipNet/lipnet/lipreading/generators.py", line 147, in get_batch
    video = Video().from_frames(path)
  File "/home/deepakgupta1313/Desktop/Deepak/Programs/Github/LipNet/lipnet/lipreading/videos.py", line 114, in from_frames
    frames_path = sorted([os.path.join(path, x) for x in os.listdir(path)])
OSError: [Errno 20] Not a directory: '/home/deepakgupta1313/Desktop/Deepak/Programs/Github/LipNet/training/unseen_speakers/datasets/train/s1/bbbmzn.mpg'
Epoch 1/5000
Traceback (most recent call last):
  File "/home/deepakgupta1313/Desktop/Deepak/Programs/Github/LipNet/training/unseen_speakers/train.py", line 77, in <module>
    train(run_name, 0, 5000, 3, 100, 50, 75, 32, 50)
  File "/home/deepakgupta1313/Desktop/Deepak/Programs/Github/LipNet/training/unseen_speakers/train.py", line 73, in train
    pickle_safe=True)
  File "/home/deepakgupta1313/.local/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/home/deepakgupta1313/.local/lib/python2.7/site-packages/keras/engine/training.py", line 1851, in fit_generator
    str(generator_output))
ValueError: output of generator should be a tuple `(x, y, sample_weight)` or `(x, y)`. Found: None
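
Note the two "has incorrect shape (75, 360, 288, 3), must be (75, 100, 50, 3)" lines near the top of this log: raw GRID .mpg files decode to full 360x288 frames, whereas the_input wants 75 frames of a 100x50 mouth crop, so unprocessed videos can never pass the loader's check. A rough reconstruction of that check (inferred from the log, not copied from the repo):

EXPECTED_SHAPE = (75, 100, 50, 3)  # frames, width, height, channels (the_input above)

def check_sample(frames, path):
    # Raw GRID videos decode to (75, 360, 288, 3); only mouth crops produced
    # by scripts/extract_mouth_batch.py (or equivalent) match the model input.
    if frames.shape != EXPECTED_SHAPE:
        print('Video %s has incorrect shape %s, must be %s'
              % (path, frames.shape, EXPECTED_SHAPE))
        return False
    return True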
Yuren-Zhong commented 6 years ago

Hi, I'm running into similar problems.

Have you guys figured out how to fix them? @rad182 @michiyosony @deepakgupta1313

Thanks!

michiyosony commented 6 years ago

I am able to train on mouth crops. I was not able to figure out how to train on videos directly.

DerekChia commented 5 years ago

I too was able to train on frames (after running scripts/extract_mouth_batch.py). For those struggling to get this to work, please extract the lip/mouth region into frames before training.

ghost commented 5 years ago

Hi @DerekChia. We extracted the mouth frames using scripts/extract_mouth_batch.py, and now for each video we have a directory containing 75 cropped frames around the mouth. The problem is that when we try to train, we get errors like:

Video /cs/engproj/329/LipNet/training/random_split/datasets/video/bbal8p/mouth_063.png has incorrect shape (1, 100, 50, 3), must be (75, 100, 50, 3)
Video /cs/engproj/329/LipNet/training/random_split/datasets/video/bbal8p/mouth_018.png has incorrect shape (1, 100, 50, 3), must be (75, 100, 50, 3)
Video /cs/engproj/329/LipNet/training/random_split/datasets/video/bbal8p/mouth_064.png has incorrect shape (1, 100, 50, 3), must be (75, 100, 50, 3)
Video /cs/engproj/329/LipNet/training/random_split/datasets/video/bbal8p/mouth_016.png has incorrect shape (1, 100, 50, 3), must be (75, 100, 50, 3)
Video /cs/engproj/329/LipNet/training/random_split/datasets/video/bbal8p/mouth_011.png has incorrect shape (1, 100, 50, 3), must be (75, 100, 50, 3)
Video /cs/engproj/329/LipNet/training/random_split/datasets/video/bbal8p/mouth_058.png has incorrect shape (1, 100, 50, 3), must be (75, 100, 50, 3)
Video /cs/engproj/329/LipNet/training/random_split/datasets/video/bbal8p/mouth_024.png has incorrect shape (1, 100, 50, 3), must be (75, 100, 50, 3)
Video /cs/engproj/329/LipNet/training/random_split/datasets/video/bbal8p/mouth_023.png has incorrect shape (1, 100, 50, 3), must be (75, 100, 50, 3)

So it looks like it wants videos...
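
In a sense it does: the enumerator is treating every mouth_XXX.png as its own one-frame video, hence (1, 100, 50, 3). That usually means the frame directories sit one level shallower than the loader expects (the unseen_speakers logs above use datasets/train/s1/<utterance>/, i.e. a speaker directory above each utterance). A quick sanity check, assuming that layout (the glob pattern is illustrative; adjust it to your dataset root):

import glob
import os

# Assumed layout, based on the other logs in this thread:
#   datasets/train/s1/bbal8p/mouth_000.png ... mouth_074.png
# where bbal8p (the utterance) is the sample directory the enumerator yields.
for sample in sorted(glob.glob('datasets/*/*/*')):
    if not os.path.isdir(sample):
        # A bare .png or .mpg at this level gets read as a 1-frame video.
        print('not a sample directory:', sample)
    elif len(glob.glob(os.path.join(sample, '*.png'))) != 75:
        print('expected 75 frames in:', sample)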

jiagnhaiyang commented 4 years ago

Could anyone tell me how to solve this error? ValueError: output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: None

jiagnhaiyang commented 4 years ago

@rad182 @michiyosony @Yuren-Zhong @deepakgupta1313 @DerekChia I also ran into this problem. How did you solve it? Thank you very much!