harvitronix / five-video-classification-methods

Code that accompanies my blog post outlining five video classification methods in Keras and TensorFlow
https://medium.com/@harvitronix/five-video-classification-methods-implemented-in-keras-and-tensorflow-99cad29cc0b5
MIT License

TypeError: Cannot convert 0.0 to EagerTensor of dtype int64 #151

Closed maroacc closed 3 years ago

maroacc commented 3 years ago

Hello, I am using this repo to train on the THETIS dataset. When I run train.py I get the following error: TypeError: Cannot convert 0.0 to EagerTensor of dtype int64. The link to my Google Colab is: https://drive.google.com/drive/folders/1sD2G-wIG1G_M7n3Cjut3edckU5kzNXHL?usp=sharing

The full stack trace is:

shell-init: error retrieving current directory: getcwd: cannot access parent directories: Transport endpoint is not connected
shell-init: error retrieving current directory: getcwd: cannot access parent directories: Transport endpoint is not connected
2021-07-15 14:39:21.803146: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
sh: 0: getcwd() failed: Transport endpoint is not connected
2021-07-15 14:39:30.229442: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-07-15 14:39:30.343516: E tensorflow/stream_executor/cuda/cuda_driver.cc:328] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2021-07-15 14:39:30.343591: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (18f8a5729c30): /proc/driver/nvidia/version does not exist
100% 10/10 [00:00<00:00, 1719.33it/s]
2021-07-15 14:39:35.792619: I tensorflow/core/profiler/lib/profiler_session.cc:126] Profiler session initializing.
2021-07-15 14:39:35.792689: I tensorflow/core/profiler/lib/profiler_session.cc:141] Profiler session started.
2021-07-15 14:39:35.808699: I tensorflow/core/profiler/lib/profiler_session.cc:159] Profiler session tear down.
Loading LSTM model.
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:375: UserWarning: The `lr` argument is deprecated, use `learning_rate` instead.
  "The `lr` argument is deprecated, use `learning_rate` instead.")
Model: "sequential"

Layer (type)         Output Shape     Param #
lstm (LSTM)          (None, 2048)     33562624
dense (Dense)        (None, 512)      1049088
dropout (Dropout)    (None, 512)      0
dense_1 (Dense)      (None, 2)        1026

Total params: 34,612,738
Trainable params: 34,612,738
Non-trainable params: 0

None
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py:1915: UserWarning: Model.fit_generator is deprecated and will be removed in a future version. Please use Model.fit, which supports generators.
  warnings.warn('Model.fit_generator is deprecated and '
Creating /content/drive/MyDrive/cnn/data/train generator with 4 samples.
Traceback (most recent call last):
  File "/content/drive/MyDrive/cnn/LSTM-video-classification/train.py", line 123, in <module>
    main()
  File "/content/drive/MyDrive/cnn/LSTM-video-classification/train.py", line 120, in main
    load_to_memory=load_to_memory, batch_size=batch_size, nb_epoch=nb_epoch)
  File "/content/drive/MyDrive/cnn/LSTM-video-classification/train.py", line 84, in train
    workers=4)
  File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1932, in fit_generator
    initial_epoch=initial_epoch)
  File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1146, in fit
    for epoch, iterator in data_handler.enumerate_epochs():
  File "/usr/local/lib/python3.7/dist-packages/keras/engine/data_adapter.py", line 1182, in enumerate_epochs
    with self._truncate_execution_to_epoch():
  File "/usr/lib/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.7/dist-packages/keras/engine/data_adapter.py", line 1201, in _truncate_execution_to_epoch
    self._steps_per_execution.assign(self._inferred_steps)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/resource_variable_ops.py", line 892, in assign
    value_tensor = ops.convert_to_tensor(value, dtype=self.dtype)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/profiler/trace.py", line 163, in wrapped
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 1566, in convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/tensor_conversion_registry.py", line 52, in _default_conversion_function
    return constant_op.constant(value, dtype, name=name)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/constant_op.py", line 265, in constant
    allow_broadcast=True)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/constant_op.py", line 276, in _constant_impl
    return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/constant_op.py", line 301, in _constant_eager_impl
    t = convert_to_eager_tensor(value, ctx, dtype)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/constant_op.py", line 98, in convert_to_eager_tensor
    return ops.EagerTensor(value, ctx.device_name, dtype)
TypeError: Cannot convert 0.0 to EagerTensor of dtype int64

Thank you for your time

maroacc commented 3 years ago

Hello, I found the error. I was using a batch size larger than my training set.
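For anyone hitting the same message, here is a minimal sketch of how a batch size larger than the training set can produce a step count of `0.0`. It assumes steps per epoch come from floor-dividing a float sample estimate by the batch size, which would explain the float `0.0` in the error. The helper names, the 0.7 train fraction, and the batch size of 32 are illustrative assumptions, not taken from the log. The sample counts 10 and 4 are read from the progress bar and the generator message above.

```python
import math


def naive_steps_per_epoch(num_samples, batch_size, train_fraction=0.7):
    """Hypothetical helper: floor-divide a float sample estimate by the batch size.

    Because the estimate is a float, the result is a float too. When the
    batch size exceeds the estimate, the result is 0.0.
    """
    return (num_samples * train_fraction) // batch_size


def safe_steps_per_epoch(num_train_samples, batch_size):
    """Hypothetical guard: cap the batch size and return an integer step count >= 1."""
    if num_train_samples <= 0:
        raise ValueError("training set is empty")
    # Never ask for more samples per batch than the training set holds.
    batch_size = min(batch_size, num_train_samples)
    # Round up so a final partial batch still counts as a step.
    steps = math.ceil(num_train_samples / batch_size)
    return batch_size, steps


# 10 samples in total, assumed batch size of 32.
naive = naive_steps_per_epoch(10, 32)
print(naive, type(naive).__name__)

# 4 training samples, same assumed batch size.
print(safe_steps_per_epoch(4, 32))
```

A step count of `0.0` is a float, which cannot be assigned to the int64 step variable seen in the traceback. Lowering the batch size to at most the number of training samples, or adding more samples, keeps the count at one or more.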