fizyr / keras-retinanet

Keras implementation of RetinaNet object detection.
Apache License 2.0

Loading model fails #1395

Closed dohtem81 closed 4 years ago

dohtem81 commented 4 years ago

Running on macOS with Docker Desktop 2.3.0.3 and the python:3.7 image. All dependencies are installed. Loading the model resnet50_coco_best_v2.1.0.h5 fails with the error at the end of the log below.

The code is pretty straightforward:

from keras_retinanet.models import load_model
model = load_model('./models/resnet50_coco_best_v2.1.0.h5', backbone_name='resnet50')

Using TensorFlow backend.
2020-06-23 03:53:58.307402: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-06-23 03:53:58.307474: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2020-06-23 03:53:58.307510: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (c701da7b5de1): /proc/driver/nvidia/version does not exist
2020-06-23 03:53:58.307683: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-06-23 03:53:58.313914: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2592000000 Hz
2020-06-23 03:53:58.314329: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f1404000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-23 03:53:58.314386: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Traceback (most recent call last):
  File "conertmodel.py", line 2, in <module>
    model = load_model('./models/resnet50_coco_best_v2.1.0.h5', backbone_name='resnet50')
  File "/usr/local/lib/python3.7/site-packages/keras_retinanet/models/__init__.py", line 87, in load_model
    return keras.models.load_model(filepath, custom_objects=backbone(backbone_name).custom_objects)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/saving/save.py", line 184, in load_model
    return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 178, in load_model_from_hdf5
    custom_objects=custom_objects)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/saving/model_config.py", line 55, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/layers/serialization.py", line 109, in deserialize
    printable_module_name='layer')
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 373, in deserialize_keras_object
    list(custom_objects.items())))
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 987, in from_config
    config, custom_objects)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 2019, in reconstruct_from_config
    process_layer(layer_data)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 2001, in process_layer
    layer = deserialize_layer(layer_data, custom_objects=custom_objects)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/layers/serialization.py", line 109, in deserialize
    printable_module_name='layer')
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 373, in deserialize_keras_object
    list(custom_objects.items())))
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 987, in from_config
    config, custom_objects)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 2029, in reconstruct_from_config
    process_node(layer, node_data)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 1977, in process_node
    output_tensors = layer(input_tensors, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 897, in __call__
    self._maybe_build(inputs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 2416, in _maybe_build
    self.build(input_shapes)  # pylint:disable=not-callable
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/layers/convolutional.py", line 172, in build
    dtype=self.dtype)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 577, in add_weight
    caching_device=caching_device)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/training/tracking/base.py", line 743, in _add_variable_with_custom_getter
    **kwargs_for_getter)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 141, in make_variable
    shape=variable_shape if variable_shape else None)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 259, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 220, in _variable_v1_call
    shape=shape)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 198, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 2598, in default_variable_creator
    shape=shape)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 263, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1434, in __init__
    distribute_strategy=distribute_strategy)
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1567, in _init_from_args
    initial_value() if init_from_fn else initial_value,
  File "/usr/local/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 121, in <lambda>
    init_val = lambda: initializer(shape, dtype=dtype)
  File "/usr/local/lib/python3.7/site-packages/keras_retinanet/initializers.py", line 37, in __call__
    result = np.ones(shape, dtype=dtype) * -math.log((1 - self.probability) / self.probability)
  File "/usr/local/lib/python3.7/site-packages/numpy/core/numeric.py", line 192, in ones
    a = empty(shape, dtype, order)
TypeError: Cannot interpret 'tf.float32' as a data type

ptdw commented 4 years ago

I'm having the same issue with python3.8.

shivareddy37 commented 4 years ago

I'm having the same issue with Python 3.7, keras=2.4.2, and tensorflow=2.2.0.

kkrunal77 commented 4 years ago

I'm having the same issue with Python 3.7, keras=2.4.3, and tensorflow=2.2.0.

eeilerts commented 4 years ago

I had the same problem. Turns out that 8 days ago Keras updated to version 2.4 with a note saying that some workflows may break. I rolled back to keras 2.3.1 and everything worked again:

pip3 install keras==2.3.1

ptdw commented 4 years ago

Thanks @eeilerts, that solved it for me!

dohtem81 commented 4 years ago

Looks like this fixed my problem also. Thanks!

lulmer commented 4 years ago

Hello, I have the same problem. I replaced all imports with tf.keras (version is 2.3.0-tf). Are tf.keras and keras incompatible? Thank you.

dohtem81 commented 4 years ago

No, they are the same. Just use Keras 2.3.1 instead of the latest. I got caught by this because I started working on something, got a new laptop, and when moving the source I did not consider that the Keras versions on the two laptops were different. Shame on me ;)

Prashantmdgl9 commented 4 years ago

Downgrading Keras to 2.3.1 works.

lulmer commented 4 years ago

Indeed, downgrading keras to 2.3.1 works for me as well. Thanks

mooratov commented 4 years ago

Is it worth pinning setup.py to Keras 2.3.1? (This is what I have now done as a temporary hotfix on a private fork.) Otherwise, it seems there is a good bit of work to do to make various things compatible with 2.4.
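A pin like the one suggested above could look roughly like this. It is a hypothetical fragment, not the project's actual setup.py; the variable names are made up for illustration, and only the Keras version comes from this thread:

```python
# Hypothetical fragment, not the project's actual setup.py.
PINNED_KERAS = "2.3.1"  # version reported working in this thread

install_requires = [
    # Exact pin instead of an open-ended 'keras' requirement.
    "keras=={}".format(PINNED_KERAS),
]

# In a setup.py this list would be passed on as
# setup(..., install_requires=install_requires).
```

An exact pin trades flexibility for reproducibility: new installs stop picking up Keras 2.4 until the pin is lifted.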

hgaiser commented 4 years ago

What is the issue with 2.4?

dohtem81 commented 4 years ago

The model cannot be loaded. The following error is the final symptom, but the root cause seems to be in the new Keras release: TypeError: Cannot interpret 'tf.float32' as a data type
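The last frames of the traceback show `np.ones(shape, dtype=dtype)` in keras_retinanet/initializers.py receiving `tf.float32`, which NumPy cannot interpret. A minimal sketch of one way to make such a call tolerant is below. It assumes the dtype object exposes `as_numpy_dtype`, as TensorFlow's `tf.DType` does. The helper names `to_numpy_dtype` and `prior_bias` are hypothetical, and this is not presented as the project's actual fix:

```python
import math


def to_numpy_dtype(dtype):
    """Return something NumPy can interpret as a dtype.

    Objects that expose `as_numpy_dtype` (as tf.DType does) are unwrapped;
    strings such as 'float32', NumPy dtypes, and None are returned unchanged.
    """
    return getattr(dtype, "as_numpy_dtype", dtype)


def prior_bias(probability):
    """Bias value computed in the traceback: -log((1 - p) / p)."""
    return -math.log((1 - probability) / probability)


# Hypothetical use inside an initializer's __call__(shape, dtype=None):
#     np.ones(shape, dtype=to_numpy_dtype(dtype)) * prior_bias(self.probability)
```

With a helper like this the initializer would accept both the string dtypes and the TensorFlow dtype objects that appear in the failing call.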

jhoncc2 commented 4 years ago

Hi, I have a similar issue. I'm using Google Colab, Python 3.6.9, tensorflow 2.2

TypeError                                 Traceback (most recent call last)
<ipython-input-3-7ddd8b74fd4e> in <module>()
      5 # load retinanet model
      6 model_path = os.path.join( 'snapshots', 'resnet50_coco_best_v2.1.0.h5')
----> 7 model = models.load_model(model_path, backbone_name='resnet50')
      8 
      9 # if the model is not converted to an inference model, use the line below

31 frames
/usr/local/lib/python3.6/dist-packages/numpy/core/numeric.py in ones(shape, dtype, order)
    205 
    206     """
--> 207     a = empty(shape, dtype, order)
    208     multiarray.copyto(a, 1, casting='unsafe')
    209     return a

TypeError: data type not understood

After changing Keras from 2.4.3 to 2.3.1 I get the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-6-7ddd8b74fd4e> in <module>()
      5 # load retinanet model
      6 model_path = os.path.join( 'snapshots', 'resnet50_coco_best_v2.1.0.h5')
----> 7 model = models.load_model(model_path, backbone_name='resnet50')
      8 
      9 # if the model is not converted to an inference model, use the line below

/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py in is_tensor(x)
    701 
    702 def is_tensor(x):
--> 703     return isinstance(x, tf_ops._TensorLike) or tf_ops.is_dense_tensor
    704 
    705 

AttributeError: module 'tensorflow.python.framework.ops' has no attribute '_TensorLike'

jhoncc2 commented 4 years ago

It seems that Colab has changed its versions of Keras and TF. These problems are solved by downgrading to Keras 2.3.1 and TF 2.1; newer versions of Keras and TensorFlow make the problem appear. Hope it helps.
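A small guard can flag a mismatched environment before the model is loaded. The sketch below only encodes the combination reported working in this thread (Keras 2.3.1 with TF 2.1); other combinations may also work but are not confirmed here. The function names are hypothetical:

```python
def parse_version(version):
    """Turn '2.3.1' or '2.3.0-tf' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_reported_working(keras_version, tf_version):
    """True only for the combination reported working: Keras 2.3.1 with TF 2.1.x."""
    return (parse_version(keras_version) == (2, 3, 1)
            and parse_version(tf_version)[:2] == (2, 1))


print(is_reported_working("2.3.1", "2.1.0"))  # True
print(is_reported_working("2.4.3", "2.2.0"))  # False
```

At runtime the two strings could be taken from `keras.__version__` and `tensorflow.__version__`.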

CGuerrero13 commented 4 years ago

@jhoncc2 wrote: "Seems that Colab has changed the versions of Keras and TF. This problem does not appear with Keras 2.3.1 and TF 2.1. I tested newer versions of Keras and TensorFlow, and the problem persists. Hope it helps."

I have tested with Keras 2.3.1 and TF 2.2 in Colab and it does not work. I changed the versions in the requirements.txt file and ran it.

jhoncc2 commented 4 years ago

Try downgrading to TF 2.1; it solved this problem for me. I changed my original comment to make it more understandable :)

CGuerrero13 commented 4 years ago

Thank you @jhoncc2 . Colab works with new tensorflow versions. Now the code works!

sri-dhurkesh commented 4 years ago

Thanks a lot, @jhoncc2.
