fizyr / keras-retinanet

Keras implementation of RetinaNet object detection.
Apache License 2.0

multi gpu model appears to be broken for keras 2.2+, TF 2.1+ #1420

Closed mooratov closed 4 years ago

mooratov commented 4 years ago

I'm getting the following error when trying to use multi-GPU training.

Traceback (most recent call last):
  File "/opt/program/keras_retinanet/bin/train.py", line 530, in <module>
    main()
  File "/opt/program/keras_retinanet/bin/train.py", line 490, in main
    config=args.config
  File "/opt/program/keras_retinanet/bin/train.py", line 112, in create_models
    training_model = multi_gpu_model(model, gpus=multi_gpu)
  File "/root/.local/lib/python3.6/site-packages/keras/utils/multi_gpu_utils.py", line 150, in multi_gpu_model
    available_devices = _get_available_devices()
  File "/root/.local/lib/python3.6/site-packages/keras/utils/multi_gpu_utils.py", line 16, in _get_available_devices
    return K.tensorflow_backend._get_available_gpus() + ['/cpu:0']
  File "/root/.local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 506, in _get_available_gpus
    _LOCAL_DEVICES = tf.config.experimental_list_devices()
AttributeError: module 'tensorflow_core._api.v2.config' has no attribute 'experimental_list_devices'

This appears to be a well-known issue with Keras: https://github.com/keras-team/keras/issues/13684

One way to work around it is to stick to TF 1.15.

However, I've confirmed that the injection workaround mentioned in that thread works if it is inserted in train.py right before from keras.utils import multi_gpu_model: https://github.com/keras-team/keras/issues/13684#issuecomment-595054461
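For reference, the idea behind that workaround is to monkey-patch Keras's _get_available_gpus helper so it derives device names from TF 2.x's tf.config.list_logical_devices() instead of the removed experimental_list_devices(). Below is a minimal sketch of the pattern only; the tfback namespace is a hypothetical stand-in for the real keras.backend.tensorflow_backend module (and the hard-coded device list stands in for the tf.config call) so the sketch runs without Keras or TensorFlow installed:

```python
import types

# Hypothetical stand-in for keras.backend.tensorflow_backend; the real
# patch imports that module and assigns onto it instead.
tfback = types.SimpleNamespace(_LOCAL_DEVICES=None)

def _get_available_gpus():
    """Patched device lookup: cache the logical device names once,
    then filter for GPU entries."""
    if tfback._LOCAL_DEVICES is None:
        # In the real patch this is:
        #   devices = tf.config.list_logical_devices()
        #   tfback._LOCAL_DEVICES = [d.name for d in devices]
        tfback._LOCAL_DEVICES = ['/device:CPU:0', '/device:GPU:0', '/device:GPU:1']
    return [x for x in tfback._LOCAL_DEVICES if 'device:gpu' in x.lower()]

# Inject the replacement before multi_gpu_model is imported, so its
# internal device query never touches the removed TF API.
tfback._get_available_gpus = _get_available_gpus

print(tfback._get_available_gpus())  # → ['/device:GPU:0', '/device:GPU:1']
```

The key point is the ordering: the assignment has to happen before from keras.utils import multi_gpu_model runs, which is why the snippet is injected at that spot in train.py.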

I didn't want to open a PR since the fix is a bit hacky, but let me know your thoughts.

I don't know whether this is still an issue with Keras 2.4, but since that version has other incompatibilities with this repo, I'm not able to check.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale due to the lack of recent activity. It will be closed if no further activity occurs. Thank you for your contributions.