Borda / keras-yolo3

A Keras implementation of YOLOv3 (TensorFlow backend), a successor of qqwweee/keras-yolo3
MIT License

predict not working #2

Closed Teque5 closed 5 years ago

Teque5 commented 5 years ago

Thanks a lot for this refactor, it's a million times better than the base repo. Having said that, if I train my model with images that are not 416x416, predict is later unable to load the model.

I've tried this with and without the following changes in config_train.json

+    "image-size": [1600, 192],
+    "batch-size": 8,

and train.py

+    'image-size': (1600, 192),
+    'batch-size': 8,

Either way training works fine, with xval_loss as low as 25. But when using predict, no matter what fixes I attempt in yolo3/yolo.py, I always get the following on self.yolo_model.load_weights(self.weights_path):

Traceback (most recent call last):
  File "/xx/yolo3/yolo.py", line 73, in generate
    self.yolo_model = load_model(self.weights_path, compile=False)
  File "/zz/keras/engine/saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "/zz/keras/engine/saving.py", line 221, in _deserialize_model
    model_config = f['model_config']
  File "/zz/keras/utils/io_utils.py", line 302, in __getitem__
    raise ValueError('Cannot create group in read only mode.')
ValueError: Cannot create group in read only mode.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/zz/tensorflow/python/framework/ops.py", line 1659, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 0 in both shapes must be equal, but are 1 and 42. Shapes are [1,1,1024,255] and [42,1024,1,1]. for 'Assign_360' (op: 'Assign') with input shapes: [1,1,1024,255], [42,1024,1,1].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "scripts/predict.py", line 176, in <module>
    _main(**arg_params)
  File "scripts/predict.py", line 153, in _main
    classes_path=path_classes, gpu_num=gpu_num)
  File "/xx/yolo3/yolo.py", line 59, in __init__
    self.boxes, self.scores, self.classes = self.generate()
  File "/xx/yolo3/yolo.py", line 86, in generate
    self.yolo_model.load_weights(self.weights_path)
  File "/zz/keras/engine/network.py", line 1166, in load_weights
    f, self.layers, reshape=reshape)
  File "/zz/keras/engine/saving.py", line 1058, in load_weights_from_hdf5_group
    K.batch_set_value(weight_value_tuples)
  File "/zz/keras/backend/tensorflow_backend.py", line 2465, in batch_set_value
    assign_op = x.assign(assign_placeholder)
  File "/zz/tensorflow/python/ops/variables.py", line 1762, in assign
    name=name)
  File "/zz/tensorflow/python/ops/state_ops.py", line 223, in assign
    validate_shape=validate_shape)
  File "/zz/tensorflow/python/ops/gen_state_ops.py", line 64, in assign
    use_locking=use_locking, name=name)
  File "/zz/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/zz/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/zz/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/zz/tensorflow/python/framework/ops.py", line 1823, in __init__
    control_input_ops)
  File "/zz/tensorflow/python/framework/ops.py", line 1662, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimension 0 in both shapes must be equal, but are 1 and 42. Shapes are [1,1,1024,255] and [42,1024,1,1]. for 'Assign_360' (op: 'Assign') with input shapes: [1,1,1024,255], [42,1024,1,1].

Are you sure that predict works as you expect?

The only time it works for me is if it loads the default yolo3.h5 model.

Teque5 commented 5 years ago

I should mention I have custom classes, anchors, and annotations - but I am extremely confident that I've created those files correctly. I have the same # of classes and # of anchors as base yolo3.
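
The two shapes in the error can be read against the usual YOLOv3 head size, where each output convolution has anchors_per_scale * (num_classes + 5) filters. Below is a minimal sketch of that arithmetic, assuming 3 anchors per scale (9 anchors spread over 3 scales). The helper names are illustrative and not part of this repo.

```python
def yolo_head_filters(num_classes, anchors_per_scale=3):
    """Filters in a YOLOv3 output conv.

    Per anchor: 4 box coordinates + 1 objectness score + one score per class.
    """
    return anchors_per_scale * (num_classes + 5)


def classes_from_filters(filters, anchors_per_scale=3):
    """Invert the formula; return None if the filter count does not fit it."""
    per_anchor, remainder = divmod(filters, anchors_per_scale)
    if remainder or per_anchor < 5:
        return None
    return per_anchor - 5


print(yolo_head_filters(80))      # head size of the COCO-trained yolo3.h5
print(classes_from_filters(255))  # classes implied by the 255 in the error
print(classes_from_filters(42))   # classes implied by the 42 in the error
```

Under that assumption, 255 corresponds to an 80-class head and 42 to a 9-class head, so it may be worth checking that the classes file passed to predict is the same one used for training.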

Borda commented 5 years ago

Well, I did not get that far with the refactoring, and I had to freeze this work some time ago... I am not sure when I can come back to this repo... but any advice or help is welcome :] Unfortunately the original repo also looks quite dead, as I opened a PR half a year ago and nothing has happened since then :/

Borda commented 5 years ago

Just running the training on tiny-yolo, it seems that the dimension confusion is somewhere deeper:

INFO:root:Create YOLOv3 (factor: 3) model with 9 anchors and 10 classes.
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer conv2d_1 due to mismatch in shape ((3, 3, 3, 32) vs (16, 3, 3, 3)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer batch_normalization_1 due to mismatch in shape ((32,) vs (16,)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer conv2d_2 due to mismatch in shape ((3, 3, 32, 64) vs (32, 16, 3, 3)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer batch_normalization_2 due to mismatch in shape ((64,) vs (32,)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer conv2d_3 due to mismatch in shape ((1, 1, 64, 32) vs (64, 32, 3, 3)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer batch_normalization_3 due to mismatch in shape ((32,) vs (64,)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer conv2d_4 due to mismatch in shape ((3, 3, 32, 64) vs (128, 64, 3, 3)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer batch_normalization_4 due to mismatch in shape ((64,) vs (128,)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer conv2d_5 due to mismatch in shape ((3, 3, 64, 128) vs (256, 128, 3, 3)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer batch_normalization_5 due to mismatch in shape ((128,) vs (256,)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer conv2d_6 due to mismatch in shape ((1, 1, 128, 64) vs (512, 256, 3, 3)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer batch_normalization_6 due to mismatch in shape ((64,) vs (512,)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer conv2d_7 due to mismatch in shape ((3, 3, 64, 128) vs (1024, 512, 3, 3)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer batch_normalization_7 due to mismatch in shape ((128,) vs (1024,)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer conv2d_8 due to mismatch in shape ((1, 1, 128, 64) vs (256, 1024, 1, 1)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer batch_normalization_8 due to mismatch in shape ((64,) vs (256,)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer batch_normalization_10 due to mismatch in shape ((256,) vs (128,)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer conv2d_9 due to mismatch in shape ((3, 3, 64, 128) vs (512, 256, 3, 3)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer conv2d_12 due to mismatch in shape ((3, 3, 128, 256) vs (256, 384, 3, 3)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer batch_normalization_9 due to mismatch in shape ((128,) vs (512,)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1140: UserWarning: Skipping loading of weights for layer batch_normalization_11 due to mismatch in shape ((128,) vs (256,)).
  weight_values[i].shape))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1121: UserWarning: Skipping loading of weights for layer conv2d_10 due to mismatch in number of weights (1 vs 2).
  len(symbolic_weights), len(weight_values)))
/home/jb/.local/lib/python3.6/site-packages/keras/engine/saving.py:1121: UserWarning: Skipping loading of weights for layer conv2d_13 due to mismatch in number of weights (1 vs 2).
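
In the warnings above, the second shape of each pair looks like it is in (out, in, height, width) order, while Keras with the TensorFlow backend stores convolution kernels as (height, width, in, out). The sketch below assumes that reading is right and reorders the file shapes so both sides can be compared directly. The helper names are made up for illustration.

```python
def to_keras_kernel_shape(shape):
    """Reorder a conv kernel shape from (out, in, h, w) to Keras' (h, w, in, out)."""
    out_ch, in_ch, height, width = shape
    return (height, width, in_ch, out_ch)


def compare_shapes(pairs):
    """Map layer name -> True if the model shape equals the reordered file shape."""
    return {
        name: model_shape == to_keras_kernel_shape(file_shape)
        for name, (model_shape, file_shape) in pairs.items()
    }


# (model shape, shape found in the weights file), copied from the warnings above
pairs = {
    "conv2d_1": ((3, 3, 3, 32), (16, 3, 3, 3)),
    "conv2d_2": ((3, 3, 32, 64), (32, 16, 3, 3)),
    "conv2d_3": ((1, 1, 64, 32), (64, 32, 3, 3)),
}

for name, same in compare_shapes(pairs).items():
    print(name, "match" if same else "still mismatched")
```

Even after reordering, the channel counts differ (16 vs 32 filters in the first layer). That would be consistent with tiny-YOLO weights being loaded into a full YOLOv3 graph, as the "Create YOLOv3 ... with 9 anchors" line hints, but this is a guess from the log and not something confirmed here.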

Borda commented 5 years ago

see also:

KeKsBoTer commented 5 years ago

I had the same issue and solved it by adding the argument skip_mismatch=True to load_weights in yolo.py:

self.yolo_model.load_weights(self.model_path, by_name=True, skip_mismatch=True)
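
Note that skip_mismatch=True does not make mismatched layers load. Per the Keras documentation it is only valid together with by_name=True, and layers whose number of weights or weight shapes differ are skipped, so they keep their initial values. Below is a rough pure-Python sketch of that behavior, using plain dicts of shapes in place of a real model and HDF5 file. All names and the example shapes are illustrative assumptions, not Keras APIs or values from this repo.

```python
def plan_weight_loading(model_layers, file_layers):
    """Split layer names into (loaded, skipped), roughly as by_name + skip_mismatch would.

    Both arguments map a layer name to a list of weight shapes.
    """
    loaded, skipped = [], []
    for name, model_shapes in model_layers.items():
        file_shapes = file_layers.get(name)
        if file_shapes is None:
            continue  # by_name: layers absent from the file are left untouched
        same_count = len(model_shapes) == len(file_shapes)
        same_shapes = same_count and all(
            tuple(m) == tuple(f) for m, f in zip(model_shapes, file_shapes)
        )
        (loaded if same_shapes else skipped).append(name)
    return loaded, skipped


# Hypothetical layer names and shapes: a backbone conv that matches and a
# detection head built for 80 classes but saved with a 9-class head.
model_layers = {
    "backbone_conv": [(3, 3, 3, 32)],
    "head_conv": [(1, 1, 1024, 255), (255,)],
}
file_layers = {
    "backbone_conv": [(3, 3, 3, 32)],
    "head_conv": [(1, 1, 1024, 42), (42,)],
}

loaded, skipped = plan_weight_loading(model_layers, file_layers)
print("loaded:", loaded)
print("skipped:", skipped)
```

If the skipped list includes the detection head, prediction would run with an untrained head, so it is probably worth logging which layers were skipped and not relying on the flag alone.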

Borda commented 5 years ago

Thanks, I have changed that, but then there is some issue with the YOLO head... Still working on it =)

Borda commented 5 years ago

@Teque5 I have been playing around, and with this version 25fe475d641ad2946e2f147a5b82a8fd5fc03608 I was able to train the tiny model on the VOC dataset and later use it on a sample image and video... If you find something else going wrong, feel free to reopen this issue :)

Borda commented 5 years ago

I have added test training in CircleCI - b46e258ae35cd0021c85fe9392fb7c7ecbe6dedb

arita89 commented 4 years ago

Hello, I am still having this exact issue... How can I fix it?