experiencor / keras-yolo2

Easy training on a custom dataset. Various backends (MobileNet and SqueezeNet) supported. A YOLO demo detecting raccoons, running entirely in the browser, is accessible at https://git.io/vF7vI (not on Windows).
MIT License

export model to TF #386

Open ltotsas opened 5 years ago

ltotsas commented 5 years ago

Hi, I am trying to export the model into .pb using https://github.com/amir-abdi/keras_to_tensorflow, but I am getting an error: NameError: name 'tf' is not defined. Any ideas how to convert the model?

Traceback (most recent call last):
  File "keras_to_tensorflow.py", line 166, in <module>
    app.run(main)
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\absl\app.py", line 300, in run
    _run_main(main, args)
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\absl\app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "keras_to_tensorflow.py", line 112, in main
    model = load_model(FLAGS.input_model, FLAGS.input_model_json)
  File "keras_to_tensorflow.py", line 61, in load_model
    model = keras.models.load_model(input_model_path)
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\keras\engine\saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\keras\engine\saving.py", line 225, in _deserialize_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\keras\engine\saving.py", line 458, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\keras\layers\__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\keras\utils\generic_utils.py", line 145, in deserialize_keras_object
    list(custom_objects.items())))
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\keras\engine\network.py", line 1022, in from_config
    process_layer(layer_data)
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\keras\engine\network.py", line 1008, in process_layer
    custom_objects=custom_objects)
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\keras\layers\__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\keras\utils\generic_utils.py", line 145, in deserialize_keras_object
    list(custom_objects.items())))
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\keras\engine\network.py", line 1032, in from_config
    process_node(layer, node_data)
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\keras\engine\network.py", line 991, in process_node
    layer(unpack_singleton(input_tensors), **kwargs)
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\keras\engine\base_layer.py", line 457, in __call__
    output = self.call(inputs, **kwargs)
  File "C:\Users\larro\Anaconda3\envs\tensorflow1\lib\site-packages\keras\layers\core.py", line 687, in call
    return self.function(inputs, **arguments)
  File "C:\Users\larro\Desktop\keras-yolo2-master\backend.py", line 44, in space_to_depth_x2
    return tf.space_to_depth(x, block_size=2)
NameError: name 'tf' is not defined

rodrigo2019 commented 5 years ago

I think you are getting this error because the Full Yolo backend uses tf.space_to_depth to create a residual layer, and when you try to load your model, the Keras loader cannot find the tensorflow module inside that Lambda function. You can fix it by doing something like this:

import tensorflow as tf
from keras.models import load_model
...
# pass tf through custom_objects so the Lambda layer can resolve it at load time
model = load_model("yourModel.h5", custom_objects={"tf": tf})

Probably after fixing this error you will get more errors like it, because Keras will not find other custom objects, such as the custom loss function for the YOLO model.

Maybe this function can help you load the model without problems.
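
If the remaining errors are about the custom loss, one common Keras workaround (a minimal sketch, not taken from this repo) is to skip compilation, or to register a placeholder loss in custom_objects:

import tensorflow as tf
from keras.models import load_model

# compile=False skips restoring the optimizer and loss, which is enough for export/inference
model = load_model("yourModel.h5", custom_objects={"tf": tf}, compile=False)

# or supply a placeholder under whatever name the loss was saved with
# ("custom_loss" here is an assumed name, not taken from this repo)
model = load_model("yourModel.h5",
                   custom_objects={"tf": tf,
                                   "custom_loss": lambda y_true, y_pred: y_pred})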

ltotsas commented 5 years ago

@rodrigo2019 I am trying to export the model into the TensorFlow Serving format, and so far I keep getting errors. I managed to fix the error as you said, but afterwards more errors appear, exactly as you described. I changed the loss function to loss='mean_squared_error', but it still fails.

BTW, I am using your fork, but the function you mentioned still doesn't help: it returns an error saying the "y_pred" argument must be defined.

Is it possible to use a YOLO model trained with Keras from this script with TensorFlow Serving? Please, I really need help, as training YOLO and testing it with TensorFlow Serving is part of my thesis.

rodrigo2019 commented 5 years ago

Yes, you can run it directly in TensorFlow. I think "get_inference_model" will be interesting for you, because this function cuts out the Lambda layer, which is not useful for inference. So follow these steps:

1 - Load your model using the YOLO class

# assumes 'config' is the dict parsed from your config.json and YOLO is the
# model class from this repo's frontend module
yolo = YOLO(backend            = config['model']['backend'],
            input_size         = (config['model']['input_size_h'], config['model']['input_size_w']),
            labels             = config['model']['labels'],
            max_box_per_image  = config['model']['max_box_per_image'],
            anchors            = config['model']['anchors'],
            gray_mode          = config['model']['gray_mode'])

2 - Save the inference model

model = yolo.get_inference_model()
model.save("inference.h5")

3 - Load the inference model and convert it to TensorFlow using your script:

model = load_model("inference.h5", custom_objects={"tf": tf})
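
For reference, a minimal freezing sketch under the same Keras / TF 1.x setup (the output path and node handling here are assumptions, not what keras_to_tensorflow actually does):

import tensorflow as tf
from keras import backend as K
from keras.models import load_model

K.set_learning_phase(0)  # make sure the graph is built in inference mode
model = load_model("inference.h5", custom_objects={"tf": tf})

sess = K.get_session()
# fold the variables into constants so the whole graph fits in a single .pb file
frozen_graph = tf.graph_util.convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), [out.op.name for out in model.outputs])
tf.train.write_graph(frozen_graph, ".", "inference.pb", as_text=False)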

ltotsas commented 5 years ago

And it worked! @rodrigo2019 you are a god! I inspected the signature of the model and I got this:

  The given SavedModel SignatureDef contains the following input(s):
    inputs['input_image'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 416, 416, 3)
        name: input_1:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['YOLO_output/Reshape:0'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 13, 13, 5, 6)
        name: YOLO_output/Reshape:0
  Method name is: tensorflow/serving/predict

The problem is that TensorFlow Serving can't understand the signature, as it gives an error:

Error occurred: Error: 3 INVALID_ARGUMENT: input tensor alias not found in signature: examples. Inputs expected to be in the set {input_image}.

That's the only thing left: to infer an image and get the prediction score and bounding boxes. When I tried to predict something using the script, the bounding boxes were there, and of course the prediction score. Any ideas how to solve this? Thanks again for your help, I really appreciate it.

EDIT: to export the model I did this:

import tensorflow as tf
# The export path contains the name and the version of the model
tf.keras.backend.set_learning_phase(0) # Ignore dropout at inference
model = tf.keras.models.load_model('inference.h5', custom_objects={"tf":tf})
export_path = './raccoon/1'
# Fetch the Keras session and save the model
# The signature definition is defined by the input and output tensors
# And stored with the default serving key
with tf.keras.backend.get_session() as sess:
    tf.saved_model.simple_save(
        sess,
        export_path,
        inputs={'input_image': model.input},
        outputs={t.name:t for t in model.outputs})

rodrigo2019 commented 5 years ago

I don't know anything about TensorFlow Serving, but it looks like you're not setting the input alias to the "input_image" tensor. Maybe if you share your code I can help you.
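
For what it's worth, a minimal Python sketch of a predict request that sets the input alias explicitly (this assumes the model is also exposed over TensorFlow Serving's REST API on port 8501 and that the model name matches the export path, e.g. "raccoon"):

import json

import numpy as np
import requests

# dummy 4-D batch just to illustrate the expected shape: (batch, height, width, channels)
dummy = np.zeros((1, 416, 416, 3), dtype=np.float32)

payload = {
    "signature_name": "serving_default",
    # the key must match the alias chosen at export time: 'input_image'
    "inputs": {"input_image": dummy.tolist()},
}
response = requests.post("http://localhost:8501/v1/models/raccoon:predict",
                         data=json.dumps(payload))
print(response.json())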

ltotsas commented 5 years ago

I already changed it, but now I get

Error occurred: Error: 3 INVALID_ARGUMENT: input must be 4-dimensional[4] [[Node: Full_YOLO_backend/conv_1/Conv2D = Conv2D[T=DT_FLOAT, _output_shapes=[[?,416,416,32]], data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_input_1_0_0, Full_YOLO_backend/conv_1/Conv2D/ReadVariableOp)]]

and the code I run is in Node.js:

// Create a protobuf for Tensorflow Serving predict request
    // tslint:disable-next-line:no-any
    private buildPredictRequest(buffer: Array<any>): Object {
        console.log(buffer);
        const request = {
            model_spec: {
                name: this.modelName,
                signature_name: this.signatureName,
            },
            inputs: {
                input_image: {
                    dtype: 'DT_FLOAT',
                    tensor_shape: {
                        dim: {
                            size: buffer.length,
                        },
                    },
                    string_val: buffer,
                },
            },
        };
        return request;
    }

rodrigo2019 commented 5 years ago

Did you do the preprocessing normalization on the image, like this?

import cv2
import numpy as np

image = cv2.imread("image.jpg")
image = cv2.resize(image, (416, 416))
image = image / 255.0
image = image[np.newaxis]  # add the batch dimension

Reading the error, it looks like you forgot to create the fourth dimension by doing image = image[np.newaxis].

The image shape will change from [416, 416, 3] to [1, 416, 416, 3].

ltotsas commented 5 years ago

Whatever I tried, it was impossible to do what you said! I don't know why; probably Node.js can't do something like that. I also tried running it as a Python script and invoking it from Node.js, but I still get the same 4-dimensions error. A question about that: if I do the following in order to use it in Node, in theory it should then have 4 dimensions, right?

image = image[np.newaxis]
print(image)

Also, is it possible, using the predict script, to print to the console the class, the annotations and the score?

rodrigo2019 commented 5 years ago

(screenshot attachment)
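
A minimal check (a sketch; numpy assumed) showing that np.newaxis adds the leading batch dimension:

import numpy as np

image = np.zeros((416, 416, 3))
print(image.shape)         # (416, 416, 3)
image = image[np.newaxis]
print(image.shape)         # (1, 416, 416, 3)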

Also, is it possible, using the predict script, to print to the console the class, the annotations and the score?

Yes, it is. You should edit the script here.
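
A rough idea of what that edit could look like (a sketch only; it assumes the boxes returned by yolo.predict expose xmin/ymin/xmax/ymax plus get_label() and get_score(), as in this repo's utils module, and that labels come from the config):

boxes = yolo.predict(image)

for box in boxes:
    label = config['model']['labels'][box.get_label()]
    # box coordinates are normalized to [0, 1]; multiply by the image size for pixels
    print("%s %.2f  xmin=%.3f ymin=%.3f xmax=%.3f ymax=%.3f"
          % (label, box.get_score(), box.xmin, box.ymin, box.xmax, box.ymax))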

ltotsas commented 5 years ago

@rodrigo2019 thank you for all your help.

ltotsas commented 5 years ago

@rodrigo2019 Hi again. I am training on a custom dataset from scratch, but I am having some problems. As I told you, I am using your repo, and I am using CSV annotations for training.

For some reason, at the end of every epoch the mAP is zero for each of the labels.

Also, when I try to predict on a photo, it gives me

Cannot feed value of shape (35, 1024, 1, 1) for Tensor 'Placeholder_110:0', which has shape '(1, 1, 1024, 25)'

I also tried to calculate the anchors manually, but it runs forever.

Any ideas?

rodrigo2019 commented 5 years ago

How many classes and anchor boxes do you have? Sometimes I get a training run that yields zero mAP for around 5 hours; after that I start to get some values. Usually this happens on hard problems. It is hard to tell, but the problem looks like it is in your dataset.

ltotsas commented 5 years ago

OK! So it's not from the CSV dataset. This is my config; probably I have to disable early stopping.

{
  "model": {
    "backend": "Full Yolo",
    "input_size_w": 416,
    "input_size_h": 416,
    "gray_mode": false,
    "anchors": [
      0.57273,
      0.677385,
      1.87446,
      2.06253,
      3.33843,
      5.47434,
      7.88282,
      3.52778,
      9.77052,
      9.16828
    ],
    "max_box_per_image": 10,
    "labels": ["eva", "lazaros"]
  },
  "parser_annotation_type": "csv",
  "train": {
    "train_csv_file": "/Users/larry/Documents/Projects/thesis/detector-api/projects/final/annotations.csv",
    "train_csv_base_path": "/Users/larry/Documents/Projects/thesis/detector-api/projects/final/images/",
    "train_image_folder": "",
    "train_annot_folder": "",
    "train_times": 1,
    "pretrained_weights": "/Users/larry/Documents/Projects/thesis/detector-api/projects/final/larrougos.h5",
    "batch_size": 8,
    "learning_rate": 0.0001,
    "nb_epochs": 20,
    "warmup_epochs": 3,
    "workers": 12,
    "max_queue_size": 10,
    "early_stop": true,
    "tensorboard_log_dir": "/Users/larry/Documents/Projects/thesis/detector-api/projects/final/logs/",
    "object_scale": 5,
    "no_object_scale": 1,
    "coord_scale": 1,
    "class_scale": 1,
    "saved_weights_name": "/Users/larry/Documents/Projects/thesis/detector-api/projects/final/larrougos.h5",
    "debug": false
  },
  "valid": {
    "valid_csv_file": "",
    "valid_csv_base_path": "",
    "valid_image_folder": "",
    "valid_annot_folder": "",
    "valid_times": 1
  },
  "backup": {
    "create_backup": false,
    "redirect_model": true,
    "backup_path": "../backup",
    "backup_prefix": "Tiny_yolo_VOC"
  }
}

rodrigo2019 commented 5 years ago

You have 2 classes and 5 anchors, which results in an array of size 35, so it looks like you are feeding the correct size, but the network is expecting size 25. Maybe you have found a bug in the CSV parser; could you try using labels: []?
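
For context, the detection head's channel count follows the usual YOLOv2 layout of num_anchors * (4 box coordinates + 1 objectness + num_classes), which is where the 35 vs. 25 mismatch comes from. A quick check of that arithmetic (assuming that layout):

def yolo_output_channels(num_anchors, num_classes):
    # 4 box coordinates + 1 objectness score + one score per class
    return num_anchors * (4 + 1 + num_classes)

print(yolo_output_channels(5, 2))  # 35 -> two labels in the config
print(yolo_output_channels(5, 1))  # 30 -> the later error with a single label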

ltotsas commented 5 years ago

I think it was nothing more than that I stopped the training early; the second time I tried, it didn't give me the error. Of course the training is very poor: of 5 images I tried, it only recognised one, and badly.

The error definitely occurred when I had labels: []; afterwards I prefilled the labels with the two classes. But I don't think that's the problem, because I can see that the CSV parser finds the two classes.

rodrigo2019 commented 5 years ago

I do not know; I think you need to discover it empirically.

ltotsas commented 5 years ago

I think I've found a bug, as you mentioned yesterday. If my config has no labels during training, everything is fine with one class. If I then try to predict something with the same config, I get

Cannot feed value of shape (30, 1024, 1, 1) for Tensor 'Placeholder_110:0', which has shape '(1, 1, 1024, 25)'

The error disappears if I add the class to the labels array.

Could it be that this is why I always get 0 mAP, and that I also need to have the classes set for training?

rodrigo2019 commented 5 years ago

I think the problem is that you have a small dataset, and when the dataset is split into train and validation, the training set doesn't contain all the classes.

ltotsas commented 5 years ago

Right now my dataset is 230 images, and training uses 201 of them. Do I need more? Right now I have:

acc : 0.4
loss : 0.0644
val_acc : 0.4073
val_loss : 0.0062
mAP : 404 

I just started the training again, but this time I filled in the labels in the config rather than depending on the parser. Let's see.

YunYang1994 commented 5 years ago

Hope this helps you! https://github.com/YunYang1994/tensorflow-yolov3

ricardo-0x07 commented 5 years ago

@rodrigo2019 Where is the function get_inference_model?

rodrigo2019 commented 5 years ago

@ricardo-0x07 in my fork