patlevin / tfjs-to-tf

A TensorFlow.js Graph Model Converter
MIT License

Tensor shape info mismatch in SignatureDef when model is missing meta data #15

Closed patlevin closed 4 years ago

patlevin commented 4 years ago

This is great; I really appreciate the quick fix! As you said, it now pulls TF 2.3 and performs the conversion.

However, while testing it, I noticed that the tensor shapes differ from those produced by my workaround method.

SavedModel via workaround:

$ saved_model_cli show --dir /posenet/saved_model_workaround --all

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['sub_2'] tensor_info:
        dtype: DT_FLOAT
        shape: (1, -1, -1, 3)
        name: sub_2:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['float_heatmaps'] tensor_info:
        dtype: DT_FLOAT
        shape: (1, -1, -1, 17)
        name: float_heatmaps:0
    outputs['float_short_offsets'] tensor_info:
        dtype: DT_FLOAT
        shape: (1, -1, -1, 34)
        name: float_short_offsets:0
  Method name is: tensorflow/serving/predict

SavedModel via tfjs-graph-converter:

$ saved_model_cli show --dir /posenet/saved_model_converter --all

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['sub_2:0'] tensor_info:
        dtype: DT_FLOAT
        shape: (1, -1, -1, 3)
        name: sub_2:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['float_heatmaps:0'] tensor_info:
        dtype: DT_FLOAT
        shape: (17)
        name: Const_114:0
    outputs['float_short_offsets:0'] tensor_info:
        dtype: DT_FLOAT
        shape: (34)
        name: Const_112:0
    outputs['resnet_v1_50/displacement_bwd_2/BiasAdd:0'] tensor_info:
        dtype: DT_FLOAT
        shape: (32)
        name: Const_110:0
    outputs['resnet_v1_50/displacement_fwd_2/BiasAdd:0'] tensor_info:
        dtype: DT_FLOAT
        shape: (32)
        name: Const:0
  Method name is: tensorflow/serving/predict
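The same SignatureDef information that `saved_model_cli` prints above can also be read programmatically from the SavedModel proto. Here is a minimal sketch, using a hypothetical stand-in model (with the 17/34 channel counts from the issue) rather than the actual PoseNet graph:

```python
import tempfile

import tensorflow as tf
from tensorflow.python.tools import saved_model_utils


class Demo(tf.Module):
    # Stand-in for the real model: two heads with the channel counts
    # from the issue (17 heatmap channels, 34 offset channels).
    @tf.function(input_signature=[tf.TensorSpec([1, None, None, 3], tf.float32)])
    def serve(self, x):
        return {
            "float_heatmaps": tf.zeros([1, 45, 80, 17]),
            "float_short_offsets": tf.zeros([1, 45, 80, 34]),
        }


export_dir = tempfile.mkdtemp()
model = Demo()
tf.saved_model.save(model, export_dir, signatures={"serving_default": model.serve})

# Read the SignatureDef straight from the SavedModel proto,
# the same data saved_model_cli formats for display.
meta_graph = saved_model_utils.get_meta_graph_def(export_dir, "serve")
signature = meta_graph.signature_def["serving_default"]
shapes = {
    name: [dim.size for dim in info.tensor_shape.dim]
    for name, info in signature.outputs.items()
}
print(shapes)
```

Comparing such a shape dictionary between the two exports makes the mismatch easy to catch in an automated test.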

For example, when I perform inference on a sample 1280x720-pixel image using PoseNet with stride 16, the response from the model produced by tfjs-graph-converter is quite different: a significant part of the information is missing, so it is not possible to determine the positions of the keypoints.

| Model | tf_result_proto.ByteSize() | float_heatmaps.shape | float_short_offsets.shape |
|---|---|---|---|
| saved_model_workaround | 734533 | (1, 45, 80, 17) | (1, 45, 80, 34) |
| saved_model_converter | 704 | (17) | (34) |

I don't know whether this only affects PoseNet models or is a broader issue. Just wanted to bring it to your attention.

Originally posted by @glenvorel in https://github.com/patlevin/tfjs-to-tf/issues/13#issuecomment-666035809

patlevin commented 4 years ago

@glenvorel There are two problems with this model. The first is that the model is missing any metadata, so the converter has to guess the signature.

In this particular case, guessing the signature failed because of the dynamic dimensions of the output, which the algorithm failed to detect. This is an easy fix, though, and not much of an issue.

The second problem is a consequence of the first: since the metadata is missing, the converter cannot know which parts of the output you're interested in. The model actually has four outputs, only two of which are properly named.

Just for completeness' sake, the "human-readable" names of the four outputs are:

| Output Node | Actual Name |
|---|---|
| float_heatmaps | heatmaps |
| float_short_offsets | offsets |
| resnet_v1_50/displacement_bwd_2/BiasAdd | displacement_backward |
| resnet_v1_50/displacement_fwd_2/BiasAdd | displacement_forward |

Side note: I've been thinking about the ability to rename nodes for a while now, and this would be a great use case for it, IMHO. To keep the CLI utility as simple as possible, though, both output selection and node renaming would be limited to the API.
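Since the renaming API is only a proposal at this point, here is a hedged sketch of one generic way to do it in plain TensorFlow (not the converter's actual API): wrap the original serving function and remap the output keys using the table above. The `Raw` module below is a hypothetical stand-in for the loaded model:

```python
import tensorflow as tf

# Name mapping from the table in this comment.
RENAMES = {
    "float_heatmaps": "heatmaps",
    "float_short_offsets": "offsets",
    "resnet_v1_50/displacement_bwd_2/BiasAdd": "displacement_backward",
    "resnet_v1_50/displacement_fwd_2/BiasAdd": "displacement_forward",
}


class Raw(tf.Module):
    # Hypothetical stand-in for the converted model's serving function,
    # with the four output nodes and channel counts from the issue.
    @tf.function(input_signature=[tf.TensorSpec([1, None, None, 3], tf.float32)])
    def serve(self, x):
        return {
            "float_heatmaps": tf.zeros([1, 45, 80, 17]),
            "float_short_offsets": tf.zeros([1, 45, 80, 34]),
            "resnet_v1_50/displacement_bwd_2/BiasAdd": tf.zeros([1, 45, 80, 32]),
            "resnet_v1_50/displacement_fwd_2/BiasAdd": tf.zeros([1, 45, 80, 32]),
        }


def rename_outputs(serve_fn, renames):
    """Wrap a dict-returning serving function, renaming its output keys."""

    @tf.function(input_signature=serve_fn.input_signature)
    def renamed(x):
        return {renames.get(k, k): v for k, v in serve_fn(x).items()}

    return renamed


raw = Raw()
renamed = rename_outputs(raw.serve, RENAMES)
out = renamed(tf.zeros([1, 720, 1280, 3]))
print(sorted(out))
```

The wrapped function could then be re-exported with `tf.saved_model.save(..., signatures={"serving_default": renamed})` so the friendlier names end up in the SignatureDef.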

Nevertheless, I'll drop a new release later today that fixes the output sizes and names.

glenvorel commented 4 years ago

Hi, I just tested v1.1.2 and it works very well. Thank you for all the work!

I completely agree with the idea of being able to select output nodes, which I see you are working on for v1.2.0. Some models are saved with very rich SignatureDefs, and when they are served by TF Serving, the responses can be enormous. For example, some of the new models from the TF2 Detection Model Zoo produce a significantly larger response than the image data on the input! Ironically, this shifts the inference bottleneck from the GPU to network throughput.
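The output-selection idea can be sketched in the same spirit as renaming: wrap the serving function and keep only the outputs a client needs, so the serialized response stays small. Everything below (`WANTED`, the `Detector` stand-in and its tensor sizes) is illustrative, not taken from any real Model Zoo export:

```python
import tensorflow as tf

# Hypothetical: the only outputs this client cares about.
WANTED = {"detection_boxes", "detection_scores"}


class Detector(tf.Module):
    # Illustrative stand-in for a detection model with a rich output dict;
    # the raw_* tensors are the bulky part of the serving response.
    @tf.function(input_signature=[tf.TensorSpec([1, None, None, 3], tf.uint8)])
    def serve(self, images):
        return {
            "detection_boxes": tf.zeros([1, 100, 4]),
            "detection_scores": tf.zeros([1, 100]),
            "detection_classes": tf.zeros([1, 100]),
            "raw_detection_boxes": tf.zeros([1, 51150, 4]),
            "raw_detection_scores": tf.zeros([1, 51150, 91]),
        }


detector = Detector()


@tf.function(input_signature=detector.serve.input_signature)
def slim_serve(images):
    # Keep only the wanted outputs; everything else never leaves the server.
    return {k: v for k, v in detector.serve(images).items() if k in WANTED}


out = slim_serve(tf.zeros([1, 640, 640, 3], tf.uint8))
print(sorted(out))
```

Re-exporting `slim_serve` as the `serving_default` signature would shrink the TF Serving response to just the selected tensors.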