Hi,
I used VoTT to export a TFRecord file. After some searching, I learned that the TFRecord file contains a key "image/encoded". I read the TFRecord file and get the corresponding value with the following code:
import numpy as np
import tensorflow as tf

# raw_dataset is assumed to be a tf.data.TFRecordDataset built from the VoTT export.
image_feature_description = {
    "image/encoded": tf.io.FixedLenFeature([], tf.string),  # image data
}

def _parse_image_function(example_proto):
    # Parse the input tf.Example proto using the dictionary above.
    return tf.io.parse_single_example(example_proto, image_feature_description)

raw_dataset = raw_dataset.map(_parse_image_function)
for item in raw_dataset:
    # Interpret the stored bytes as uint8 values and print how many there are.
    print(np.frombuffer(item['image/encoded'].numpy(), dtype=np.uint8).flatten().shape)
It has the following output:
(10187,)
My input images are 320x240 RGB, so I expected the data to reshape to (320, 240, 3), that is 320 * 240 * 3 = 230400 values. The output (10187,) is far too small for that.
I looked at the export code in https://github.com/microsoft/VoTT/blob/master/src/providers/export/tensorFlowRecords.ts. Line 61 reads: const imageBuffer = new Uint8Array(arrayBuffer);, which suggests that the image data is stored directly in "image/encoded". I can't figure out what is wrong with my code. Could you explain how to read the value stored under the "image/encoded" key?
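One check I can still do on my side is to look at the first bytes of the value. This is only a minimal sketch, and it rests on an assumption I have not confirmed: that the value holds the bytes of an image file (for example JPEG or PNG) rather than raw pixels, which would explain why it is so much shorter than 230400. The names guess_image_format and EXPECTED_RAW_BYTES are my own hypothetical helpers, not part of VoTT or TensorFlow.

```python
# Hypothetical helper for inspecting the bytes stored under "image/encoded".
# Assumption: the value may be an encoded image file, not raw RGB pixels.

EXPECTED_RAW_BYTES = 320 * 240 * 3  # size of one uncompressed 320x240 RGB image


def guess_image_format(data: bytes) -> str:
    """Guess the image file format from the leading magic bytes."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if data.startswith(b"BM"):
        return "bmp"
    return "unknown"


def describe_buffer(data: bytes) -> str:
    """Summarize whether the buffer looks like a file or like raw pixels."""
    fmt = guess_image_format(data)
    if fmt != "unknown":
        return f"{len(data)} bytes, looks like an encoded {fmt} file"
    if len(data) == EXPECTED_RAW_BYTES:
        return f"{len(data)} bytes, matches raw 320x240 RGB pixels"
    return f"{len(data)} bytes, format not recognized"
```

In my loop this would be called as describe_buffer(item['image/encoded'].numpy()). If it reports an encoded file, then I suppose tf.io.decode_image(item['image/encoded']) is the right way to get the pixels, and for a JPEG or PNG it should return a tensor shaped (height, width, channels).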
Have a nice day!