qqwweee / keras-yolo3

A Keras implementation of YOLOv3 (Tensorflow backend)
MIT License

Getting features of three output layers of YOLO #529

Open Jichen66 opened 5 years ago

Jichen66 commented 5 years ago

Hi, I am working on a thesis about YOLO. I want to get the features from all three YOLO output layers when I feed images into the model. The shapes of y1, y2, and y3 should be (sample_num, 13, 13, num_anchors*(5 + num_classes)), (sample_num, 26, 26, num_anchors*(5 + num_classes)), and (sample_num, 52, 52, num_anchors*(5 + num_classes)) respectively, where num_anchors*(5 + num_classes) = 255 by default. However, after testing with one image I got the following output, which does not show the exact shapes of the output layers.
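As a quick sanity check on those numbers, here is a minimal sketch that computes the expected shapes. It assumes a 416x416 input, the usual YOLOv3 strides of 32, 16, and 8, three anchors per scale, and the 80 COCO classes; the function name `expected_output_shapes` is made up for illustration and is not part of the repository.

```python
def expected_output_shapes(input_size=416, num_classes=80, anchors_per_scale=3,
                           strides=(32, 16, 8), sample_num=1):
    """Return the expected (batch, grid, grid, channels) shape of each output.

    All defaults are assumptions: 416x416 input, COCO's 80 classes,
    3 anchors per scale, and downsampling strides of 32, 16, and 8.
    """
    if any(input_size % s for s in strides):
        raise ValueError("input_size must be a multiple of every stride")
    # Each anchor predicts 4 box values + 1 objectness score + class scores.
    channels = anchors_per_scale * (5 + num_classes)
    return [(sample_num, input_size // s, input_size // s, channels)
            for s in strides]


for name, shape in zip(("y1", "y2", "y3"), expected_output_shapes()):
    print(name, shape)
# y1 (1, 13, 13, 255)
# y2 (1, 26, 26, 255)
# y3 (1, 52, 52, 255)
```

Note that the 255 channels come from the three anchors assigned to each scale, not from all nine anchors in the anchors file.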

```
model_data/yolo.h5 model, anchors, and classes loaded.
(416, 416, 3)
Found 3 boxes for img
('dog 0.99', (128, 224), (314, 537))
('truck 0.91', (475, 85), (689, 170))
('bicycle 0.99', (162, 119), (565, 441))
4.62596893311

Tensor("input_1:0", shape=(?, ?, ?, 3), dtype=float32)
[<tf.Tensor 'conv2d_59/BiasAdd:0' shape=(?, ?, ?, 255) dtype=float32>, <tf.Tensor 'conv2d_67/BiasAdd:0' shape=(?, ?, ?, 255) dtype=float32>, <tf.Tensor 'conv2d_75/BiasAdd:0' shape=(?, ?, ?, 255) dtype=float32>]
```

I have been stuck on this problem for several days. Can anybody give some advice on how to solve it? Many thanks in advance!

My code is as follows:

```python
import numpy as np
import keras.backend as K
from keras.models import load_model
from keras.layers import Input
from yolo import YOLO
from PIL import Image
import tensorflow as tf

weights_path = '/home/jichen/keras-yolo3/model_data/yolo.h5'
image_path = '/home/jichen/Desktop/dog.jpg'
image = Image.open(image_path)
yolo_guo = YOLO()
# detect_image was modified to return self.yolo_model when is_tsne=True
Model = yolo_guo.detect_image(image, is_tsne=True)
sess = tf.Session()
print(Model.input)   # symbolic input tensor
print(Model.output)  # list of three symbolic output tensors
```

In the function `detect_image`, I want to return the YOLO model, so I modified the end of it slightly: `if is_tsne: return self.yolo_model`, `else: return image`.
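A likely reason the shapes print as `?` is that `Model.input` and `Model.output` are symbolic tensors: the network accepts any input size, so the grid dimensions are only known once real data is run through it. Below is a minimal sketch of getting concrete arrays through `predict`. It assumes the image has already been resized to a multiple of 32 (for example 416x416) and that pixel values are scaled to [0, 1], as the repository's `detect_image` does; `get_raw_outputs` is a hypothetical helper name, not part of the repository.

```python
import numpy as np


def get_raw_outputs(model, image_array):
    """Run one HxWx3 image through `model` and return concrete NumPy arrays.

    Assumes `model` is a multi-output Keras model (predict() returns a list)
    and that the image is already resized to a multiple of 32 pixels.
    """
    image_array = np.asarray(image_array)
    if image_array.ndim != 3 or image_array.shape[2] != 3:
        raise ValueError("expected an HxWx3 image array")
    # Scale to [0, 1] and add the batch dimension: (H, W, 3) -> (1, H, W, 3).
    batch = np.expand_dims(image_array.astype("float32") / 255.0, axis=0)
    outputs = model.predict(batch)
    return [np.asarray(o) for o in outputs]
```

With the repository's weights, this could be called as `get_raw_outputs(load_model('model_data/yolo.h5', compile=False), np.array(resized_image))`, after which `[o.shape for o in outputs]` should show the actual grid sizes instead of `?`.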

Kot-Nak8 commented 4 years ago

I may be interpreting your problem incorrectly, but I will report my results.

The code I've rewritten looks like this.

```python
def __init__(self, **kwargs):
    self.__dict__.update(self._defaults)  # set up default values
    self.__dict__.update(kwargs)  # and update with user overrides
    self.class_names = self._get_class()
    self.anchors = self._get_anchors()
    self.sess = K.get_session()
    # generate() now also returns the raw model outputs
    self.boxes, self.scores, self.classes, self.out_model = self.generate()

def generate(self):
    # .... .... ....
    return boxes, scores, classes, self.yolo_model.output

def detect_image(self, image):
    # .... .... ....
    out_boxes, out_scores, out_classes, out_model = self.sess.run(
        [self.boxes, self.scores, self.classes, self.out_model],
        feed_dict={
            self.yolo_model.input: image_data,
            self.input_image_shape: [image.size[1], image.size[0]],
            K.learning_phase(): 0
        })
    # .... .... ....
    return image, out_model
```

The code I ran is as follows

```python
import numpy as np
import keras.backend as K
from keras.models import load_model
from keras.layers import Input
from yolo import YOLO
from PIL import Image
import tensorflow as tf

image_path = '/Users/Users/Documents/keras-yolo3-master/test.png'
image = Image.open(image_path)
image_output, Model_output = YOLO().detect_image(image)
print(Model_output)
```

The results were as follows

```
Found 5 boxes for img
dog 1.00 (16, 204) (177, 487)
truck 0.97 (327, 78) (504, 152)
car 0.11 (325, 86) (510, 155)
bicycle 0.33 (5, 122) (137, 338)
bicycle 0.99 (0, 109) (426, 399)
[array([[[[-7.09539771e-01,  4.01784360e-01,  1.45798981e-01, ...,
          -5.06193352e+00, -5.41232395e+00, -5.28670883e+00],
         [ 1.30866611e+00,  1.45839393e+00, -2.95680016e-02, ...,
          -5.93569422e+00, -7.04240608e+00, -7.55034590e+00],
         [ 3.97958338e-01,  1.28708482e+00,  2.58324027e-01, ...,
          -7.73941612e+00, -8.17567921e+00, -8.91163921e+00],
         ...,
         [ 1.40693784e-01, -5.44680655e-01,  4.12246287e-01, ...,
          -8.56248188e+00, -8.92824364e+00, -1.04769955e+01],
         [-8.13400507e-01, -1.18537128e+00,  2.49891207e-02, ...,
          -4.21445274e+00, -6.39804935e+00, -7.25709629e+00],
         [ 3.22407544e-01, -5.17006218e-01,  4.55918819e-01, ...,
          -3.19745970e+00, -4.74484110e+00, -4.39262247e+00]]]], dtype=float32),
.... .... .... .... ....
```

I would like to get a feature map as well, but I don't know how to go about it from here. I don't know if my response is correct, but I hope it helps.
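One possible next step, sketched under assumptions: each raw output array can be reshaped so that the 255 channels are split per anchor into box offsets, size terms, objectness, and class scores. This mirrors the channel layout that the repository's `yolo_head` reshapes to, but it stops before applying the sigmoid and exponential, so the values are still raw logits. The helper name `split_raw_output` and the defaults (3 anchors per scale, 80 classes) are illustrative assumptions.

```python
import numpy as np


def split_raw_output(raw, num_anchors=3, num_classes=80):
    """Split one raw output of shape (batch, grid_h, grid_w, channels) per anchor.

    Returns undecoded values: no sigmoid/exp applied and no anchors used.
    """
    raw = np.asarray(raw)
    batch, grid_h, grid_w, channels = raw.shape
    if channels != num_anchors * (5 + num_classes):
        raise ValueError("channel count does not match num_anchors*(5+num_classes)")
    # (batch, gh, gw, 255) -> (batch, gh, gw, 3, 85) with the defaults.
    cells = raw.reshape(batch, grid_h, grid_w, num_anchors, 5 + num_classes)
    return {
        "xy": cells[..., 0:2],            # raw box-centre offsets
        "wh": cells[..., 2:4],            # raw width/height terms
        "objectness": cells[..., 4:5],    # raw objectness logit
        "class_logits": cells[..., 5:],   # raw per-class logits
    }


# Stand-in for one element of the returned list of arrays (13x13 scale).
dummy = np.zeros((1, 13, 13, 255), dtype="float32")
parts = split_raw_output(dummy)
print({name: value.shape for name, value in parts.items()})
```

If the goal is a plain feature vector per image (for t-SNE, for instance), flattening each array with `raw.reshape(raw.shape[0], -1)` may already be enough.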