QUESTION: Feature extraction, CNN input / output

IcyFrequency commented 6 years ago

Hi,

I'm wondering about a few things that I find hard to understand:

Before running predictions on an input image, the features of the image have to be extracted so the model can detect resemblances to the features of a certain class, right?

Where in the code are the features of an image extracted, and are those features the class probabilities of the 1000 classes from the ImageNet dataset?

Since I'm kind of new to CNNs:

How are the images inputted into the Neural Network for this implementation and how are the color channels manipulated with regard to this?
What is the raw output of the network and how is it converted to bounding boxes?
How is the annotated bounding box taken into consideration when training the Neural Network? Are those parts cropped out as entire images, or do the annotated boxes give class probabilities to the grid cells they overlap on the image?

Hope someone finds the time to answer these questions :))

And HUGE thanks in advance! Really appreciate the help here.

rodrigo2019 commented 6 years ago

Where in the code are the features of an image extracted, and are those features the class probabilities of the 1000 classes from the ImageNet dataset?

In backend.py, but you will not find the 1000 classes there, because in this implementation the model heads are not used.
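To make "the model heads are not used" a bit more concrete: a backbone without its classification head returns a spatial feature map rather than 1000 class scores, and a detection layer on top of it produces one prediction vector per grid cell and anchor box. The sketch below only does the shape arithmetic. The function name `detection_output_shape` and the numbers (416 px input, stride 32, 5 anchors, 20 classes) are illustrative assumptions, not values read from this repo.

```python
def detection_output_shape(input_size, downsample, nb_box, nb_class):
    """Shape of the final detection tensor for a square input image.

    All arguments are illustrative assumptions, not settings taken from the repo.
    """
    grid = input_size // downsample   # the feature map is a coarse grid over the image
    per_box = 4 + 1 + nb_class        # x, y, w, h + objectness + one score per class
    return (grid, grid, nb_box, per_box)


print(detection_output_shape(416, 32, 5, 20))  # (13, 13, 5, 25)
```

So the "features" are a grid of activations, and the class scores that come out at the end are for your own classes, not the ImageNet ones.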

How are the images inputted into the Neural Network for this implementation and how are the color channels manipulated with regard to this?

It depends on which backend you are using. You can check the manipulation in the normalize function of each backend model; these functions are also in the backend.py file.
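As a rough illustration of what such a normalize step can look like, here are two common scaling conventions plus a channel-order swap, written for a single pixel in plain Python. Which convention each backend actually uses is not shown here; the helper names `scale_unit`, `scale_signed` and `swap_channel_order` are made up for this sketch, so check backend.py for the real behaviour.

```python
def scale_unit(pixel):
    """Map 0..255 channel values to the range 0..1."""
    return tuple(c / 255.0 for c in pixel)


def scale_signed(pixel):
    """Map 0..255 channel values to the range -1..1."""
    return tuple((c / 255.0 - 0.5) * 2.0 for c in pixel)


def swap_channel_order(pixel):
    """Reverse the channel order (BGR -> RGB, or RGB -> BGR)."""
    return tuple(reversed(pixel))


bgr = (0, 255, 51)               # hypothetical pixel in BGR order
print(scale_unit(bgr))
print(scale_signed(bgr))
print(swap_channel_order(bgr))   # (51, 255, 0)
```

The important point is that the same normalization has to be applied at training time and at prediction time, otherwise the backbone sees inputs in a range it was not trained on.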

What is the raw output of the network and how is it converted to bounding boxes?

your answer is here
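For a quick picture of the conversion, this is a minimal sketch of the box decoding described in the YOLOv2 paper, for one anchor in one grid cell. It is not copied from this repo: `decode_box` and its argument layout are assumptions, anchors are taken to be in grid-cell units, and the class scores (a softmax over the remaining values) are left out.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def decode_box(raw, col, row, anchor, grid_w, grid_h):
    """Turn raw outputs (tx, ty, tw, th, to) into a box relative to the image.

    Returns (x_center, y_center, width, height, confidence), where the first
    four values are fractions of the image size.
    """
    tx, ty, tw, th, to = raw
    anchor_w, anchor_h = anchor
    x = (col + sigmoid(tx)) / grid_w      # sigmoid keeps the center inside its cell
    y = (row + sigmoid(ty)) / grid_h
    w = anchor_w * math.exp(tw) / grid_w  # the anchor is rescaled, never replaced
    h = anchor_h * math.exp(th) / grid_h
    return x, y, w, h, sigmoid(to)        # objectness squashed to 0..1


# All-zero raw outputs in the middle cell of a 13x13 grid
print(decode_box((0.0, 0.0, 0.0, 0.0, 0.0), 6, 6, (1.0, 2.0), 13, 13))
```

After decoding, boxes with a low confidence are dropped and overlapping ones are merged with non-maximum suppression.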

How is the annotated bounding box taken into consideration when training the Neural Network? Are those parts cropped out as entire images, or do the annotated boxes give class probabilities to the grid cells they overlap on the image?

The images are not cropped; the bounding boxes are computed from their coordinates. It may be better to check the paper: it uses a special loss function computed on the basis of "anchors".
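To sketch the idea: instead of cropping, each annotated box is assigned to the grid cell that contains its center and to the anchor whose shape matches it best, and the loss compares the prediction at that position with the annotated box. The helpers `shape_iou` and `assign_box` below are hypothetical, assume a square image and grid, and assume anchors given in grid-cell units; the repo's own code may differ in the details.

```python
def shape_iou(wh_a, wh_b):
    """IoU of two boxes that share the same center (only width/height matter)."""
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    union = wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter
    return inter / union


def assign_box(box, image_size, grid_size, anchors):
    """Map a pixel box (xmin, ymin, xmax, ymax) to (row, col, anchor index, target)."""
    xmin, ymin, xmax, ymax = box
    cell = image_size / grid_size             # pixels per grid cell
    cx = (xmin + xmax) / 2.0 / cell           # box center in grid units
    cy = (ymin + ymax) / 2.0 / cell
    w = (xmax - xmin) / cell                  # box size in grid units
    h = (ymax - ymin) / cell
    col, row = int(cx), int(cy)               # the cell that contains the center
    best = max(range(len(anchors)), key=lambda i: shape_iou((w, h), anchors[i]))
    return row, col, best, (cx, cy, w, h)


anchors = [(1.0, 1.0), (1.0, 2.0), (3.0, 3.0)]   # made-up anchors
print(assign_box((192, 192, 224, 256), 416, 13, anchors))
# (7, 6, 1, (6.5, 7.0, 1.0, 2.0))
```

Only that one cell and anchor are asked to predict the object; the other positions are trained to output a low confidence.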

IcyFrequency commented 6 years ago

@rodrigo2019

Thanks for your input. I'll leave this QUESTION open in case someone wants to go more in-depth, or until I can explain it myself after a deeper dive.

Thanks

experiencor / keras-yolo2

QUESTION: Feature extraction, CNN input / output #346