thtrieu / darkflow

Translate darknet to tensorflow. Load trained weights, retrain/fine-tune using tensorflow, export constant graph def to mobile devices
GNU General Public License v3.0
6.14k stars 2.08k forks

About Data.py 11 tensor #3

Closed KentChun33333 closed 7 years ago

KentChun33333 commented 7 years ago

Thanks for this awesome work! Just a little bit confused about the 11 tensors in data.py

```python
new = [[probs], [confs1], [confs2],
       [coord], [upleft], [botright], [proid],
       [conid1], [conid2], [cooid1], [cooid2]]
```

- `probs` seems to represent Pr(class | obj)
- `confs1` seems to represent the confidence in box 1 -- Pr(obj)
- `confs2` seems to represent the confidence in box 2 -- Pr(obj)

According to the original paper, the rest of the items seem to represent [x, y, w, h] for box 1 and box 2. However, I am having a hard time figuring it out.

Could you give me some hints? Thank you!

thtrieu commented 7 years ago

@KentChun33333 I am amazed that you actually took the time to analyze my messy code! You are right about the first three. As for the next two:

- `upleft` holds the upper-left corner coordinates of the bounding boxes
- `botright` holds the bottom-right corner coordinates of the bounding boxes

So far, `probs`, `confs1`, `confs2`, `upleft`, `botright` constitute the regression target. Why, then, do we need the `___id` tensors?

You know from the paper that only the grid cells responsible for a correct prediction are penalized (by an L2 loss), so not all entries in the above tensors should take part in the loss calculation. Furthermore, according to the paper, the coordinate terms in the loss should be weighted more heavily than the other terms, and of the two boxes each grid cell predicts, the one with the better IOU should be weighted differently than the other.

These `__id` tensors are meant to resolve these complications. They act as weights and are set to appropriate values either in data.py (as numpy tensors, during the batch-generation phase) or in tfnet.py (as tensorflow tensors, during the loss-calculation phase). For example, if an entry should not affect the loss, its corresponding weight is set to zero; if an entry corresponds to the coordinate loss, its weight is set to 5.0; and so on.
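A minimal numpy sketch of the idea, with hypothetical shapes and a made-up "responsible cell" index (none of these names come from data.py itself): zero weights silence entries that should not affect the loss, while the paper's lambda_coord = 5.0 boosts the coordinate terms.

```python
import numpy as np

# Hypothetical dimensions: S*S grid cells, C classes (YOLO on VOC uses S=7, C=20).
S, C = 7, 20
cells = S * S

# Suppose only cell 24 contains an object, of class 3.
probs = np.zeros((cells, C))
probs[24, 3] = 1.0

# proid weights the class-probability loss: 1.0 for the responsible
# cell, 0.0 everywhere else, so other cells contribute nothing.
proid = np.zeros((cells, C))
proid[24, :] = 1.0

# cooid-style weights for the coordinate loss: the paper's
# lambda_coord = 5.0, applied only to the responsible cell.
coord_scale = 5.0
cooid1 = np.zeros((cells, 4))
cooid1[24, :] = coord_scale

# An entry with zero weight drops out of the squared-error sum:
pred = np.random.rand(cells, C)
loss_probs = np.sum(proid * (pred - probs) ** 2)  # only cell 24 contributes
```

The same masking trick extends to the confidence and coordinate terms; only the weight values differ.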

- `proid` weights `probs`; its value is set in data.py
- `conid1` weights `confs1`
- `conid2` weights `confs2`
- `cooid1` weights the coordinates of box 1
- `cooid2` weights the coordinates of box 2

The values of `conid1`, `conid2`, `cooid1`, `cooid2` are initialised in data.py and set to their correct values in tfnet.py. Why? Because their correct values are only known once the IOU of each predicted box with the target has been calculated, i.e. the forward pass must be done first.
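To make the "set after the forward pass" step concrete, here is a hedged numpy sketch (not the tfnet.py code, which does this with tensorflow ops): IOU is computed from upper-left / bottom-right corners, matching the `upleft` / `botright` representation, and the box with the higher IOU is picked as the responsible one.

```python
import numpy as np

# IOU of two boxes given upper-left and bottom-right corners,
# i.e. the same corner representation as upleft / botright.
def iou(ul_a, br_a, ul_b, br_b):
    inter_ul = np.maximum(ul_a, ul_b)          # intersection upper-left
    inter_br = np.minimum(br_a, br_b)          # intersection bottom-right
    wh = np.clip(inter_br - inter_ul, 0.0, None)
    inter = wh[0] * wh[1]
    area_a = np.prod(br_a - ul_a)
    area_b = np.prod(br_b - ul_b)
    return inter / (area_a + area_b - inter)

# Two predicted boxes vs. one target box in a cell (made-up coordinates):
target_ul, target_br = np.array([1.0, 1.0]), np.array([3.0, 3.0])
iou1 = iou(np.array([1.0, 1.0]), np.array([3.0, 3.0]), target_ul, target_br)
iou2 = iou(np.array([2.0, 2.0]), np.array([4.0, 4.0]), target_ul, target_br)

# The box with the higher IOU is "responsible": its conid / cooid
# weights are kept, the other box's are zeroed (or down-weighted).
best_box = 1 if iou1 >= iou2 else 2
```

Since the IOUs depend on the network's predictions, this comparison can only happen after the forward pass, which is exactly why the `__id` weights are finalized in tfnet.py rather than in data.py.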

I hope you may come up with a better idea than mine so that the loss calculation becomes less of a pain. Cheers.

KentChun33333 commented 7 years ago

@thtrieu

Thank you for the reply.

The explanation is as awesome as the work itself and completely clears up my questions and misunderstandings! Designing tensors that act as weights to build a selective loss function is really brilliant. I wish I had understood this amazing design from the start ...

Thank you so ~ much : )