zoecarver opened 5 years ago
I decided to only check at inference time. You might edit predict.py: yolo.predict() is called, and from the returned boxes you just multiply by the image size to convert from relative positions to pixel locations. For the 'true' coordinates, I read them directly from the annotation file.
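A minimal sketch of the conversion described above. The box format (xmin, ymin, xmax, ymax, each in [0, 1]) is an assumption about what yolo.predict() returns, not something confirmed in this thread, so check it against your version of the code:

```python
def to_pixels(box, image_w, image_h):
    """Scale a relative box (values assumed in [0, 1]) to pixel coordinates.

    `box` is assumed to be (xmin, ymin, xmax, ymax); verify against the
    actual BoundBox/return format in your copy of predict.py.
    """
    xmin = round(box[0] * image_w)
    ymin = round(box[1] * image_h)
    xmax = round(box[2] * image_w)
    ymax = round(box[3] * image_h)
    return xmin, ymin, xmax, ymax
```

These pixel boxes can then be compared directly against the xmin/ymin/xmax/ymax values read from the annotation file.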
I too am confused by the encoding. It appears to be incorrect with respect to the original paper. However, I don't understand the implementation well enough to say that with much confidence :| Plus, the fact that this network works suggests that something is happening that I don't understand.
@robertlugg @zoecarver I'm now struggling with the exact same thing; has anyone figured it out yet? @experiencor Can you explain the encoding and decoding? Whenever I put y_batch into the netout decoder, my boxes are jacked up. Another question: the network output should be the same as y_batch (from the batch generator), right? And if not, what values should y_batch contain (xmin, ymin, xmax, ymax)?
I am wondering how to display the target data. The BatchGenerator does a lot of preprocessing to the data, so when I try to load an element from y_batch, the boxes are all messed up. My guess is that this is because of how it encodes the boxes (specifically these lines). Training still seems to work even though the target appears to be off; why is this? Also, what is the best way to undo the encoding?
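A hedged sketch of how y_batch might be decoded back to pixel boxes. It assumes the generator stores each assigned box as [center_x, center_y, w, h, confidence, ...] with centers and sizes expressed in grid-cell units, which is how several YOLOv2 Keras ports encode targets; the shape (grid_h, grid_w, nb_box, 4 + 1 + nb_classes) and the grid size are also assumptions, so verify them against BatchGenerator before relying on this:

```python
import numpy as np

def decode_y_batch(y_item, image_w, image_h, grid_w=13, grid_h=13):
    """Decode one element of y_batch into pixel-space corner boxes.

    Assumed layout (verify against your BatchGenerator): y_item has shape
    (grid_h, grid_w, nb_box, 4 + 1 + nb_classes), with each assigned slot
    holding [center_x, center_y, w, h, conf, ...] in grid-cell units.
    """
    boxes = []
    for row in range(grid_h):
        for col in range(grid_w):
            for b in range(y_item.shape[2]):
                cx, cy, w, h, conf = y_item[row, col, b, :5]
                if conf == 0:
                    continue  # no object assigned to this anchor slot
                # grid units -> relative [0, 1] -> pixels
                xmin = (cx - w / 2) / grid_w * image_w
                xmax = (cx + w / 2) / grid_w * image_w
                ymin = (cy - h / 2) / grid_h * image_h
                ymax = (cy + h / 2) / grid_h * image_h
                boxes.append((xmin, ymin, xmax, ymax))
    return boxes
```

If drawing these boxes over the (resized, possibly augmented) training image still gives garbage, the stored units are probably different from the assumption here, which would explain why plotting raw y_batch values looks wrong even though training works: the loss decodes the targets consistently with how they were encoded, so it never needs pixel coordinates.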