Closed: gemaizi closed this 4 years ago
The short answer: Just copy these numbers from your training script output - this is what I mean by "the shape of the last layer of your model (before Reshape)". So, in the case of 5 classes it should be kpu.set_outputs(task, 0, 7, 7, 50).
The extended answer: If I got 10 cents for every time the question about the meaning of the mysterious numbers in the detection layer of YOLO is brought up somewhere on the internet, I'd be pretty well off by now xD For some background on YOLO you can watch my video, which explains it in layman's terms, or read the article - the article also has a reference section, which includes other more in-depth materials: https://www.instructables.com/id/Object-Detection-With-Sipeed-MaiX-BoardsKendryte-K/ https://youtu.be/87c6dCgXeJo
To put it simply, the output of YOLO v2 is a tensor with shape (batch_size, grid_size, grid_size, number of boxes, predictions per box). Grid size is determined by the output of the feature extractor. For each grid cell 5 boxes are predicted (5 is the default) and each box produces the following predictions:
- parameters for the boundary box (x,y, w, h)
- box confidence score (objectness)
- class probabilities
So, before Reshape we have a tensor of shape (batch_size, grid_size, grid_size, 5 * (5 + number of classes)): 5 boxes per grid cell, each with 4 box parameters, 1 confidence score and one probability per class. In your case the number of classes is 5, so assuming you have 224x224 input it will come out as (batch_size, 7, 7, 50). Case solved!
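As a quick sanity check, the arithmetic above can be sketched in a few lines of Python. The helper name yolo_v2_output_shape is made up for illustration, and the stride of 32 (224 / 32 = 7) is an assumption about the feature extractor that is not stated in this thread, so the shape printed by your own training script remains the number to trust.

```python
def yolo_v2_output_shape(input_size, num_classes, num_boxes=5, stride=32):
    """Return (grid, grid, channels) of the last layer before Reshape.

    Hypothetical helper. stride=32 is an assumption about how much the
    feature extractor downsamples; verify it against your training output.
    """
    grid = input_size // stride
    # Each box predicts x, y, w, h and objectness (5 values),
    # plus one probability per class.
    channels = num_boxes * (5 + num_classes)
    return grid, grid, channels


print(yolo_v2_output_shape(224, 1))  # 1 class  -> (7, 7, 30)
print(yolo_v2_output_shape(224, 5))  # 5 classes -> (7, 7, 50)
```

The 1-class result matches the 7, 7, 30 in the original script, and the 5-class result gives the 7, 7, 50 used as the last three arguments of kpu.set_outputs in the answer.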
Very detailed explanation. After changing the numbers as you said, the model now works! Thank you very much!!
Okay, great!
I followed your GitHub instructions and finally trained a 5-class model with an mAP of 0.76. Then I flashed the maixpy.bin firmware that you posted in another question, and after that I flashed the model to the K210. When I run the script racoon_decetor.py, I get the error [MAIXPY]kpu: set_outputs arg value error: w,c,ch size not match output size. The line that raises the error is:
a = kpu.set_outputs(task, 0, 7,7,30) #the actual shape needs to match the last layer shape of your model(before Reshape)
Is the problem that my model has 5 classes? What do the numbers 0, 7, 7, 30 mean? Thanks!