AIWintermuteAI / aXeleRate

Keras-based framework for AI on the Edge
MIT License
176 stars 71 forks

[MAIXPY]kpu: set_outputs arg value error: w,c,ch size not match output size #10

Closed gemaizi closed 4 years ago

gemaizi commented 4 years ago

I followed the steps in your GitHub repository and trained a 5-class model with an mAP of 0.76. I then flashed the maixpy.bin firmware you posted in another issue, and after that flashed the model to the K210. When I run the script racoon_decetor.py, I get the error [MAIXPY]kpu: set_outputs arg value error: w,c,ch size not match output size. The failing line is:

a = kpu.set_outputs(task, 0, 7,7,30) #the actual shape needs to match the last layer shape of your model(before Reshape)

Is the problem that my model has 5 classes? What do the numbers 0, 7, 7, 30 mean? Thanks!

AIWintermuteAI commented 4 years ago

The short answer:

[Image: Capture]

Just copy these numbers from your training script output. This is what I mean by "the shape of the last layer of your model (before Reshape)". So, with 5 classes it should be kpu.set_outputs(task, 0, 7,7,50).

The extended answer: If I got 10 cents for every time the question about the meaning of the mysterious numbers in the detection layer of YOLO came up somewhere on the internet, I'd be pretty well off by now xD For some background on YOLO you can watch my video, which explains it in layman's terms, or read the article. The article also has a reference section, which includes other more in-depth materials.

https://www.instructables.com/id/Object-Detection-With-Sipeed-MaiX-BoardsKendryte-K/
https://youtu.be/87c6dCgXeJo

To put it simply, the output of YOLO v2 is a tensor with shape (batch_size, grid_size, grid_size, number of boxes, predictions for boxes). The grid size is determined by the output of the feature extractor. For each grid cell, 5 boxes are predicted (5 is the default), and each box produces the following predictions:

  • parameters for the bounding box (x, y, w, h)
  • box confidence score (objectness)
  • class probabilities

So, before the reshape we have a tensor of shape (batch_size, grid_size, grid_size, 5 * (5 + number of classes)). In your case the number of classes is 5, so assuming you have a 224*224 input it will come out as (batch_size, 7, 7, 50). Case solved!

gemaizi commented 4 years ago

Very detailed explanation. After changing the number as you said, the model now works! Thank you very much!!

AIWintermuteAI commented 4 years ago

Okay, great!
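
For anyone hitting the same error later, the shape rule discussed in this thread can be sketched in plain Python. This is only an illustration: the helper name yolo_v2_output_shape is hypothetical, and the downsampling factor of 32 is an assumption inferred from the 224 input giving a 7x7 grid in this thread. Your feature extractor may differ, so the numbers printed by your training script remain the source of truth.

```python
def yolo_v2_output_shape(input_size, num_classes, num_boxes=5, stride=32):
    """Return (grid_w, grid_h, channels) of the last layer before Reshape.

    Hypothetical helper for illustration only.
    stride=32 is an assumption (224 / 7 in this thread); check your own model.
    """
    if input_size % stride != 0:
        raise ValueError("input_size must be a multiple of stride")

    # Grid size is set by how much the feature extractor downsamples the input.
    grid = input_size // stride

    # Each box predicts 4 box parameters + 1 objectness score + class probabilities.
    channels = num_boxes * (4 + 1 + num_classes)

    return grid, grid, channels


# 5-class model from this thread, 224x224 input
print(yolo_v2_output_shape(224, 5))  # (7, 7, 50)

# a single-class model gives 30 channels, matching the 7,7,30 in the original script line
print(yolo_v2_output_shape(224, 1))  # (7, 7, 30)
```

On the device, these three values are what goes into the last three arguments of the call from this thread, e.g. kpu.set_outputs(task, 0, 7, 7, 50) for the 5-class model.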