hugozanini opened this issue 1 year ago (status: Open)
The non_max_suppression function should return far fewer values; after that, use the scale_coords function to translate the coordinates from the model input shape (640x640, 416x416, etc.) back to your original image size.
You can read the predict.py or detect.py scripts and adapt the code to your needs, since they contain a complete prediction pipeline.
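For reference, the coordinate translation that scale_coords performs can be sketched as a minimal standalone function, assuming the standard letterbox preprocessing (uniform resize plus centered padding). The name `scale_box` is illustrative here, not the repo's actual API:

```python
def scale_box(box, model_size, orig_size):
    """Map an (x1, y1, x2, y2) box from the padded model input
    (e.g. 640x640) back to the original image size.
    Assumes letterbox preprocessing: uniform resize + centered padding.
    Illustrative sketch, not the YOLOv5 scale_coords implementation."""
    mw, mh = model_size           # model input width/height, e.g. (640, 640)
    ow, oh = orig_size            # original image width/height
    gain = min(mw / ow, mh / oh)  # resize ratio used by letterbox
    pad_x = (mw - ow * gain) / 2  # horizontal padding added
    pad_y = (mh - oh * gain) / 2  # vertical padding added
    x1, y1, x2, y2 = box
    return ((x1 - pad_x) / gain, (y1 - pad_y) / gain,
            (x2 - pad_x) / gain, (y2 - pad_y) / gain)

# Example: a 1280x720 image letterboxed into a 640x640 model input
print(scale_box((100, 250, 300, 400), (640, 640), (1280, 720)))
# → (200.0, 220.0, 600.0, 520.0)
```

The padding is subtracted first because letterbox centers the resized image inside the square input, so the box must be shifted before the resize gain is undone.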
I converted the YOLO-tiny model to TensorFlow.js, but I'm not able to interpret the outputs.
When I run an image through the model, I get a response of shape [1, 25200, 85]. Iterating over the 25200 rows, my understanding is that the first 4 items are the bounding-box coordinates of the detection, the fifth item is the detection confidence, and the next 80 values indicate the confidence for each class, as in the example below.

I read the code in export.py and utils/general.py to try to understand the non_max_suppression logic and how to interpret the predictions, but I didn't get it. I tried strategies such as denormalizing the values using the original image shape and treating the coordinates as the center of the box, but none of these worked. Is there any documentation I can refer to in order to interpret the predictions properly?
Output example:
Array 0/25200: [ 3.660677909851074, 3.8960976600646973, 7.1445159912109375, 8.558195114135742, 0.000002291132886966807, 0.2992437183856964, 0.0032765434589236975, 0.02299974113702774, 0.002288553863763809, 0.0061205169185996056, 0.000405748636694625, 0.0007168060401454568, 0.006684356834739447, 0.010973624885082245, 0.010580179281532764, 0.0013355360133573413, 0.0024683668743819, 0.0015576096484437585, 0.016338417306542397, 0.06432975828647614, 0.002155845519155264, 0.002606399590149522, 0.008280608803033829, 0.024560092017054558, 0.011779602617025375, 0.008507341146469116, 0.0006727887666784227, 0.010439596138894558, 0.009805492125451565, 0.014551358297467232, 0.00901725422590971, 0.010406507179141045, 0.006617129780352116, 0.0035439676139503717, 0.005152086261659861, 0.020896468311548233, 0.006204261444509029, 0.04126130789518356, 0.027140766382217407, 0.003251225920394063, 0.0019718394614756107, 0.007059866562485695, 0.028940090909600258, 0.005898833740502596, 0.01423275750130415, 0.007057651877403259, 0.03938567265868187, 0.01166496705263853, 0.010686900466680527, 0.005906108301132917, 0.005354586057364941, 0.003930031321942806, 0.005226451903581619, 0.0004987830179743469, 0.007237072102725506, 0.01963111199438572, 0.006294747814536095, 0.0008835819317027926, 0.0004639460239559412, 0.0038057370111346245, 0.0016457928577437997, 0.0632367804646492, 0.0031223613768815994, 0.012071071192622185, 0.0007920170319266617, 0.0067767915315926075, 0.007115103304386139, 0.002724584424868226, 0.0012104857014492154, 0.001585118006914854, 0.0028675436042249203, 0.001451255171559751, 0.0055689564906060696, 0.0007458814070560038, 0.0007105154800228775, 0.000056244785810122266, 0.010288779623806477, 0.002680464880540967, 0.013829641975462437, 0.007938055321574211, 0.007399112917482853, 0.0017575552919879556, 0.0013826033100485802, 0.0002145568432752043, 0.0031385323964059353 ]
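For anyone else decoding this layout: assuming YOLOv5's convention that each of the 25200 rows is [cx, cy, w, h, objectness, 80 class scores] and that the final per-class score is objectness times the class probability, the post-processing can be sketched in NumPy. `decode_predictions` and the 0.25 threshold are illustrative, and the real non_max_suppression additionally runs per-class NMS on what survives:

```python
import numpy as np

def decode_predictions(preds, conf_thres=0.25):
    """Decode raw YOLO output rows of shape (N, 5 + num_classes).
    Each row: [cx, cy, w, h, objectness, class scores...].
    Returns rows (x1, y1, x2, y2, score, class_id) above the threshold.
    Minimal sketch; a full pipeline would also apply NMS afterwards."""
    obj = preds[:, 4:5]                  # objectness, kept 2-D for broadcasting
    cls = preds[:, 5:]                   # per-class probabilities
    scores = obj * cls                   # final confidence = objectness * class prob
    class_ids = scores.argmax(axis=1)    # best class per row
    best = scores[np.arange(len(preds)), class_ids]
    keep = best > conf_thres             # discard low-confidence rows
    cx, cy, w, h = preds[keep, :4].T     # center format -> corner format
    return np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2,
                     best[keep], class_ids[keep]], axis=1)

# The first row in the dump above has objectness ~2.3e-6, so even its best
# class score (~0.299) yields ~6.9e-7 -- it is discarded by any sane threshold.
```

This also explains why raw rows look meaningless at first glance: nearly all 25200 candidates have near-zero objectness, and only the handful that pass the threshold (and then NMS) are real detections.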