FrancoisLasson / Temporal_DBN

A Temporal Deep Belief Network implementation using Theano

How to calculate PER : Chunk #15

Open FrancoisLasson opened 8 years ago

FrancoisLasson commented 8 years ago

Computing the PER from the error on each individual frame may not be a good solution. In her paper, Jost uses chunks with overlap ("covering"), that is, the value is averaged over several frames. She uses chunks of 60 frames (the Kinect runs at 30 fps, so a delay of 2 seconds) with an overlap of 30 frames.

In our case, it may be a good idea to set the chunk size based on the number of past visible layers and their update frequency.
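To make the chunk idea concrete, here is a minimal sketch in plain Python. It assumes that "average value on several frames" means averaging the per-frame class scores inside each chunk and taking the best class; a majority vote over per-frame labels would be another reading. The function name `chunk_predictions` and the `frame_scores` layout (one list of class scores per frame) are hypothetical and not taken from the repository.

```python
def chunk_predictions(frame_scores, chunk_size, chunk_covering):
    """Return one predicted label per chunk.

    frame_scores   : list of per-frame class scores (assumed layout).
    chunk_size     : number of frames in a chunk.
    chunk_covering : number of frames shared by two consecutive chunks.
    """
    step = chunk_size - chunk_covering
    if step <= 0:
        raise ValueError("chunk_covering must be smaller than chunk_size")
    if not frame_scores:
        return []
    n_classes = len(frame_scores[0])
    labels = []
    # Only full chunks are kept; a trailing partial chunk is dropped (assumption).
    for start in range(0, len(frame_scores) - chunk_size + 1, step):
        chunk = frame_scores[start:start + chunk_size]
        # Average each class score over the frames of the chunk.
        means = [sum(frame[c] for frame in chunk) / float(chunk_size)
                 for c in range(n_classes)]
        labels.append(means.index(max(means)))
    return labels


# Toy data: 60 frames favouring class 0, then 60 frames favouring class 1.
scores = [[0.9, 0.1]] * 60 + [[0.2, 0.8]] * 60
# Jost's setting: 60-frame chunks with 30 frames of covering.
print(chunk_predictions(scores, chunk_size=60, chunk_covering=30))
```

With 120 frames, 60-frame chunks and 30 frames of covering, chunks start at frames 0, 30 and 60, so the sketch returns three labels.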

FrancoisLasson commented 8 years ago

Recognition frame by frame:

Confusion matrix :
[[ 166.    0.    2.    0.    0.    0.    0.    0.    0.]
 [  31.  283.    0.   15.    2.    0.    0.    0.    0.]
 [  45.    0.  259.    0.    0.    0.    4.    0.    0.]
 [   0.   21.    0.  455.    1.    0.   11.    0.    0.]
 [   0.    0.    0.    0.  363.    2.   25.   26.   58.]
 [  50.    0.    3.    0.   72.  370.    6.   30.   21.]
 [   0.    0.    0.    0.    1.    0.  440.    0.    0.]
 [   0.   23.   14.    0.    0.   39.    0.  404.   17.]
 [   0.    0.    0.    0.   56.   62.    0.    3.  359.]]
Number of frames in the test dataset : 3739

->Analysis :

Gesture label 0 (gesture1*.bvh):
[[  166.     2.]
 [  126.  3445.]]
Nb frames : 292
Precision : 56.849315 %
Recall : 98.809524 %
Accuracy : 96.576625 %
F_measure : 72.173913 %

Gesture label 1 (gesture2*.bvh):
[[  283.    48.]
 [   44.  3364.]]
Nb frames : 327
Precision : 86.544343 %
Recall : 85.498489 %
Accuracy : 97.539449 %
F_measure : 86.018237 %

Gesture label 2 (gesture3*.bvh):
[[  259.    49.]
 [   19.  3412.]]
Nb frames : 278
Precision : 93.165468 %
Recall : 84.090909 %
Accuracy : 98.181332 %
F_measure : 88.395904 %

Gesture label 3 (gesture4*.bvh):
[[  455.    33.]
 [   15.  3236.]]
Nb frames : 470
Precision : 96.808511 %
Recall : 93.237705 %
Accuracy : 98.716234 %
F_measure : 94.989562 %

Gesture label 4 (gesture5*.bvh):
[[  363.   111.]
 [  132.  3133.]]
Nb frames : 495
Precision : 73.333333 %
Recall : 76.582278 %
Accuracy : 93.500936 %
F_measure : 74.922601 %

Gesture label 5 (gesture6*.bvh):
[[  370.   182.]
 [  103.  3084.]]
Nb frames : 473
Precision : 78.224101 %
Recall : 67.028986 %
Accuracy : 92.377641 %
F_measure : 72.195122 %

Gesture label 6 (gesture7*.bvh):
[[  440.     1.]
 [   46.  3252.]]
Nb frames : 486
Precision : 90.534979 %
Recall : 99.773243 %
Accuracy : 98.742979 %
F_measure : 94.929881 %

Gesture label 7 (gesture8*.bvh):
[[  404.    93.]
 [   59.  3183.]]
Nb frames : 463
Precision : 87.257019 %
Recall : 81.287726 %
Accuracy : 95.934742 %
F_measure : 84.166667 %

Gesture label 8 (gesture9*.bvh):
[[  359.   121.]
 [   96.  3163.]]
Nb frames : 455
Precision : 78.901099 %
Recall : 74.791667 %
Accuracy : 94.196309 %
F_measure : 76.791444 %

PER = 17.116876 %
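For reference, the frame-by-frame figures can be recomputed from the 9x9 confusion matrix alone. The sketch below is not the repository code; the function names `per_percent` and `one_vs_rest` are hypothetical. It assumes that the PER is the share of off-diagonal entries, and that for each label the per-label 2x2 matrix is laid out as [[diagonal, rest of row], [rest of column, everything else]], with precision taken over the column and recall over the row. Under these assumptions it reproduces the reported PER and the values printed for gesture label 0.

```python
# Frame-by-frame confusion matrix copied from the output above.
CONFUSION = [
    [166,   0,   2,   0,   0,   0,   0,   0,   0],
    [ 31, 283,   0,  15,   2,   0,   0,   0,   0],
    [ 45,   0, 259,   0,   0,   0,   4,   0,   0],
    [  0,  21,   0, 455,   1,   0,  11,   0,   0],
    [  0,   0,   0,   0, 363,   2,  25,  26,  58],
    [ 50,   0,   3,   0,  72, 370,   6,  30,  21],
    [  0,   0,   0,   0,   1,   0, 440,   0,   0],
    [  0,  23,  14,   0,   0,  39,   0, 404,  17],
    [  0,   0,   0,   0,  56,  62,   0,   3, 359],
]


def per_percent(cm):
    """Share of off-diagonal entries, in percent (assumed PER definition)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return 100.0 * (total - correct) / total


def one_vs_rest(cm, label):
    """Return the 2x2 matrix and the metrics (in percent) for one label.

    Assumes the label occurs at least once in its row and in its column,
    otherwise the divisions below fail.
    """
    total = sum(sum(row) for row in cm)
    hit = cm[label][label]
    row_rest = sum(cm[label]) - hit                  # rest of the label's row
    col_rest = sum(row[label] for row in cm) - hit   # rest of the label's column
    other = total - hit - row_rest - col_rest
    precision = hit / float(hit + col_rest)
    recall = hit / float(hit + row_rest)
    accuracy = (hit + other) / float(total)
    f_measure = 2.0 * precision * recall / (precision + recall)
    binary = [[hit, row_rest], [col_rest, other]]
    metrics = {
        "precision": 100.0 * precision,
        "recall": 100.0 * recall,
        "accuracy": 100.0 * accuracy,
        "f_measure": 100.0 * f_measure,
    }
    return binary, metrics


print("PER = %f %%" % per_percent(CONFUSION))
binary, metrics = one_vs_rest(CONFUSION, 0)
print(binary)
for name in ("precision", "recall", "accuracy", "f_measure"):
    print("%s : %f %%" % (name, metrics[name]))
```

Note that with this layout the printed "Nb frames" value for a label equals the sum of that label's column (292 for label 0).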

Recognition using chunks:

chunk_size = 240      # number of frames in a chunk (the mocap frequency is 120 fps, so 240 frames give a delay of 2 seconds)
chunk_covering = 230  # number of frames of covering (a new chunk is generated every 10 frames)
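The arithmetic behind these two settings can be checked with a short sketch. The helper names `chunk_timing` and `count_chunks` are hypothetical, and the chunk count assumes that only full chunks are kept within a single sequence, which the thread does not state.

```python
def chunk_timing(chunk_size, chunk_covering, fps):
    """Delay of one chunk and interval between two consecutive chunks."""
    step = chunk_size - chunk_covering
    if step <= 0:
        raise ValueError("chunk_covering must be smaller than chunk_size")
    return {
        "delay_s": chunk_size / float(fps),       # time needed to fill a chunk
        "step_frames": step,                      # frames between chunk starts
        "update_interval_s": step / float(fps),   # time between two chunks
    }


def count_chunks(n_frames, chunk_size, chunk_covering):
    """Number of full chunks in one sequence (partial chunks dropped)."""
    step = chunk_size - chunk_covering
    if step <= 0:
        raise ValueError("chunk_covering must be smaller than chunk_size")
    if n_frames < chunk_size:
        return 0
    return (n_frames - chunk_size) // step + 1


# Settings used here: 240 frames, 230 of covering, 120 fps mocap data.
print(chunk_timing(240, 230, 120))
# Jost's settings for comparison: 60 frames, 30 of covering, 30 fps Kinect.
print(chunk_timing(60, 30, 30))
```

Both settings give the same 2-second delay; the difference is how often a new chunk is produced (every 10 frames here, every 30 frames in Jost's setup).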

Confusion matrix :
[[  6.   0.   0.   0.   0.   0.   0.   0.   0.]
 [  0.   9.   0.   0.   0.   0.   0.   0.   0.]
 [  0.   0.   4.   0.   0.   0.   0.   0.   0.]
 [  0.   0.   0.  24.   0.   0.   0.   0.   0.]
 [  0.   0.   0.   0.  26.   0.   0.   0.   0.]
 [  0.   0.   0.   0.   0.  24.   0.   0.   0.]
 [  0.   0.   0.   0.   0.   0.  25.   0.   0.]
 [  0.   0.   0.   0.   0.   0.   0.  23.   0.]
 [  0.   0.   0.   0.   0.   0.   0.   0.  22.]]
Number of chunks in the test dataset : 163

->Analysis :

Gesture label 0 (gesture1*.bvh):
[[   6.    0.]
 [   0.  157.]]
Nb frames : 6
Precision : 100.000000 %
Recall : 100.000000 %
Accuracy : 100.000000 %
F_measure : 100.000000 %

Gesture label 1 (gesture2*.bvh):
[[   9.    0.]
 [   0.  154.]]
Nb frames : 9
Precision : 100.000000 %
Recall : 100.000000 %
Accuracy : 100.000000 %
F_measure : 100.000000 %

Gesture label 2 (gesture3*.bvh):
[[   4.    0.]
 [   0.  159.]]
Nb frames : 4
Precision : 100.000000 %
Recall : 100.000000 %
Accuracy : 100.000000 %
F_measure : 100.000000 %

Gesture label 3 (gesture4*.bvh):
[[  24.    0.]
 [   0.  139.]]
Nb frames : 24
Precision : 100.000000 %
Recall : 100.000000 %
Accuracy : 100.000000 %
F_measure : 100.000000 %

Gesture label 4 (gesture5*.bvh):
[[  26.    0.]
 [   0.  137.]]
Nb frames : 26
Precision : 100.000000 %
Recall : 100.000000 %
Accuracy : 100.000000 %
F_measure : 100.000000 %

Gesture label 5 (gesture6*.bvh):
[[  24.    0.]
 [   0.  139.]]
Nb frames : 24
Precision : 100.000000 %
Recall : 100.000000 %
Accuracy : 100.000000 %
F_measure : 100.000000 %

Gesture label 6 (gesture7*.bvh):
[[  25.    0.]
 [   0.  138.]]
Nb frames : 25
Precision : 100.000000 %
Recall : 100.000000 %
Accuracy : 100.000000 %
F_measure : 100.000000 %

Gesture label 7 (gesture8*.bvh):
[[  23.    0.]
 [   0.  140.]]
Nb frames : 23
Precision : 100.000000 %
Recall : 100.000000 %
Accuracy : 100.000000 %
F_measure : 100.000000 %

Gesture label 8 (gesture9*.bvh):
[[  22.    0.]
 [   0.  141.]]
Nb frames : 22
Precision : 100.000000 %
Recall : 100.000000 %
Accuracy : 100.000000 %
F_measure : 100.000000 %

PER = 0.000000 %