FrancoisLasson / Temporal_DBN

A Temporal Deep Belief Network implementation using Theano

Clean database #7

Open FrancoisLasson opened 8 years ago

FrancoisLasson commented 8 years ago

The dataset needs to be cleaned. As the confusion matrix in a comment on #1 shows, the gesture labeled 6 increases the PER. However, every gesture has poor recognition precision.

FrancoisLasson commented 8 years ago

First, we checked all the gestures contained in our database. Second, we removed every useless frame from these gestures (all gestures began with a waiting period during which the skeleton is motionless). These frames were present in all gestures and therefore caused instability in the learning phase.

Done!
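
The thread does not say how the motionless frames were removed (it may well have been done by hand). As a rough illustration only, the sketch below drops the static prefix of a clip automatically. It assumes each frame is already available as a flat list of channel values parsed from the BVH file; `first_moving_frame`, `trim_static_prefix` and the `threshold` value are hypothetical names and settings, not part of the repository.

```python
def first_moving_frame(frames, threshold=1e-3):
    """Return the index of the first frame that differs from the one
    before it by more than `threshold` on at least one channel.

    Returns len(frames) when no such frame exists (fully static clip).
    """
    for i in range(1, len(frames)):
        # Largest absolute change of any channel between consecutive frames.
        delta = max(
            (abs(a - b) for a, b in zip(frames[i], frames[i - 1])),
            default=0.0,
        )
        if delta > threshold:
            return i
    return len(frames)


def trim_static_prefix(frames, threshold=1e-3):
    """Drop the leading frames during which the skeleton does not move."""
    return frames[first_moving_frame(frames, threshold):]


# Toy clip: three identical "waiting" frames, then motion (made-up values).
clip = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.5, 0.0], [1.0, 0.2]]
trimmed = trim_static_prefix(clip)
print(len(clip), "->", len(trimmed))
```

A threshold that is too low would keep sensor jitter, while one that is too high would cut into the start of the gesture, so the value would have to be tuned on the actual recordings.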

FrancoisLasson commented 8 years ago

The error in the LogReg input has been corrected, and the confusion matrices are now calculated without the initialization frames, one gesture at a time.

I) Old set of gestures :

test_files = ['data/geste1i.bvh',
              'data/geste2i.bvh',
              'data/geste3i.bvh',
              'data/geste4i.bvh',
              'data/geste5i.bvh',
              'data/geste6i.bvh',
              'data/geste7i.bvh',
              'data/geste8i.bvh',
              'data/geste10i.bvh']
test_labels = [0,1,2,3,4,5,6,7,8]

Learning phase: validation PER = 24.15% ; test PER = 24.5% per_crbm_1461123172 73

Recognition phase :

Confusion matrix :
[[ 188.    0.    0.    0.    1.    0.   10.    0.    0.]
 [  18.  322.    1.    0.    0.    6.    0.    0.    0.]
 [  47.    4.  235.    0.    4.    0.   10.    0.    0.]
 [  16.    0.   20.  103.    5.   32.    2.    0.    0.]
 [   0.    0.    0.  197.  320.  118.   33.    0.   27.]
 [  13.    0.   14.   19.   73.  202.   25.   12.   15.]
 [  10.    1.    8.   39.   31.   41.   84.    4.    9.]
 [   0.    0.    0.   31.    7.    1.    0.  445.   14.]
 [   0.    0.    0.    0.   54.   17.    4.    2.  390.]]

Number of frames in the test dataset : 3284

->Analysis :

Gesture label 0 (gesture1*.bvh):
[[ 188.   11.]
 [ 104. 2981.]]
Nb frames : 292
Precision : 64.383562 %
Recall : 94.472362 %
Accuracy : 96.498173 %
F_measure : 76.578411 %

Gesture label 1 (gesture2*.bvh):
[[ 322.   25.]
 [   5. 2932.]]
Nb frames : 327
Precision : 98.470948 %
Recall : 92.795389 %
Accuracy : 99.086480 %
F_measure : 95.548961 %

Gesture label 2 (gesture3*.bvh):
[[ 235.   65.]
 [  43. 2941.]]
Nb frames : 278
Precision : 84.532374 %
Recall : 78.333333 %
Accuracy : 96.711328 %
F_measure : 81.314879 %

Gesture label 3 (gesture4*.bvh):
[[ 103.   75.]
 [ 286. 2820.]]
Nb frames : 389
Precision : 26.478149 %
Recall : 57.865169 %
Accuracy : 89.007308 %
F_measure : 36.331570 %

Gesture label 4 (gesture5*.bvh):
[[ 320.  375.]
 [ 175. 2414.]]
Nb frames : 495
Precision : 64.646465 %
Recall : 46.043165 %
Accuracy : 83.252132 %
F_measure : 53.781513 %

Gesture label 5 (gesture6*.bvh):
[[ 202.  171.]
 [ 215. 2696.]]
Nb frames : 417
Precision : 48.441247 %
Recall : 54.155496 %
Accuracy : 88.246041 %
F_measure : 51.139241 %

Gesture label 6 (gesture7*.bvh):
[[  84.  143.]
 [  84. 2973.]]
Nb frames : 168
Precision : 50.000000 %
Recall : 37.004405 %
Accuracy : 93.087698 %
F_measure : 42.531646 %

Gesture label 7 (gesture8*.bvh):
[[ 445.   53.]
 [  18. 2768.]]
Nb frames : 463
Precision : 96.112311 %
Recall : 89.357430 %
Accuracy : 97.838002 %
F_measure : 92.611863 %

Gesture label 8 (gesture9*.bvh):
[[ 390.   77.]
 [  65. 2752.]]
Nb frames : 455
Precision : 85.714286 %
Recall : 83.511777 %
Accuracy : 95.676005 %
F_measure : 84.598698 %
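
The per-gesture figures above can be recomputed from the 9x9 confusion matrix by collapsing it to a 2x2 one-vs-rest matrix for each label. The sketch below is not the repository's code; it is a minimal reconstruction that happens to reproduce the reported numbers. It follows the naming used in the output: "Precision" divides the diagonal entry by the column sum (which equals "Nb frames") and "Recall" divides it by the row sum. Which axis holds the true labels is not stated in the thread, and `one_vs_rest` and `gesture_metrics` are hypothetical names.

```python
# Confusion matrix of the old gesture set, copied from the thread.
CONFUSION = [
    [188, 0, 0, 0, 1, 0, 10, 0, 0],
    [18, 322, 1, 0, 0, 6, 0, 0, 0],
    [47, 4, 235, 0, 4, 0, 10, 0, 0],
    [16, 0, 20, 103, 5, 32, 2, 0, 0],
    [0, 0, 0, 197, 320, 118, 33, 0, 27],
    [13, 0, 14, 19, 73, 202, 25, 12, 15],
    [10, 1, 8, 39, 31, 41, 84, 4, 9],
    [0, 0, 0, 31, 7, 1, 0, 445, 14],
    [0, 0, 0, 0, 54, 17, 4, 2, 390],
]


def one_vs_rest(matrix, label):
    """Collapse the full matrix to [[hit, row_rest], [col_rest, rest]]."""
    total = sum(sum(row) for row in matrix)
    hit = matrix[label][label]
    row_rest = sum(matrix[label]) - hit                 # rest of the label's row
    col_rest = sum(row[label] for row in matrix) - hit  # rest of its column
    rest = total - hit - row_rest - col_rest
    return [[hit, row_rest], [col_rest, rest]]


def gesture_metrics(matrix, label):
    """Per-gesture metrics in percent, named as in the thread's output."""
    (hit, row_rest), (col_rest, rest) = one_vs_rest(matrix, label)
    nb_frames = hit + col_rest  # column sum
    precision = hit / nb_frames if nb_frames else 0.0
    recall = hit / (hit + row_rest) if hit + row_rest else 0.0
    accuracy = (hit + rest) / (hit + row_rest + col_rest + rest)
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "Nb frames": nb_frames,
        "Precision": 100 * precision,
        "Recall": 100 * recall,
        "Accuracy": 100 * accuracy,
        "F_measure": 100 * f_measure,
    }


for label in range(len(CONFUSION)):
    m = gesture_metrics(CONFUSION, label)
    print("Gesture label %d: %s" % (label, one_vs_rest(CONFUSION, label)))
    print("  Nb frames : %d" % m["Nb frames"])
    for name in ("Precision", "Recall", "Accuracy", "F_measure"):
        print("  %s : %f %%" % (name, m[name]))
```

Accuracy stays high even for poorly recognized gestures because the large "rest" cell dominates it, so Precision, Recall and F_measure are the more informative columns here.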

Conclusion :

As these confusion matrices show, our high PER is due to gestures 4, 6 and 7 (labels 3, 5 and 6). Gestures 4 and 6 perform badly because they are similar to other gestures. Gesture 7 is poorly recognized because it has a small number of frames (168 in the test set). Moreover, gesture 7 is a kick, performed sometimes with the left leg and sometimes with the right, which significantly reduces the number of instances available for the learning phase. This gesture is therefore at a disadvantage compared to the others. We will use gesture 17 (left kick) or gesture 18 (right kick) instead.

II) New set of gestures :

test_files = ['data/geste1i.bvh',
              'data/geste2i.bvh',
              'data/geste3i.bvh',
              'data/geste13i.bvh',
              'data/geste5i.bvh',
              'data/geste14i.bvh',
              'data/geste18i.bvh',
              'data/geste8i.bvh',
              'data/geste10i.bvh']
test_labels = [0,1,2,3,4,5,6,7,8]

Learning phase: validation PER = 12.57% ; test PER = 6.05% per_crbm_1461252267 31

Recognition phase :

Confusion matrix :
[[ 166.    0.    2.    0.    0.    0.    0.    0.    0.]
 [  31.  283.    0.   15.    2.    0.    0.    0.    0.]
 [  45.    0.  259.    0.    0.    0.    4.    0.    0.]
 [   0.   21.    0.  455.    1.    0.   11.    0.    0.]
 [   0.    0.    0.    0.  363.    2.   25.   26.   58.]
 [  50.    0.    3.    0.   72.  370.    6.   30.   21.]
 [   0.    0.    0.    0.    1.    0.  440.    0.    0.]
 [   0.   23.   14.    0.    0.   39.    0.  404.   17.]
 [   0.    0.    0.    0.   56.   62.    0.    3.  359.]]

Number of frames in the test dataset : 3739

->Analysis :

Gesture label 0 (gesture1*.bvh):
[[ 1.66000000e+02   2.00000000e+00]
 [ 1.26000000e+02   3.44500000e+03]]
Nb frames : 292
Precision : 56.849315 %
Recall : 98.809524 %
Accuracy : 96.576625 %
F_measure : 72.173913 %

Gesture label 1 (gesture2*.bvh):
[[ 283.   48.]
 [  44. 3364.]]
Nb frames : 327
Precision : 86.544343 %
Recall : 85.498489 %
Accuracy : 97.539449 %
F_measure : 86.018237 %

Gesture label 2 (gesture3*.bvh):
[[ 259.   49.]
 [  19. 3412.]]
Nb frames : 278
Precision : 93.165468 %
Recall : 84.090909 %
Accuracy : 98.181332 %
F_measure : 88.395904 %

Gesture label 3 (gesture4*.bvh):
[[ 455.   33.]
 [  15. 3236.]]
Nb frames : 470
Precision : 96.808511 %
Recall : 93.237705 %
Accuracy : 98.716234 %
F_measure : 94.989562 %

Gesture label 4 (gesture5*.bvh):
[[ 363.  111.]
 [ 132. 3133.]]
Nb frames : 495
Precision : 73.333333 %
Recall : 76.582278 %
Accuracy : 93.500936 %
F_measure : 74.922601 %

Gesture label 5 (gesture6*.bvh):
[[ 370.  182.]
 [ 103. 3084.]]
Nb frames : 473
Precision : 78.224101 %
Recall : 67.028986 %
Accuracy : 92.377641 %
F_measure : 72.195122 %

Gesture label 6 (gesture7*.bvh):
[[ 4.40000000e+02   1.00000000e+00]
 [ 4.60000000e+01   3.25200000e+03]]
Nb frames : 486
Precision : 90.534979 %
Recall : 99.773243 %
Accuracy : 98.742979 %
F_measure : 94.929881 %

Gesture label 7 (gesture8*.bvh):
[[ 404.   93.]
 [  59. 3183.]]
Nb frames : 463
Precision : 87.257019 %
Recall : 81.287726 %
Accuracy : 95.934742 %
F_measure : 84.166667 %

Gesture label 8 (gesture9*.bvh):
[[ 359.  121.]
 [  96. 3163.]]
Nb frames : 455
Precision : 78.901099 %
Recall : 74.791667 %
Accuracy : 94.196309 %
F_measure : 76.791444 %

PER = 17.116876 %
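
The PER value above matches the share of off-diagonal entries in the confusion matrix of the new gesture set (640 misclassified frames out of 3739). The sketch below recomputes it under that assumption, namely that PER here is the fraction of frames assigned to the wrong gesture; `error_rate` is a hypothetical helper name, not a function from the repository.

```python
# Confusion matrix of the new gesture set, copied from the thread.
NEW_CONFUSION = [
    [166, 0, 2, 0, 0, 0, 0, 0, 0],
    [31, 283, 0, 15, 2, 0, 0, 0, 0],
    [45, 0, 259, 0, 0, 0, 4, 0, 0],
    [0, 21, 0, 455, 1, 0, 11, 0, 0],
    [0, 0, 0, 0, 363, 2, 25, 26, 58],
    [50, 0, 3, 0, 72, 370, 6, 30, 21],
    [0, 0, 0, 0, 1, 0, 440, 0, 0],
    [0, 23, 14, 0, 0, 39, 0, 404, 17],
    [0, 0, 0, 0, 56, 62, 0, 3, 359],
]


def error_rate(matrix):
    """Percentage of frames that fall outside the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * (total - correct) / total


per = error_rate(NEW_CONFUSION)
print("PER = %f %%" % per)
```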