umyelab / LabGym

Quantify user-defined behaviors.
GNU General Public License v3.0

Issue: how to improve trained Categorizer performance #104

Closed fengaccoumt closed 5 months ago

fengaccoumt commented 6 months ago

Hi, good afternoon! Thank you very much for your earlier help.

Because the trained Categorizer reached an accuracy of 0.91 last time, I used it on a real video. The result shows a count of 7 for a specific behavior, but the actual count in the video is 1. So I want to know:

  1. Why did this count difference happen, and what can I do to improve the trained Categorizer's performance?
  2. Do we need other criteria to judge the trained Categorizer's performance, besides accuracy, precision, recall, and f1-score?
  3. Do I need to convert the training report into a confusion matrix to judge the trained Categorizer's performance?

This is the Categorizer's training report: image

This is the Categorizer's testing report: image

Very much looking forward to your reply!

yujiahu415 commented 6 months ago

Hi,

  1. This happens when your training examples haven't covered the majority of scenarios in your "real video". The testing report showed good accuracy because the training examples did cover most of the scenarios in the testing examples. In general, a Categorizer may be trained well for a particular set of scenarios but may not generalize to more diverse ones. To address this, increase the diversity of your training examples to cover more scenarios. You should also increase both the diversity and the amount of your testing examples. Currently you only have 20 examples per category for testing, and I have no idea how you selected those 20 testing examples.
  2. and 3. I think the real problem is that you need more statistical power to reach a conclusion about whether the Categorizer was trained well or not. You tested only 20 examples per category, and sampled only 1 behavior that occurred only once in your "real video".

I also suggest looking at other metrics, such as the "duration" of the behavior, which is less sensitive to false categorizations than "count". LabGym categorizes behaviors at every frame. If a behavior "x" lasts for 10 frames, like "xxxxxxxxxx", the count of behavior "x" is 1 and its duration is 10 frames; but if a false categorization "y" happens in the middle of those 10 frames, like "xxyxxxyxxx", the count of behavior "x" becomes 3 while its duration is 8 frames.
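The count-versus-duration effect above is easy to reproduce. A minimal sketch (not LabGym code; `count_and_duration` is a hypothetical helper) that treats a string of per-frame labels as the frame-wise categorization:

```python
def count_and_duration(frames, behavior):
    """Count maximal runs (bouts) of `behavior` and its total duration in frames."""
    count = 0
    duration = 0
    prev = None
    for f in frames:
        if f == behavior:
            duration += 1
            if prev != behavior:
                count += 1  # a new bout starts when the label changes to `behavior`
        prev = f
    return count, duration

# The two sequences from the comment above:
print(count_and_duration("xxxxxxxxxx", "x"))  # (1, 10)
print(count_and_duration("xxyxxxyxxx", "x"))  # (3, 8)
```

Two mis-labeled frames triple the count but only shift the duration from 10 to 8 frames, which is why duration is the more robust readout here.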

fengaccoumt commented 6 months ago

Thanks for your advice. I want to generate a confusion matrix from the training report, but I don't know how to do it.

I calculated it using these formulas and the values from the training report. Is this procedure correct?

The formulas are: image https://nonmeyet.tistory.com/entry/Confusion-matrix%EC%99%80-Precision-Recall-F1score%EC%9D%98-%EC%9D%B4%ED%95%B4
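For readers who cannot see the image, these are the standard definitions in terms of the confusion-matrix cells (true positives TP, true negatives TN, false positives FP, false negatives FN); I am assuming the linked page uses the conventional ones:

```
precision = TP / (TP + FP)
recall    = TP / (TP + FN)
f1-score  = 2 * precision * recall / (precision + recall)
accuracy  = (TP + TN) / (TP + TN + FP + FN)
```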

These are the values from the training report, e.g. for the class "aa": precision = 0.9, recall = 1, f1-score = 0.95, accuracy = 0.91 image

This is the calculation process: image image
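In principle the per-class confusion-matrix cells can be recovered from precision, recall, and the per-class support (number of true testing examples of that class, 20 per category here), by inverting the standard definitions. A minimal sketch (my own derivation, not a LabGym feature; rounding is needed because the report rounds its metrics):

```python
def confusion_cells(precision, recall, support):
    """Recover per-class TP, FN, FP from precision, recall, and support.

    recall    = TP / (TP + FN), where TP + FN = support
    precision = TP / (TP + FP)
    """
    tp = recall * support
    fn = support - tp
    fp = tp / precision - tp
    # Round because the reported precision/recall are themselves rounded.
    return round(tp), round(fn), round(fp)

# Using the reported values for class "aa" (precision=0.9, recall=1.0, 20 examples):
print(confusion_cells(0.9, 1.0, 20))  # (20, 0, 2)
```

So with 20 true "aa" examples, all 20 were found (recall 1.0) but about 2 examples of other classes were mislabeled as "aa" (precision ≈ 0.9). This only gives per-class TP/FN/FP, not which other class the errors went to; a full confusion matrix needs the raw predictions.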

Very much looking forward to your reply!

yujiahu415 commented 6 months ago

Hi, I'm not sure I fully understand your issue / question. Were you trying to use LabGym to generate a confusion matrix? The current version of LabGym doesn't output a confusion matrix; it outputs a summary of precision, recall, and f1-score.