marcverhagen opened this issue 11 months ago
Here is an example of some of the predictions from the frame classifier.
Run 1:
<Prediction 20000 0.0000 0.0023 0.0034 0.9942>
<Prediction 22000 0.0000 0.6198 0.2866 0.0936>
<Prediction 24000 0.0000 0.8333 0.1138 0.0529>
<Prediction 26000 0.0000 0.9845 0.0043 0.0112>
<Prediction 28000 0.0000 0.0000 0.0000 1.0000>
Run 2:
<Prediction 20000 0.0000 0.0000 0.0000 1.0000>
<Prediction 22000 0.0000 0.6566 0.2176 0.1259>
<Prediction 24000 0.0000 0.9012 0.0368 0.0620>
<Prediction 26000 0.0000 0.6986 0.2043 0.0972>
<Prediction 28000 0.0000 0.0000 0.0000 1.0000>
In this case the differences are not large, and in both runs we end up with a TimeFrame from 22000 through 26000, albeit with different scores.
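For a rough sense of how far apart the two runs actually are, the lines above can be parsed and diffed directly. This is just a sketch against the format shown above (a millisecond offset followed by four label scores); it is not part of the repo.

# Sketch: compare the frame-classifier scores of two runs, using the
# "<Prediction <ms> <score> <score> <score> <score>>" lines shown above.
import re

PRED = re.compile(r"<Prediction (\d+) ([\d. ]+)>")

def parse(text):
    """Map millisecond offset -> list of label scores."""
    preds = {}
    for match in PRED.finditer(text):
        preds[int(match.group(1))] = [float(s) for s in match.group(2).split()]
    return preds

run1 = parse("""
<Prediction 22000 0.0000 0.6198 0.2866 0.0936>
<Prediction 24000 0.0000 0.8333 0.1138 0.0529>
""")
run2 = parse("""
<Prediction 22000 0.0000 0.6566 0.2176 0.1259>
<Prediction 24000 0.0000 0.9012 0.0368 0.0620>
""")

for ms in sorted(run1):
    diffs = [abs(a - b) for a, b in zip(run1[ms], run2[ms])]
    print(ms, round(max(diffs), 4))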
I think this is expected, since neither the trainer nor the classifier runs in torch's deterministic mode. We could experiment with adding the mode setter in the classifier code and see whether that helps...?
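For reference, a minimal sketch of what switching the classifier over to deterministic mode could look like; the function name, the seed value, and where it would be called from in classify.py are all assumptions, and whether it actually removes the variation is exactly what would need testing.

# Sketch: opting in to PyTorch's deterministic behaviour before running
# the classifier. The calls are standard PyTorch; the seed value and the
# placement inside classify.py are assumptions.
import os
import random

import numpy as np
import torch

def make_deterministic(seed: int = 42) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)                    # also seeds the CUDA RNGs
    torch.use_deterministic_algorithms(True)   # error out on non-deterministic ops
    torch.backends.cudnn.benchmark = False     # no autotuned (non-deterministic) kernels
    # Some cuBLAS ops require this when deterministic algorithms are enabled:
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

If the outputs still differ with something like this in place, the randomness is probably coming from outside torch (e.g. from frame extraction), which would point back at the two lines flagged below.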
When I run v3.0 20 times on the same video and grep TimeFrame annotations from the output MMIFs:
# process_swt30.py "summarizes" a MMIF file into one line per TimeFrame annotation: start, end, and frame type
$ for mmif in c53*.mmif; do py process_swt30.py $mmif | wc -l ; done
19
17
15
17
16
16
14
18
19
17
22
17
18
15
18
16
20
16
15
20
17
# get the sorted unique counts
$ for mmif in c53*.mmif; do py process_swt30.py $mmif | wc -l ; done | sort -u
14
15
16
17
18
19
20
22
The numbers of detected "relevant" time frames are quite different, ranging from 14 to 22.
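(process_swt30.py itself is not included in this issue; purely for illustration, a hypothetical equivalent that reads the raw MMIF JSON and prints one line per TimeFrame could look like the sketch below. The property names start, end, and frameType are assumptions based on the description above, not verified against actual SWT v3.0 output.)

# Hypothetical stand-in for process_swt30.py: print start, end, and frame
# type of every TimeFrame annotation in a MMIF file, one per line.
# Property names ("start", "end", "frameType") are assumed.
import json
import sys

with open(sys.argv[1]) as f:
    mmif = json.load(f)

for view in mmif.get("views", []):
    for ann in view.get("annotations", []):
        if ann.get("@type", "").endswith("TimeFrame"):
            props = ann.get("properties", {})
            print(props.get("start"), props.get("end"), props.get("frameType"))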
Because at the tip of the refactor-feature-extraction branch you get different image classification results if you run the classifier from the app repeatedly, the problem is probably in one of these two lines:
https://github.com/clamsproject/app-swt-detection/blob/6ab33ba3df5d86cc694fa3fd097783501dce02f6/classify.py#L77-L78
This needs some more poking.
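One way to poke at it: pin the model into eval mode, run it twice over the same batch of frames, and compare the raw outputs, which separates torch-level randomness from randomness in the frame extraction. classifier and frames below are placeholders, not the actual names in classify.py.

# Sketch of a determinism probe: classify the same batch twice and check
# whether the raw outputs match. `classifier` and `frames` are placeholders
# for whatever classify.py builds around lines 77-78.
import torch

def probe(classifier: torch.nn.Module, frames: torch.Tensor) -> None:
    classifier.eval()                 # dropout/batchnorm in inference mode
    with torch.no_grad():
        out1 = classifier(frames)
        out2 = classifier(frames)
    print("max |out1 - out2|:", (out1 - out2).abs().max().item())

If the two passes already disagree, the randomness is inside the model call (e.g. a dropout layer left in training mode); if they agree, it is more likely upstream, in how the frames are sampled or preprocessed.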