I looked at caculateEvaluationCCC.py and found something confusing. Previously, the CCC was calculated for each video, and the mean of the per-video CCCs was used to evaluate model performance, as shown in the following code.
Now the CCC seems to be calculated over all the utterances in the validation set at once, without regard to which video each utterance belongs to.
This seems a little strange. Shouldn't the CCC be calculated for each video and then averaged over the validation set? And which method did you use to evaluate your baseline model?
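To make the difference concrete, here is a minimal sketch of the two ways of computing the score as I understand them. It is not the code from the repository: the function names (`ccc`, `ccc_per_video`, `ccc_pooled`) and the toy data are my own assumptions, and I assume every video has at least two utterances with non-constant values.

```python
import numpy as np


def ccc(y_true, y_pred):
    """Concordance correlation coefficient between two 1-D sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_true, mean_pred = y_true.mean(), y_pred.mean()
    # Population (biased) covariance and variances, used consistently.
    cov = np.mean((y_true - mean_true) * (y_pred - mean_pred))
    denom = y_true.var() + y_pred.var() + (mean_true - mean_pred) ** 2
    return 2.0 * cov / denom


def ccc_per_video(video_ids, y_true, y_pred):
    """Hypothetical helper: CCC within each video, then the mean over videos."""
    video_ids = np.asarray(video_ids)
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    scores = [
        ccc(y_true[video_ids == v], y_pred[video_ids == v])
        for v in np.unique(video_ids)
    ]
    return float(np.mean(scores))


def ccc_pooled(y_true, y_pred):
    """Hypothetical helper: one CCC over all utterances, ignoring videos."""
    return float(ccc(y_true, y_pred))


# Toy data (made up): two videos with three utterances each.
video_ids = ["a", "a", "a", "b", "b", "b"]
y_true = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
y_pred = [0.1, 0.2, 0.3, 0.9, 0.8, 0.7]  # video "b" is predicted in reverse order

per_video_score = ccc_per_video(video_ids, y_true, y_pred)
pooled_score = ccc_pooled(y_true, y_pred)
print("per-video mean CCC:", per_video_score)
print("pooled CCC:", pooled_score)
```

On this toy data the two scores differ a lot: the per-video average is 0 (one video scores 1, the other -1), while the pooled value is about 0.93, because pooling rewards getting the differences between videos right even when the ordering within a video is wrong.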
I would be grateful for your reply.