LanglandsLin / LanglandsLin.github.io

Lilang Lin's personal homepage
MIT License

About Accuracy and MS^2L Rand-Unsupervised #1

Open zilangch opened 3 years ago

zilangch commented 3 years ago

I read your paper and am confused about how you calculate the accuracy of the unsupervised model. Also, what exactly is the baseline model "MS^2L Rand-Unsupervised"? Looking forward to your reply.

LanglandsLin commented 3 years ago

Hi @zilangch !

In the unsupervised setting, we train the encoder using only the self-supervised learning tasks, without action labels. We then evaluate the feature representations extracted by the encoder with a classifier trained in a supervised manner: it is trained with action labels on the features extracted by the encoder.

The baseline "MS^2L Rand-Unsupervised" uses a random encoder. We extract features with the encoder without training it at all.

In the unsupervised approach, we train the encoder in the unsupervised setting to extract features, and we evaluate the quality of those features by action recognition. Other downstream tasks, such as action prediction, could also be used for evaluation.
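To make the evaluation protocol concrete, here is a minimal sketch of it on toy data. It is not the code from the paper: the random-projection encoder, the nearest-centroid classifier, the toy dataset, and all function names are assumptions chosen only to keep the example self-contained. The point it illustrates is that the encoder is frozen and only the classifier sees the true labels.

```python
import math
import random


def make_random_encoder(in_dim, feat_dim, seed=0):
    """Return a frozen encoder with random, untrained weights.

    Hypothetical stand-in for the "Rand" baseline: the weights are
    drawn once and never updated.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)]
               for _ in range(feat_dim)]

    def encode(x):
        return [math.tanh(sum(w * v for w, v in zip(row, x)))
                for row in weights]

    return encode


def fit_centroids(features, labels):
    """Train the classifier with TRUE action labels on frozen features.

    A nearest-centroid classifier is used here for brevity; it is an
    assumption, not the classifier used in the paper.
    """
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, f):
    """Return the label whose centroid is closest to feature f."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], f)))


def evaluate(encode, train, test):
    """Accuracy of a classifier fitted on features from a frozen encoder."""
    centroids = fit_centroids([encode(x) for x, _ in train],
                              [y for _, y in train])
    correct = sum(predict(centroids, encode(x)) == y for x, y in test)
    return correct / len(test)


def toy_dataset(n, seed):
    """Two synthetic 'action' classes, centred at -1 and +1 (toy data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randrange(2)
        centre = 1.0 if y == 1 else -1.0
        data.append(([centre + rng.gauss(0.0, 0.3) for _ in range(8)], y))
    return data


train, test = toy_dataset(200, seed=1), toy_dataset(100, seed=2)
accuracy = evaluate(make_random_encoder(8, 16), train, test)
print(f"Rand-encoder accuracy on toy data: {accuracy:.2f}")
```

Swapping `make_random_encoder` for an encoder pretrained with self-supervised tasks, while keeping `evaluate` unchanged, would give the corresponding unsupervised result under the same protocol.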

Hope this is helpful.

zilangch commented 3 years ago

In your first paragraph, you say "This classifier is trained with action labels on the features extracted by the encoder". Are those pseudo labels or true labels? If they are pseudo labels, are they generated by the encoder? I don't understand how pseudo labels can verify the encoder's performance.

Could you give an example with a jigsaw puzzle?

zilangch commented 3 years ago

In the unsupervised setting, could the encoder and classifier be trained jointly?

LanglandsLin commented 3 years ago

They are true labels. We employ the classifier to perform action recognition. Fig. 3 in the paper shows an example of jigsaw puzzles. We shuffle the action sequences along the temporal dimension and train the network to predict how they were shuffled.
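As a rough illustration of the temporal jigsaw idea, the sketch below cuts a sequence into equal segments, reorders them with a randomly chosen permutation, and uses the index of that permutation as the pseudo label the network would have to predict. The segment count of 3 and the function name `make_jigsaw_sample` are assumptions for this example, not details taken from the paper.

```python
import itertools
import random


def make_jigsaw_sample(sequence, num_segments=3, rng=random):
    """Shuffle a sequence in time and return (shuffled, label, permutations).

    The label is the index of the permutation that was applied; it is
    generated from the data itself, so no action label is needed.
    """
    if len(sequence) % num_segments != 0:
        raise ValueError("sequence length must be divisible by num_segments")
    seg_len = len(sequence) // num_segments
    segments = [sequence[i * seg_len:(i + 1) * seg_len]
                for i in range(num_segments)]

    # Every possible ordering of the segments is one class.
    permutations = list(itertools.permutations(range(num_segments)))
    label = rng.randrange(len(permutations))
    order = permutations[label]

    shuffled = [frame for idx in order for frame in segments[idx]]
    return shuffled, label, permutations


frames = list(range(6))  # stand-in for 6 skeleton frames
shuffled, label, permutations = make_jigsaw_sample(
    frames, num_segments=3, rng=random.Random(0))
print(shuffled, "was produced by permutation", permutations[label])
```

These permutation indices are pseudo labels used only to train the encoder; the true action labels come in later, when the classifier is trained on the encoder's features.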

The unsupervised setting requires us to train the encoder without action labels, whereas the classifier is trained with action labels. Thus, we cannot train the encoder and classifier jointly in the unsupervised setting.