EGO4D / social-interactions

MIT License

Looking at Me Test Dataset Scores #9

Closed fpusch closed 2 years ago

fpusch commented 2 years ago

Training a model on the LAM train split prints evaluation results on the validation split like:

mAP: 0.789340
TOP-1 Acc:92.725%
        LAM     NLAM
LAM     58.120  41.880
NLAM    3.815   96.185
mAP: 0.7893 best mAP: 0.7893

Running the actual tests against the test dataset (social_test) only returns a pred.csv but no further scores. Is this intentional? Specifically, I am wondering because in my case, using the reference gaze360 model results in a pred.csv where every line has a score < 0.5, which I'd interpret as always predicting NLAM. This would not be consistent with the results shown in the paper. Am I missing something here?

I have seen #3 and the explanation that producing a gt.csv would compromise the other challenges. However, without it, judging how well the results transfer to the test dataset seems very difficult.

On a side note: the submission format linked from the README currently leads to a 404 page.

Thanks for any help :)

zcxu-eric commented 2 years ago

Hi fpusch,

For the test set, we cannot release the gt labels; they are confidential in this challenge. If you want scores for your test-set predictions, please submit them to the Looking-at-me challenge. gaze360 is only used to initialize the backbone of our model; if you use it directly for testing, it will only output a constant score.

The submission guideline link has been updated.

Huhaowen0130 commented 2 years ago

Hi, it seems that the submission guideline link is 404 again, could you help with that? Thank you!

zcxu-eric commented 2 years ago

Hello, the EvalAI platform may change the pointers occasionally. We have updated the link. Next time if you happen to find the link not accessible, you could search our challenge directly on EvalAI. Thanks.

Huhaowen0130 commented 2 years ago

Thank you for your immediate reply! It says the results should be submitted as a json file; is there any standard code to convert pred.csv to json?

zephyrzhu1998 commented 2 years ago

Hi Huhaowen, sorry, we don't provide this script.
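For anyone else converting their output, a minimal sketch of such a conversion follows. Both layouts here are assumptions, not the official spec: it guesses that pred.csv rows look like `uid,frame_id,score` and that the submission json is a list of objects with a `video_id`, a `label` fixed to 1, and a `score` (per the maintainers' note below that "label" is always "1"). Check the actual formats against the challenge page before submitting.

```python
import csv
import json

def pred_csv_to_json(csv_path, json_path):
    """Convert a pred.csv into an EvalAI-style submission json.

    Assumed (hypothetical) csv row layout: uid,frame_id,score
    Assumed (hypothetical) json layout: a list of
    {"video_id", "frame_id", "label", "score"} objects.
    """
    results = []
    with open(csv_path, newline="") as f:
        for uid, frame_id, score in csv.reader(f):
            results.append({
                "video_id": uid,
                "frame_id": int(frame_id),
                "label": 1,             # always 1, per the maintainers
                "score": float(score),  # probability of "looking at me"
            })
    with open(json_path, "w") as f:
        json.dump(results, f)
```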

Huhaowen0130 commented 2 years ago

OK, I understand. Just to make sure, does the "uid" in the code represent the video id in the json?

zcxu-eric commented 2 years ago

correct

Huhaowen0130 commented 2 years ago

Hello Eric, sorry for bothering you again!

I submitted a json file and an error occurred when calculating the results:

There seems to be something wrong with my json. Could you help check it? Thank you in advance!

zcxu-eric commented 2 years ago

"label" should always be "1", and "score" denotes the probability for being predicted as "1"

Huhaowen0130 commented 2 years ago

"label" should always be "1", and "score" denotes the probability for being predicted as "1"

Hi! Sorry for disturbing you again.

I got a reasonable result (similar to the Ego4D paper) by doing as you said. But when I checked the dataset, I found something weird: at line 198 of data_loader.py, I saw this message being printed many times. May I ask if that's normal? Does it mean the faces in some frames are not tracked? Thank you!

zcxu-eric commented 2 years ago

"label" should always be "1", and "score" denotes the probability for being predicted as "1"

Hi! Sorry for disturbing you again.

I got a reasonable result (similar to the Ego4D paper) by doing as you said. But when I checked the dataset, I found something weired: in line 198 of data_loader.py, I saw this message being printed many times. May I ask if it's normal? Does it mean the faces in some frames are not tracked? Thank you! image

Yes, some faces are missed due to sharp motion, detector failures, challenging lighting conditions, etc. You can disable the message if it is annoying.

Huhaowen0130 commented 2 years ago

"label" should always be "1", and "score" denotes the probability for being predicted as "1"

Hi! Sorry for disturbing you again. I got a reasonable result (similar to the Ego4D paper) by doing as you said. But when I checked the dataset, I found something weired: in line 198 of data_loader.py, I saw this message being printed many times. May I ask if it's normal? Does it mean the faces in some frames are not tracked? Thank you! image

Yes, some faces are missed due to the sharp motion, detector failure, challenging lighting conditions, etc. You can disable it if it is annoying.

Got it, thank you for the quick response!