The VideoCLIP checkpoint (https://dl.fbaipublicfiles.com/MMPT/retri/videoclip/checkpoint_best.pt) was tested on the action segmentation task (COIN, test split). The printed result is frame accuracy: 0.5891579068298007, the same as reported in the paper. But when I debugged the output of COINZSPredictor, I found that all the predicted labels were '0', and label 0 is the negative/background class, so the model effectively predicts nothing. Does anyone have the same problem?
The following are the settings I used for testing:

- S3D config file: example/MMPT/project/retri/videoclip/test_coin_zs.yaml
- VideoCLIP checkpoint: https://dl.fbaipublicfiles.com/MMPT/retri/videoclip/checkpoint_best.pt
- S3D checkpoint: https://github.com/antoine77340/S3D_HowTo100M
- S3D feature extractor: example/MMPT/scripts/video_feature_extractor/extract.py
- Predictor: examples/MMPT/mmpt_cli/predict.py
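One way to check whether the reported frame accuracy is simply the share of background frames in the test split is sketched below. This is only a rough diagnostic, not part of MMPT: the function name `background_baseline` and the toy arrays are hypothetical, and it assumes the per-frame predictions and ground-truth labels can be dumped from the predictor as flat integer arrays.

```python
import numpy as np


def background_baseline(preds, targets, background_label=0):
    """Compare frame accuracy with the accuracy of always predicting background.

    `preds` and `targets` are assumed to be per-frame integer labels
    (hypothetical dump of the predictor output and the ground truth).
    """
    preds = np.asarray(preds).ravel()
    targets = np.asarray(targets).ravel()

    # Histogram of predicted labels: a single key means the output has collapsed.
    labels, counts = np.unique(preds, return_counts=True)
    predicted_label_counts = {int(l): int(c) for l, c in zip(labels, counts)}

    return {
        "frame_accuracy": float((preds == targets).mean()),
        # Accuracy that a constant "background" prediction would obtain.
        "background_fraction": float((targets == background_label).mean()),
        "predicted_label_counts": predicted_label_counts,
    }


if __name__ == "__main__":
    # Toy data for illustration only, not real COIN labels.
    toy_targets = [0, 0, 0, 5, 5, 0, 7, 0, 0, 3]
    toy_preds = [0] * len(toy_targets)
    print(background_baseline(toy_preds, toy_targets))
```

If `frame_accuracy` and `background_fraction` come out identical on the real test split and `predicted_label_counts` has only the key 0, the score is being earned entirely by the background frames.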