aleflabo closed this issue 1 year ago
I have attached the runs I made using the commands you proposed in the README. As can be seen, the fine-tuning one achieves the paper's results immediately, whereas the others remain far from them.
Fine-tuning:
python tools/run_net.py \
    --cfg configs/EPIC-Sounds/slowfast/SLOWFASTAUDIO_8x8_R50.yaml \
    NUM_GPUS 2 \
    OUTPUT_DIR /home/aleflabo/epic-kitchens/epic-sounds-annotations/src/output \
    EPICSOUNDS.AUDIO_DATA_FILE /home/aleflabo/epic-kitchens/epic-sounds-data/EPIC_audio.hdf5 \
    EPICSOUNDS.ANNOTATIONS_DIR /home/aleflabo/epic-kitchens/epic-sounds-annotations \
    TRAIN.CHECKPOINT_FILE_PATH /home/aleflabo/epic-kitchens/epic-sounds-annotations/src/pretrained/SLOWFAST_EPIC_SOUNDS.pyth
From scratch:
python tools/run_net.py \
    --cfg configs/EPIC-Sounds/slowfast/SLOWFASTAUDIO_8x8_R50.yaml \
    NUM_GPUS 2 \
    OUTPUT_DIR /home/aleflabo/epic-kitchens/epic-sounds-annotations/src/output \
    EPICSOUNDS.AUDIO_DATA_FILE /home/aleflabo/epic-kitchens/epic-sounds-data/EPIC_audio.hdf5 \
    EPICSOUNDS.ANNOTATIONS_DIR /home/aleflabo/epic-kitchens/epic-sounds-annotations
Linear probe:
python tools/run_net.py \
    --cfg configs/EPIC-Sounds/slowfast/SLOWFASTAUDIO_8x8_R50.yaml \
    NUM_GPUS 2 \
    OUTPUT_DIR /home/aleflabo/epic-kitchens/epic-sounds-annotations/src/output \
    EPICSOUNDS.AUDIO_DATA_FILE /home/aleflabo/epic-kitchens/epic-sounds-data/EPIC_audio.hdf5 \
    EPICSOUNDS.ANNOTATIONS_DIR /home/aleflabo/epic-kitchens/epic-sounds-annotations \
    MODEL.FREEZE_BACKBONE True
Hi,
Correct, the pretrained weights are already fine-tuned on EPIC-SOUNDS. To train from the initial pretrained models, you can download the pretrained SlowFast models (including VGG) from here (the file name is SLOWFAST_VGG.pyth on Dropbox). For SSAST, you can download from here (the file name is SSAST-Base-Patch-400.pth on Dropbox). I will upload our versions to Dropbox for convenience, but in the meantime, this is where you can access them.
The "from scratch" and "linear probe" runs won't be correctly reproduced without the pretrained models. Once you have the files, appending TRAIN.CHECKPOINT_FILE_PATH <path-to-SLOWFAST_VGG.pyth> to your commands should fix it.
NOTE: The checkpoint loading in our code looks for a 'model_state' key in the checkpoint in order to load the weights properly. This key is not present in the SSAST checkpoint from their GitHub (it is there in our version), so you will need to alter the checkpoint slightly first with: ssast_ckpt = {'model_state': <loaded_ssast_checkpoint_file>}; torch.save(ssast_ckpt, <file-path>).
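For anyone wanting to script the wrapping step from the note above, here is a minimal sketch. It assumes the downloaded SSAST file is a plain state dict that torch.load can read directly. The function names and both paths are hypothetical and not part of the repository.

```python
from typing import Any, Dict


def wrap_model_state(checkpoint: Dict[str, Any]) -> Dict[str, Any]:
    """Return a checkpoint dict that exposes its weights under 'model_state'.

    A checkpoint that already has the key is returned unchanged, so the
    helper is safe to run twice on the same file.
    """
    if "model_state" in checkpoint:
        return checkpoint
    return {"model_state": checkpoint}


def convert_checkpoint(src_path: str, dst_path: str) -> None:
    """Load a checkpoint, wrap it, and save it to a new file."""
    # Imported here so wrap_model_state can be used without PyTorch installed.
    import torch

    # map_location="cpu" avoids needing a GPU just to rewrite the file.
    state = torch.load(src_path, map_location="cpu")
    torch.save(wrap_model_state(state), dst_path)


# Hypothetical usage; both paths are placeholders:
# convert_checkpoint("SSAST-Base-Patch-400.pth", "SSAST-Base-Patch-400-wrapped.pth")
```

The wrapped file is then the one to pass via TRAIN.CHECKPOINT_FILE_PATH. Writing to a new path rather than overwriting keeps the original download intact in case the assumption about its layout turns out to be wrong.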
UPDATE: The README has now been updated with Dropbox links to the pretrained files that we used ourselves for SlowFast and SSAST.
Thanks for your quick and accurate help! Just wanted to let you know that the links in the README seem to be broken at the moment.
Thank you for making me aware; the links should now be fixed!
Hi authors,
I'm trying to reproduce the results reported in Table 3 of the paper. The checkpoints linked in the repo are already fine-tuned on the EPIC-SOUNDS dataset.
The commands present in the README for fine-tuning, training from scratch and training the linear probe need the checkpoint pre-trained on [VGG-Sound] for ASF and [AudioSet, LibriSpeech] for SSAST. Am I missing something?
Thank you, Alessandro Flaborea