Closed JackieWang9811 closed 8 months ago
hi there,
Thanks so much, just updated.
Please see https://github.com/YuanGongND/cav-mae/blob/master/src/preprocess/sample_video_extract_list.csv
It is simple, but with the attached video files you can give it a quick try (i.e., you do not need to find external video files).
-Yuan
please let me know if it does not work.
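
For readers preparing their own list, here is a minimal sketch of how such a csv could be generated. It assumes the list is simply one video path per row with no header, which should be checked against the sample file linked above. The function name `write_video_list` and the set of file extensions are hypothetical and are not part of the CAV-MAE repository.

```python
import csv
from pathlib import Path


def write_video_list(video_dir, out_csv, extensions=(".mp4", ".mkv", ".webm")):
    """Write one video path per row to out_csv and return the sorted paths.

    Assumption: the extraction scripts expect a headerless, single-column csv.
    Verify this against sample_video_extract_list.csv before relying on it.
    """
    # Recursively collect files whose suffix matches one of the extensions.
    paths = sorted(
        str(p)
        for p in Path(video_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        for p in paths:
            writer.writerow([p])
    return paths
```

Calling `write_video_list("my_videos", "my_video_list.csv")` (both names are placeholders) would produce a list file that can then be compared against the sample csv.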
Hi, Dr. Gong! Thanks a lot for the quick reply, it works. I have another question, about Table 16 in the paper.
What confuses me is that, after shuffling the matching pairs, the results of CAV-MAE and AV-MAE are the same. The mismatching pairs will inevitably collapse the objective of contrastive learning to a certain extent. From my subjective point of view, the results of CAV-MAE should then be worse, so why are they not?
Is there a good explanation for this result? Thanks a lot!
Thanks, it is a good catch.
Section J and Table 16 are in the appendix, but they took us a lot of time. The results convey a lot of information, including the fact that AV-MAE itself does not learn audio-visual correspondence (with our architecture; it might work with other architectures).
> The mismatching pairs will inevitably collapse the objective of contrastive learning to a certain extent.
This is true. But the results are for joint audio-visual classification with full finetuning. Note that full finetuning could override a lot.
Everything we are confident enough to say is in Section J. Table 16 honestly shows what we got from the experiments.
It would be nice to open a separate issue for each question, as that would make them easier for people to search.
-Yuan
Hi, Dr. Gong!
Thanks for releasing the code of CAV-MAE. It is clearly great work! My only question is about what you mentioned in the README:
_Both scripts are simple, you will need to prepare a csv file containing a list of video paths (see for an example
src/preprocess/sample_video_extract_list.csv
)_ but I haven't found it. For those who are new to this field, it may be difficult to follow. Can you upload it? Thanks a lot!