YuanGongND / cav-mae

Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
BSD 2-Clause "Simplified" License

Question Regarding stat calculation of dataset #24

Closed ben2002chou closed 8 months ago

ben2002chou commented 10 months ago

https://github.com/YuanGongND/cav-mae/blob/68fe8c2a3917dc2926e41f796bfdcb331a64b42c/src/dataloader.py#L87-L96

Hello, I would like some help with how to get the normalization stats (mean and std) for AudioSet. The code linked above seems to be related to computing normalization stats, but it isn't clear to me how to obtain them by following the comments.

YuanGongND commented 9 months ago

For AudioSet, please just use our stats for both audio and image.

For other datasets, I believe it is also OK to use our stats unless the data is really out of domain (e.g., biomedical images).

-Yuan

YuanGongND commented 9 months ago

If you are interested in how it is calculated, please check https://github.com/YuanGongND/ast/blob/master/src/get_norm_stats.py
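
For readers who just want the general idea, here is a minimal sketch of computing a dataset-wide mean and std over spectrogram batches. It is not the linked script. The function name `running_mean_std` and the synthetic batches are hypothetical and only stand in for un-normalized filterbank features coming out of a data loader. It assumes one global scalar mean and one global scalar std are wanted.

```python
import numpy as np


def running_mean_std(batches):
    """Return the global (mean, std) over every value in an iterable of arrays.

    Hypothetical helper: accumulates the sum and sum of squares so the whole
    dataset never has to be held in memory at once.
    """
    count = 0
    total = 0.0
    total_sq = 0.0
    for batch in batches:
        x = np.asarray(batch, dtype=np.float64)
        count += x.size
        total += x.sum()
        total_sq += np.square(x).sum()
    if count == 0:
        raise ValueError("no data to compute stats from")
    mean = total / count
    # Population variance: E[x^2] - (E[x])^2, clamped against rounding error.
    var = max(total_sq / count - mean ** 2, 0.0)
    return float(mean), float(np.sqrt(var))


# Synthetic stand-in for un-normalized spectrogram batches shaped
# (batch, time frames, mel bins); real features would come from a data loader.
rng = np.random.default_rng(0)
fake_batches = [rng.normal(-4.0, 4.0, size=(8, 1024, 128)) for _ in range(4)]

mean, std = running_mean_std(fake_batches)
print(f"mean={mean:.3f} std={std:.3f}")
```

The stats should be collected on features that have not been normalized yet. Otherwise the result only reflects whatever normalization was already applied.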

ben2002chou commented 8 months ago

Thank you!