TTM: Audio Extraction Technique Mismatch between Ego4d paper and Baseline Implementation

EGO4D / social-interactions

Issue #23

Open hars-singh opened 1 year ago

hars-singh commented 1 year ago

Hi,

In the current baseline implementation, raw audio is passed directly to the resse network. The Ego4D paper, however, describes extracting MFCC features first and then passing those features to resse.
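To make the two options concrete, here is a minimal, standard-library-only sketch of the difference between feeding a network the raw waveform and feeding it MFCC features. It is not taken from the repository or the paper: the function names (`mfcc`, `mel_filterbank`, `power_spectrum`) and every parameter value (sample rate, frame size, hop, number of mel bands and coefficients) are assumptions chosen only for illustration, and the resse network itself is not modelled.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale (hypothetical helper)."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sr / 2.0)
    mel_pts = [i * top / (n_mels + 1) for i in range(n_mels + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sr)) for m in mel_pts]
    fb = [[0.0] * n_bins for _ in range(n_mels)]
    for m in range(1, n_mels + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, mid):          # rising edge of the triangle
            fb[m - 1][k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling edge of the triangle
            fb[m - 1][k] = (hi - k) / (hi - mid)
    return fb

def power_spectrum(frame, n_fft):
    """Naive DFT power spectrum; slow, but dependency-free."""
    out = []
    for k in range(n_fft // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * n / n_fft) for n, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * n / n_fft) for n, x in enumerate(frame))
        out.append((re * re + im * im) / n_fft)
    return out

def mfcc(signal, sr, n_fft=64, hop=32, n_mels=10, n_mfcc=5):
    """Frame -> Hamming window -> power spectrum -> log mel energies -> DCT-II."""
    fb = mel_filterbank(n_mels, n_fft, sr)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_fft - 1)) for n in range(n_fft)]
    feats = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(n_fft)]
        power = power_spectrum(frame, n_fft)
        # Small constant avoids log(0) for empty or silent bands.
        log_mel = [math.log(sum(w * p for w, p in zip(filt, power)) + 1e-10) for filt in fb]
        coeffs = [
            sum(e * math.cos(math.pi * c * (j + 0.5) / n_mels) for j, e in enumerate(log_mel))
            for c in range(n_mfcc)
        ]
        feats.append(coeffs)
    return feats

# Option A (raw audio): the network input is the 1-D waveform itself.
sr = 8000  # assumed sample rate, for illustration only
raw = [math.sin(2 * math.pi * 440.0 * t / sr) for t in range(400)]

# Option B (MFCC): the network input is a 2-D (frames x coefficients) feature map.
feats = mfcc(raw, sr)
print(len(raw), "raw samples vs", len(feats), "frames x", len(feats[0]), "MFCCs")
```

The point of the sketch is only the change in input shape and content: option A hands the model a long 1-D sequence of samples, while option B hands it a much shorter sequence of per-frame coefficient vectors, so the first layers of the network would need to be set up differently in each case.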

Which one is right?

Thanks