EGO4D / social-interactions

TTM: Audio Extraction Technique Mismatch between Ego4d paper and Baseline Implementation #23

Open hars-singh opened 1 year ago

hars-singh commented 1 year ago

Hi,

In the current implementation, raw audio is passed directly to the resse network. In the Ego4D paper, however, MFCC features are extracted first and then passed to resse.
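To make the difference between the two input paths concrete, here is a minimal NumPy-only sketch. It is not taken from the baseline code or the paper: the function names (`mel_filterbank`, `mfcc`) and every parameter value (16 kHz sample rate, 25 ms window, 10 ms hop, 40 mel bands, 13 coefficients) are assumptions chosen for illustration only.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale (hypothetical helper).
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):
            fb[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(wave, sr=16000, n_fft=512, win=400, hop=160, n_mels=40, n_mfcc=13):
    # All defaults are illustrative assumptions, not values from the repo or paper.
    n_frames = 1 + (len(wave) - win) // hop
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = wave[idx] * np.hamming(win)                       # (n_frames, win)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2  # (n_frames, n_fft//2 + 1)
    log_mel = np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)
    # Unnormalized DCT-II over the mel axis, keeping the first n_mfcc coefficients.
    n = np.arange(n_mels)
    basis = np.cos(np.pi * (n[None, :] + 0.5) * np.arange(n_mfcc)[:, None] / n_mels)
    return log_mel @ basis.T                                   # (n_frames, n_mfcc)

# One second of a synthetic 440 Hz tone stands in for a real audio clip.
sr = 16000
t = np.arange(sr) / sr
wave = np.sin(2 * np.pi * 440.0 * t)

raw_input = wave[None, :]     # path 1: raw waveform, shape (1, 16000)
mfcc_input = mfcc(wave, sr)   # path 2: MFCC features, shape (98, 13)
print(raw_input.shape, mfcc_input.shape)
```

The two paths produce differently shaped inputs: a long 1-D sample sequence versus a much shorter sequence of per-frame coefficient vectors. The first layers of the audio network would therefore need to match whichever one is intended.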

Which one is right?

Thanks