ddlBoJack / emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

OOM while processing IEMOCAP dataset #7

Closed: AirHorizons closed this issue 10 months ago

AirHorizons commented 10 months ago

I was trying to create IEMOCAP embeddings on my own, but my GPU with 8 GB of memory gave me a CUDA OOM error. How much GPU memory do I need to process this dataset?

ddlBoJack commented 10 months ago

Hi, for feature extraction using emotion2vec, the maximum GPU memory usage is only 2-3 GB. Please check again to see whether something is wrong on your side.

AirHorizons commented 10 months ago

Good news: I found my mistake. I was passing in the audio file of the whole dialogue instead of each utterance. Thank you for the reassurance, which prompted me to look for another cause!
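
For anyone hitting the same problem, here is a minimal sketch of cutting a dialogue-level recording into per-utterance clips before feature extraction. It is not code from the emotion2vec repository. The function names, the `(start, end)` segment format in seconds, and the 30-second warning threshold are all assumptions made for illustration; the real boundaries would come from the dataset's own annotation files.

```python
from typing import List, Sequence, Tuple


def split_into_utterances(
    samples: Sequence[float],
    sample_rate: int,
    segments: Sequence[Tuple[float, float]],
) -> List[Sequence[float]]:
    """Slice one long waveform into per-utterance clips.

    `segments` holds hypothetical (start_seconds, end_seconds) pairs,
    e.g. parsed from the dataset's annotation files.
    """
    utterances = []
    for start_s, end_s in segments:
        if end_s <= start_s:
            raise ValueError(f"empty or reversed segment: {start_s} - {end_s}")
        start = int(round(start_s * sample_rate))
        # Clamp to the waveform length in case an annotation overruns the audio.
        end = min(int(round(end_s * sample_rate)), len(samples))
        utterances.append(samples[start:end])
    return utterances


def is_suspiciously_long(
    samples: Sequence[float], sample_rate: int, max_seconds: float = 30.0
) -> bool:
    """Flag inputs that look like whole dialogues rather than utterances.

    The 30-second default is an arbitrary illustrative threshold.
    """
    return len(samples) / sample_rate > max_seconds
```

Running `is_suspiciously_long` on each input before the model sees it is a cheap way to catch a whole dialogue being passed by accident, since memory use grows with input length.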