ddlBoJack / emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

ONNX? #55

Open altunenes opened 1 week ago

altunenes commented 1 week ago

I've been working with the emotion2vec model and trying to convert it to ONNX format for deployment purposes. The current implementation is great for PyTorch users, but having ONNX support would enable broader deployment options.

I tried converting the model using torch.onnx.export with various approaches:

- Direct conversion of the AutoModel
- Creating a wrapper around the model components
- Implementing custom forward passes

Main challenges encountered:

- Dimension mismatches in the conv1d layers
- Issues with the masking mechanism
- Difficulties preserving the complete model architecture
- Problems with tensor handling between components
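For the conv1d dimension mismatches in particular, it can help to trace the expected shapes by hand before exporting. The sketch below does that in plain Python. It assumes a wav2vec 2.0-style convolutional feature encoder with no padding (the `ASSUMED_CONV_LAYERS` spec and the helper names are illustrative, not taken from the emotion2vec code), so the layer values should be checked against the actual checkpoint config.

```python
# Hypothetical layer spec: (out_channels, kernel_size, stride) per Conv1d layer.
# These are wav2vec 2.0-style defaults; whether emotion2vec uses the same
# values is an assumption to verify against the model config.
ASSUMED_CONV_LAYERS = [(512, 10, 5)] + [(512, 3, 2)] * 4 + [(512, 2, 2)] * 2


def conv1d_out_length(length, kernel_size, stride, padding=0, dilation=1):
    """Output length of a 1-D convolution (the formula documented for torch.nn.Conv1d)."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def trace_shapes(num_samples, layers=ASSUMED_CONV_LAYERS, batch=1):
    """Return the (batch, channels, time) shape of the input and after each layer."""
    # Conv1d expects (batch, channels, time), so a raw waveform needs a channel axis.
    shapes = [(batch, 1, num_samples)]
    length = num_samples
    for out_channels, kernel_size, stride in layers:
        length = conv1d_out_length(length, kernel_size, stride)
        if length < 1:
            raise ValueError("input is too short for this conv stack")
        shapes.append((batch, out_channels, length))
    return shapes


if __name__ == "__main__":
    # One second of 16 kHz audio.
    for index, shape in enumerate(trace_shapes(16000)):
        print(index, shape)
```

Under these assumptions, one second of 16 kHz audio ends up as a `(1, 512, 49)` tensor after the last conv layer. Comparing this trace with the shapes seen during export may show whether a mismatch comes from a missing channel axis or from a dummy input that is too short.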

Could you please provide guidance on the correct architecture for ONNX conversion, including an example of the proper tensor dimensions through the model? I have converted torchvision models to ONNX before, but the audio models seem a bit complicated to me :/

Thank you very much for your work; it works really well!

Also see: https://github.com/modelscope/FunASR/issues/1690

ddlBoJack commented 4 days ago

We do not provide an ONNX model. Contributions are welcome :)