nttcslab / byol-a

BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation
https://arxiv.org/abs/2103.06695

About inference speed? #14

Closed: xiao2mo closed this issue 2 years ago

xiao2mo commented 2 years ago

Hi, is there any inference speed evaluation? And how should long audio be handled in production? Many thanks for your great work.

daisukelab commented 2 years ago

Hello @xiao2mo, thank you for your interest! We're sorry, but we have not evaluated inference speed. This work is about learning general-purpose audio representations, which we consider basic research rather than something at the mature, production-ready stage. Bringing it up to production level would raise many things to discuss, including how to handle long input audio. So my suggestion is to try it for your own purpose and share the problems you run into, hopefully along with a new paper that proposes solutions for them! :)
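For anyone who wants to try this themselves, here is a minimal sketch of one possible approach. It is not taken from the repository. It cuts a long waveform into fixed-length chunks, embeds each chunk, averages the embeddings, and times the whole call. `dummy_encoder` is a hypothetical stand-in for whatever model you load, and the 16 kHz sample rate and 1-second chunk length are assumptions chosen only for illustration.

```python
import time
import numpy as np


def split_into_chunks(wav, chunk_len, hop_len):
    """Split a 1-D waveform into fixed-length chunks, zero-padding the last one."""
    if len(wav) <= chunk_len:
        starts = [0]
    else:
        n_chunks = int(np.ceil((len(wav) - chunk_len) / hop_len)) + 1
        starts = [i * hop_len for i in range(n_chunks)]
    chunks = []
    for start in starts:
        chunk = wav[start:start + chunk_len]
        if len(chunk) < chunk_len:
            chunk = np.pad(chunk, (0, chunk_len - len(chunk)))
        chunks.append(chunk)
    return np.stack(chunks)  # shape: (n_chunks, chunk_len)


def dummy_encoder(chunks):
    """Hypothetical stand-in for a real model: maps (n, chunk_len) to (n, 4)."""
    return np.stack(
        [chunks.mean(axis=1), chunks.std(axis=1),
         chunks.min(axis=1), chunks.max(axis=1)],
        axis=1,
    )


def embed_long_audio(wav, encoder, chunk_len, hop_len):
    """Embed every chunk, then average over chunks to get one clip-level vector."""
    chunks = split_into_chunks(wav, chunk_len, hop_len)
    return encoder(chunks).mean(axis=0)


def time_inference(fn, *args, repeats=5):
    """Return the average wall-clock seconds per call, after one warm-up call."""
    fn(*args)
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    sample_rate = 16000  # assumed sample rate
    wav = np.random.default_rng(0).standard_normal(sample_rate * 10 + 123)
    emb = embed_long_audio(wav, dummy_encoder, sample_rate, sample_rate)
    secs = time_inference(embed_long_audio, wav, dummy_encoder,
                          sample_rate, sample_rate)
    print(emb.shape, f"{secs * 1000:.2f} ms per call")
```

Averaging chunk embeddings is only one choice. Overlapping chunks (a smaller `hop_len`) or a different pooling may suit your task better, and timings measured with a real model on a GPU would also need device synchronization before reading the clock.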

daisukelab commented 2 years ago

Closing for now. Please reopen if you need more assistance.