descriptinc / audiotools

Object-oriented handling of audio data, with GPU-powered augmentations, and more.
https://descriptinc.github.io/audiotools/
MIT License

Add Whisper feature extraction methods #78

Closed iyaja closed 1 year ago

iyaja commented 1 year ago

This PR adds methods to obtain Whisper input features, embeddings, and transcripts from an AudioSignal.

This new functionality allows developers to leverage the Whisper model for a wide range of audio processing tasks within the audiotools library.
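As a rough illustration of how such methods could be attached to a signal class, here is a minimal sketch using a mixin. Everything in it is an assumption made for illustration: `WhisperMixin`, `Signal`, `whisper_features`, and `fake_extractor` are hypothetical names, not the API added by this PR, and the extractor is a stand-in that does not compute Whisper's real input features.

```python
class WhisperMixin:
    """Hypothetical mixin that adds a Whisper-style feature method.

    It relies on the host class providing `samples` and `sample_rate`.
    """

    def whisper_features(self, extractor):
        # Delegate to an injected callable so the mixin itself has no
        # hard dependency on any particular Whisper implementation.
        return extractor(self.samples, self.sample_rate)


class Signal(WhisperMixin):
    """Hypothetical stand-in for an audio signal container."""

    def __init__(self, samples, sample_rate):
        self.samples = list(samples)
        self.sample_rate = sample_rate


def fake_extractor(samples, sample_rate):
    # Placeholder only: returns absolute sample values, NOT log-mel features.
    return [abs(s) for s in samples]


sig = Signal([0.0, 1.0, -1.0, 0.5], sample_rate=16000)
features = sig.whisper_features(fake_extractor)
```

With this layout the Whisper-specific code stays in one class, and the signal class only has to list the mixin among its bases.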

sotelo commented 1 year ago

@pseeth could you also review this PR? There are things for which I lack context, for instance whether we should implement this as a Mixin class.

sotelo commented 1 year ago

Thank you, @pseeth!