jina-ai / jina

☁️ Build multimodal AI applications with cloud-native stack
https://docs.jina.ai
Apache License 2.0

add spectrogram based encoder for audio #597

Closed anish2197 closed 4 years ago

anish2197 commented 4 years ago

Describe the feature: A CNN encoder that extracts features from images of spectrograms (spectrograms usually give a more complete representation of non-speech sounds than mel features).

Your proposal: Use scipy.signal.spectrogram to generate the spectrogram image, and use a pre-trained Keras/PyTorch CNN to extract the features.
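A minimal sketch of the first half of this proposal, assuming NumPy and SciPy are available. It turns a 1-D audio signal into a 3-channel image array of the kind most pre-trained image CNNs accept. The function name `spectrogram_image`, the 224x224 output size, and the `nperseg`/`noverlap` values are illustrative assumptions, not part of the proposal. The CNN step is left out on purpose: the returned array would be passed to whichever Keras or PyTorch model is chosen.

```python
import numpy as np
from scipy import signal


def spectrogram_image(samples, sample_rate, size=224):
    """Turn a 1-D audio signal into a (size, size, 3) uint8 image.

    Hypothetical helper: the name, the output size, and the window
    settings below are assumptions made for illustration.
    """
    # Power spectrogram: rows are frequency bins, columns are time frames.
    _, _, sxx = signal.spectrogram(
        samples, fs=sample_rate, nperseg=512, noverlap=256
    )
    # Log-compress so that quiet components remain visible.
    log_sxx = 10.0 * np.log10(sxx + 1e-10)
    # Min-max scale to the 0..255 range used by 8-bit images.
    lo, hi = log_sxx.min(), log_sxx.max()
    scaled = (log_sxx - lo) / max(hi - lo, 1e-12) * 255.0
    # Nearest-neighbour resize to size x size by sampling indices.
    rows = np.linspace(0, scaled.shape[0] - 1, size).round().astype(int)
    cols = np.linspace(0, scaled.shape[1] - 1, size).round().astype(int)
    resized = scaled[np.ix_(rows, cols)]
    # Flip vertically so low frequencies sit at the bottom of the image.
    resized = resized[::-1]
    # Replicate the single channel three times to mimic an RGB image.
    return np.stack([resized] * 3, axis=-1).astype(np.uint8)


# One second of a 440 Hz tone sampled at 16 kHz as stand-in audio.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)
image = spectrogram_image(tone, sample_rate)
```

The nearest-neighbour resize keeps the sketch dependency-free; an image library would give smoother interpolation if that matters for the chosen CNN.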

Is it a good idea to spend time implementing this?

alexcg1 commented 4 years ago

Hey Anish, thanks for the issue submission. The team'll look into it and get back to you soon. Extracting features from images of spectrograms sounds pretty kick-ass to me :smile:

JoanFM commented 4 years ago

Hey @anish2197

The idea sounds really interesting. I think it would make a lot of sense to add it as a hub image at jina-hub.

You can see examples and a guide on how to create them at https://github.com/jina-ai/jina-hub

anish2197 commented 4 years ago

Thanks Joan. I'll look into the guide and try to implement this on jina-hub.

jina-bot commented 4 years ago

This issue is stale because it has been open 20 days with no activity. Remove the stale label or comment, or this will be closed in 4 days.