keunwoochoi / kapre

kapre: Keras Audio Preprocessors
MIT License

output channel of melspectrogram #45

Closed: kyungyunlee closed this issue 5 years ago

kyungyunlee commented 5 years ago

Hi, I am trying to run a 1D convolution on a mel-spectrogram computed with kapre, but the Melspectrogram layer returns a 4D output, which seems to assume that a 2D operation will follow. For now I remove the channel dimension and swap the axes after the Melspectrogram layer. Any thoughts on allowing a 3D output with no channel dimension and an axis ordering suited to 1D operations in Keras? (Or am I missing something in the docs?)

Thanks :)
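For readers following along, the workaround described in the comment above (drop the channel dimension, then swap axes) can be sketched on a plain array. This is only an illustration: it assumes the layer output is ordered `(batch, n_mels, n_time, 1)`, and the sizes are made up rather than taken from the thread.

```python
import numpy as np

batch, n_mels, n_time = 2, 96, 100  # illustrative sizes, not from the thread

# Stand-in for the 4D layer output; the (batch, n_mels, n_time, 1)
# ordering is an assumption about the channels-last case.
mel_4d = np.zeros((batch, n_mels, n_time, 1), dtype=np.float32)

# Step 1: drop the singleton channel axis -> (batch, n_mels, n_time)
mel_3d = np.squeeze(mel_4d, axis=-1)

# Step 2: swap frequency and time -> (batch, n_time, n_mels),
# i.e. (batch, steps, channels), the layout a 1D convolution over time expects
mel_for_conv1d = np.transpose(mel_3d, (0, 2, 1))

print(mel_for_conv1d.shape)  # (2, 100, 96)
```

Inside a Keras model the same two steps could probably be expressed with a `Permute` layer followed by a `Reshape` layer placed right after the Melspectrogram layer.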

keunwoochoi commented 5 years ago

By 1D conv, do you mean something like what Sander did in his 2013/2014 papers? In that case you can realise it with a 2D kernel that is as 'high' as the whole frequency axis.
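A small numerical sketch of why this suggestion works: a 2D kernel as tall as the frequency axis can only slide along time, so it computes the same values as a 1D convolution over time that treats the mel bins as input channels. The sizes and variable names below are hypothetical, and plain NumPy loops stand in for the actual Keras layers.

```python
import numpy as np

rng = np.random.default_rng(0)
n_mels, n_time, k_time = 8, 20, 3  # illustrative sizes, not from the thread
k_freq = n_mels                    # kernel is as 'high' as the frequency axis

spec = rng.standard_normal((n_mels, n_time))    # one single-channel mel-spectrogram
kernel = rng.standard_normal((k_freq, k_time))  # full-height 2D kernel

# 2D "valid" cross-correlation. Vertically the kernel fits in exactly one
# position, so the output has height 1 and varies only along time.
out_2d = np.array([[np.sum(spec[f:f + k_freq, t:t + k_time] * kernel)
                    for t in range(n_time - k_time + 1)]
                   for f in range(n_mels - k_freq + 1)])

# 1D "valid" cross-correlation over time with the mel bins as channels.
frames = spec.T        # (n_time, n_mels): steps x channels
kernel_1d = kernel.T   # (k_time, n_mels)
out_1d = np.array([np.sum(frames[t:t + k_time] * kernel_1d)
                   for t in range(n_time - k_time + 1)])

print(out_2d.shape, out_1d.shape)      # (1, 18) (18,)
print(np.allclose(out_2d[0], out_1d))  # True
```

With this approach the 4D output can be kept as is; the only thing left over is a height-1 frequency axis in the result, which can be squeezed away later if needed.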

kyungyunlee commented 5 years ago

Yes, I was referring to that kind of 1D conv. OK, I guess I could do that or add a reshaping layer. I was wondering if there was an option to create a 3D output, but I can work around it. Thanks!