Open T4phage76 opened 1 week ago
Thanks for the issue! I think this was probably meant for the cochdnn repo (https://github.com/jenellefeather/cochdnn), and not this tensorflow cochleagram repo?
That said, the models in that repository are all trained on 2-second sounds, and the architecture and cochleagram are built for this. There are things you can do if you want activations for longer sounds (remove the final fully connected layer, and modify the cochleagram so that it can take in 10-second inputs), but the predictions you get out won't be meaningful.
Hi Jenelle,
Thanks for your reply. Yes, this issue should go to the cochdnn repo (https://github.com/jenellefeather/cochdnn). I apologize for opening it in the wrong place. I probably had too many GitHub pages open at the same time. Should I open a new issue there and add this link?
Regarding the issue itself, I can make my stimuli under 2 seconds for sure. I will probably try modifying the model to get the activations (the activations are what I actually want), but I'm not sure whether it makes practical sense to use the activations for a 10-second audio clip from this altered model, since it was trained and tested to be behaviorally and neurally predictive only with 2-second audio.
Another way that might work for longer audio is to use /cochdnn/robustness/audio_models/kell2018.py, but loaded with weights from /kelletal2018/network/weights.
What do you think? Thank you so much!
All of the models in the cochdnn repo are trained with 2-second sounds, so you will have the same 2-second issue with all of them. The activations for the convolutional layers are probably still fine with longer sounds, minus boundary handling changes from the convolutions, which could slightly change things. You could run some tests and see. Good luck!
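As a rough illustration of the boundary-handling point above, here is a toy sketch using a single 1-D convolution in NumPy. It does not use the cochdnn models; the filter, signal, and clip position are all made-up placeholders. It compares the output of convolving a long signal against the output of convolving a clip cut from it: the interior samples agree, and only the samples within half a kernel width of the clip edges differ.

```python
import numpy as np

rng = np.random.default_rng(0)
kernel = rng.standard_normal(9)          # hypothetical 1-D filter (odd length)
long_signal = rng.standard_normal(1000)  # stand-in for a long input
start, clip_len = 300, 200               # arbitrary clip position and length
clip = long_signal[start:start + clip_len]

# "same" mode zero-pads at the edges, so each input has its own boundary region.
out_long = np.convolve(long_signal, kernel, mode="same")
out_clip = np.convolve(clip, kernel, mode="same")

half = len(kernel) // 2
diff = np.abs(out_long[start:start + clip_len] - out_clip)
interior = diff[half:-half]                          # away from the clip edges
edges = np.concatenate([diff[:half], diff[-half:]])  # within half a kernel of an edge

print("max interior difference:", interior.max())
print("max edge difference:", edges.max())
```

With several stacked convolutional layers the affected region near each edge grows with depth, so a real test on the actual models would be the way to see how much it matters in practice.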
Gotcha! Thanks!
Description
Hey, I'm using this model to process some of my custom audio samples. I ran into an issue (see below) when feeding the model audio samples longer than 2 seconds (Fs = 20 kHz). If I manually select the first 40000 (2 sec * 20000 Hz) data points from my sample audio with sample_audio[:, :40000] and run the model on that 40000-sample chunk, it works. Does this model only accept 2-second audio clips? Is there any way I can use an audio clip of arbitrary length? Thanks! The error report and my code are listed below.
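For reference, a minimal sketch of the truncation described above (keeping the first 40000 samples), plus one possible way to cut a longer recording into consecutive 2-second clips. The helper name `split_into_clips`, the zero-padding of the last clip, and the random placeholder signal are assumptions for illustration; none of this comes from the cochdnn code, and it says nothing about whether per-clip model outputs are meaningful for a given stimulus.

```python
import numpy as np

SR = 20000                    # sampling rate from the issue (Hz)
CLIP_SECONDS = 2              # clip length the models expect
CLIP_LEN = SR * CLIP_SECONDS  # 40000 samples

def split_into_clips(audio, clip_len=CLIP_LEN):
    """Split (batch, time) audio into consecutive clips of clip_len samples.

    The last clip is zero-padded if the length is not a multiple of clip_len
    (an assumption; dropping the remainder is another option).
    Returns an array of shape (batch, n_clips, clip_len).
    """
    audio = np.atleast_2d(audio)
    n_batch, n_samples = audio.shape
    n_clips = max(1, -(-n_samples // clip_len))  # ceiling division
    padded = np.zeros((n_batch, n_clips * clip_len), dtype=audio.dtype)
    padded[:, :n_samples] = audio
    return padded.reshape(n_batch, n_clips, clip_len)

# Placeholder 10-second signal standing in for real audio.
sample_audio = np.random.randn(1, 10 * SR).astype(np.float32)

first_clip = sample_audio[:, :CLIP_LEN]  # first 2 seconds only
clips = split_into_clips(sample_audio)   # every 2-second segment

print(first_clip.shape, clips.shape)
```

Each `clips[:, i]` then has the same shape as the 40000-sample input that already works.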
The error
My implementation
THANK YOU SO MUCH!