earthspecies / audio-embeddings

pushed two refactored notebooks #5

Closed radekosmulski closed 2 years ago

radekosmulski commented 3 years ago

I pushed a refactored version of 02 and 03. 02 is a simpler setup where we are not doing anything too fancy (no normalization, no sampling of epochs in a specific way, no implementation of an RNN loop). 03 is a model that has been optimized for fast training and evaluation. It also incorporates a couple of more advanced techniques that I feel can help with training (but additional code is prone to bugs and we still have not demonstrated that our model can learn semantic features!).

If you have any thoughts on these two approaches, it would be great to hear them 🙂. I am still looking for a way to train the models to improve on the semantic tasks. Especially in 03, there is a lot of looking at results. I am hoping to make progress on this within the next couple of days, but another set of eyes on this would be greatly appreciated 🙂.

bs commented 3 years ago

@radekosmulski, whose feedback would be most useful here?

radekosmulski commented 3 years ago

Good question - anyone who has implemented a speech2vec model and actually trained it (but I think this probably makes for a very small group of people 🙂). Someone who has worked with LibriSpeech and the Montreal Forced Aligner could also have some very interesting and valuable feedback.