transfer learning with audio model

sameermahajan commented 2 years ago

In your

https://codelabs.developers.google.com/codelabs/tensorflowjs-audio-codelab/index.html#7

Do you intend to create a brand-new model and train it from scratch, or to apply transfer learning that leverages the already pretrained speech-commands audio model?

rthadur commented 2 years ago

@sameermahajan could you please elaborate on the feature request? Thank you.

sameermahajan commented 2 years ago

@rthadur let me try to explain my feature request a bit more. It concerns the documentation and sample code. As I understand it, the sample code first shows how to use a pretrained model to make predictions. Later it shows how to create a brand-new model and train it from scratch to make predictions. What I am suggesting is that you load the pretrained model, cut it at one of the layers later in the pipeline, extend it with an N-class classification / softmax layer (N = number of classes in your example), and then retrain it for the purpose of the example. That would make use of transfer learning and the previously learned weights, and it would usually give better accuracy than training a brand-new model from scratch.

This is something like what I have done here https://github.com/sameermahajan/Tensorflow.js on mobilenet model. The code is in 'train' function in index.js

rthadur commented 2 years ago

cc @ahmedsabie @pyu10055

jasonmayes commented 2 years ago

I did not write this codelab, but after a quick look it does indeed seem to train a whole new model rather than use the transfer learning approach its title suggests. From my understanding, transfer learning would chop the model at some layer to get features and repurpose those to classify new things (much like I do here in my updated Teachable Machine demo for images: https://codelabs.developers.google.com/tensorflowjs-transfer-learning-teachable-machine#0 ).

@dsmilkov @nsthorat could you possibly shed some light on where the transfer learning part actually is in the audio codelab you wrote? I see you load the audio recognition model in step 3 and use it to predict in steps 4 / 5, then in step 6 you collect fresh data (spectrograms). My guess is that this is where you make use of something in the prior model, but I don't see the model chopped up as I would expect in a transfer learning example, where features based on sounds it already knows how to classify are extracted and built upon. From step 7 onward it seems to train a whole new CNN from zero.

sameermahajan commented 1 year ago

@nsthorat @dsmilkov any update on this? If you give me some code pointers on how to do transfer learning here (especially at which layer and how to chop the model and stitch it to a final softmax layer), I can take a crack at developing a complete example to share, which you can take forward as you see fit. @rthadur

jasonmayes commented 1 year ago

@sameermahajan I now have a course on Google Developers that explains how to do transfer learning with TensorFlow.js using Web ML in one of the chapters. You can find it here: https://goo.gle/Learn-WebML

My example covers image recognition, but sound classification is actually an image problem in some cases, since you are just classifying the spectrogram of the audio, so you should be able to use that as inspiration :-)

tensorflow / tfjs

transfer learning with audio model #6342