Closed qualiaMachine closed 3 months ago
@qualiaMachine yes, it would be nice to include an (optional) episode on transfer learning. In practice we always discuss it, so it would be good to also have some material for it. It would be amazing if you could set something up; check out the curriculum development handbook for inspiration on how to make a pedagogically good episode.
I'll see what I can do! Thanks for the helpful link.
I recently used this guide: https://medium.com/@kenneth.ca95/a-guide-to-transfer-learning-with-keras-using-resnet50-a81a4a28084b for transfer learning with ResNet in Keras. I think it is straightforward, and it also uses the CIFAR10 dataset for fine-tuning, which fits very well with episode 4. We could envision developing some optional material from this.
@qualiaMachine did you have any thoughts (or dreams?) about this recently? If not, we might pick it up.
@svenvanderburg Sadly I have not had the time to pick this up. If you end up forming a small team, I wouldn't mind sitting in on some meetings to help.
Check. If we pick it up, someone will probably work on it individually and submit a PR, which you can then of course provide input on.
A nice blog post on transfer learning with the CIFAR10 dataset that touches on most of the tricky things that can happen with transfer learning. I'm thinking of modelling the lesson on this. Any thoughts? @svenvanderburg, @qualiaMachine
@cpranav93 it's for medium members only, but judging from the title and intro it looks like a good base for an episode!
Here's a notebook version of what the episode would roughly do, still to be improved, annotated, and simplified: transfer_learning_notebook
@cpranav93 yes, I love it! A very convincing demonstration of the power of transfer learning, I think. The code is quite simple (someone who did the course would have no problem at all implementing this), and the accuracy increases from around 55% in episode 4 to 80% with this approach. The only question I have: how fast does this train on a regular CPU, and is that a feasible training time for live coding?
Training for 10 epochs with 1000 training examples and 100 validation samples takes approximately 4 minutes on my machine, which I think is reasonable. Optionally, we could still simplify the additional layers further or reduce the number of samples if needed.
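For reference, the setup discussed above could look roughly like the sketch below: a frozen ResNet50 backbone with a small new classification head for CIFAR10's 10 classes. This is only an illustration, not the notebook's exact architecture; the head sizes are assumptions, and in practice you would pass `weights="imagenet"` (here `weights=None` avoids the pretrained-weight download in a sketch).

```python
# Minimal transfer-learning sketch (assumes TensorFlow/Keras is installed).
# Illustrative only -- layer sizes and hyperparameters are not from the notebook.
import numpy as np
from tensorflow import keras


def build_transfer_model(n_classes=10, input_shape=(32, 32, 3), weights=None):
    # In practice, pass weights="imagenet" to reuse pretrained features;
    # weights=None here just keeps the sketch lightweight.
    base = keras.applications.ResNet50(
        include_top=False, weights=weights, input_shape=input_shape
    )
    base.trainable = False  # freeze the pretrained backbone

    inputs = keras.Input(shape=input_shape)
    x = base(inputs, training=False)            # run backbone in inference mode
    x = keras.layers.GlobalAveragePooling2D()(x)
    x = keras.layers.Dense(64, activation="relu")(x)  # small trainable head
    outputs = keras.layers.Dense(n_classes, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

With pretrained weights, `model.fit` on a CIFAR10 subsample (e.g. the 1000/100 split mentioned above) would then only train the new head, which is what keeps the training time down to a few minutes on CPU.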
Cool! That's indeed reasonable, but it would be good to mention that it can take a while. I think you have a relatively powerful CPU; sometimes students only have access to pretty old/slow CPU machines.
I see there's an old closed issue related to this topic — #46. I would strongly argue in favor of adding transfer learning to these materials. Transfer learning is extremely common in research applications of deep learning because many domains lack sufficiently large datasets. Happy to try to get something started in a new (perhaps optional) episode or as an extension of "advanced layer types".