carpentries-incubator / deep-learning-intro

Learn Deep Learning with Python
https://carpentries-incubator.github.io/deep-learning-intro/

Teaching Intro to Deep Learning #303

Closed NidhiGowdra closed 1 year ago

NidhiGowdra commented 1 year ago

Hello all, I just wanted to make contact with the organizers/creators of the lesson, as I will be teaching it in March 2023 (we are still in the planning phase, so timelines might vary) along with Mike Laverick and others.

svenvanderburg commented 1 year ago

That is great to hear @NidhiGowdra! We are really curious how the lesson material works out for you. Any feedback you have is very useful to us, even just a few lines of comment in this issue. See also https://github.com/carpentries-incubator/deep-learning-intro/issues/178#issuecomment-1072453372 .

Let us know if you need any help in preparing for the course. We use these texts to advertise and communicate about the workshop; maybe they are useful for you as well?

mike-ivs commented 1 year ago

Thanks for the mention @NidhiGowdra, looking forward to getting stuck into this carpentries incubator!

@svenvanderburg for a bit more context, we're looking at presenting a carpentry "intro to Python/ML/DL" workshop in late March 2023. Obviously the Python lessons are well established, but we're pretty excited to delve into the new ML/DL lessons in the carpentry incubators and help trial them out.

For our ML section we are considering using the Intro to ML - SKLearn lesson, which I can see you are somewhat familiar with [here and here] ;) . We will likely try to develop this further to fit into a broader "intro to Python/ML/DL" workshop/context, so I wanted to reach out given your presence in the area and, of course, your above comment! (We're only just jumping into the carpentries ML scene but keen to help build out and trial all the resources.)

svenvanderburg commented 1 year ago

@mike-ivs that is great to hear! Good that you are teaching using this material and nice that you're keen on jumping into the carpentries ML scene, welcome 👋 If you are interested in developing the lesson further, we are organizing a lesson development sprint day on the 8th of March. (no is an answer of course 😋 )

I think there is a lot to choose from regarding ML lessons (as you already read in the issues you reference). Please note that we will trial scikit-learn's material in 2 weeks. I just added this: https://github.com/esciencecenter-digital-skills/lesson-machine-learning-intro/commit/a7ca3ddaa302023e21a596368d127113b1037c9f to the readme of that repo, as we are currently neither using it nor developing it further. You can see our plans for teaching with the scikit-learn material here: https://esciencecenter-digital-skills.github.io/2023-01-30-ds-sklearn/ (check out the syllabus/schedule, for example). But I am also curious to see how the carpentries machine learning novice lesson works for you.

svenvanderburg commented 1 year ago

@mike-ivs and @NidhiGowdra how did it go? Do you have any feedback for the lesson?

NidhiGowdra commented 1 year ago

@svenvanderburg Apologies for my delayed response.

The course went well and we received valuable feedback from the cohort.

Some of the main points are:

  1. How to implement DL models for specific use cases (Admittedly, this is very hard to achieve in an intro lesson).
  2. I think using fewer datasets in the lesson would have made it easier to explain the differences between the models/techniques applied. For example, the penguins dataset and the CIFAR-10 dataset are both used for classification tasks, so they could have been merged. Applying an MLP and a CNN to the same CIFAR-10 dataset would have helped to show and explain how information flows within the model and how convolutions improve performance for image classification.
  3. I think it would have been better to organize the episodes based on task i.e. MLP classification -> CNN classification -> regression.
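
To make point 2 a bit more concrete, here is a rough sketch (not taken from the lesson material) that compares the number of trainable parameters in the first layer of an MLP and of a CNN when both receive a CIFAR-10 image of shape 32x32x3. The layer sizes (100 hidden units, 32 filters of 3x3) are assumptions chosen purely for illustration. Parameter count is only one part of the story, but it hints at why sharing weights across the image helps with image data.

```python
# Sketch: first-layer parameter counts for an MLP vs a CNN on a CIFAR-10 image.
# Layer sizes below are illustrative assumptions, not the lesson's architecture.

def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

def conv2d_params(kernel_h, kernel_w, in_channels, n_filters):
    """Weights plus biases of a 2D convolutional layer (weights shared over positions)."""
    return kernel_h * kernel_w * in_channels * n_filters + n_filters

height, width, channels = 32, 32, 3  # shape of one CIFAR-10 image

# MLP: the image is flattened to 3072 values and fed into 100 hidden units.
mlp_first_layer = dense_params(height * width * channels, 100)

# CNN: 32 filters of size 3x3 slide over the 3 colour channels.
cnn_first_layer = conv2d_params(3, 3, channels, 32)

print("MLP first layer parameters:", mlp_first_layer)  # 307300
print("CNN first layer parameters:", cnn_first_layer)  # 896
```

With these assumed sizes the dense layer needs a separate weight for every pixel and hidden unit, while the convolutional layer reuses the same small filters at every position in the image.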

Technical issues:

  1. We tried to enforce a local install of the packages via Anaconda/pip, but there were issues around the permissions required on university devices.
  2. We ended up using Google Colab for the lessons.
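
For anyone who wants to check a local environment before deciding between a local install and Google Colab, a minimal sketch could look like the following. The package names in `REQUIRED` are assumptions about what a workshop like this typically needs, and `missing_packages` is a hypothetical helper, so adjust both to the actual setup instructions.

```python
# Sketch: report which packages cannot be imported in the current environment.
# REQUIRED is an assumed list; replace it with the lesson's real requirements.
import importlib.util

REQUIRED = ["tensorflow", "sklearn", "pandas", "seaborn"]

def missing_packages(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Not installed locally:", ", ".join(missing))
else:
    print("All required packages were found.")
```

If anything is reported as missing and cannot be installed because of device permissions, a hosted notebook is the fallback.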

Overall, I think it went well and I am keen to redo the lesson in H2-2023.

svenvanderburg commented 1 year ago

Great, thank you for your feedback @NidhiGowdra . And good to hear that running the workshop went well. I will keep this issue open so we can think about how to incorporate your feedback!

svenvanderburg commented 1 year ago

@NidhiGowdra sorry it took so long to get back to your feedback!

  1. For specific use cases/real-world examples I opened #362 . We usually demo some projects to meet this need of students and relate the whole lesson to real-world scientific research.
  2. From the beginning we decided to use different machine learning problems, because it effectively shows how to approach a deep learning problem 3 times. In the end we want students to apply deep learning to their own problem, and that is always a different dataset. We did notice that it was a bit time-consuming to introduce a dataset every time, so we greatly reduced the time we spend on data exploration in #358 . I like your idea of doing MLP on CIFAR-10 first and then CNN to demonstrate the power of CNNs; I created #363 for it.
  3. I tend to disagree. I like the penguins dataset for introducing the topic of deep learning while keeping things simple. Then the weather dataset in episode 3 allows the demonstration of model evaluation, monitoring and hyperparameter tuning. In other words, it takes what we learned in episode 2 to the next step. The fact that it is a regression task is actually only a minor learning objective; in practice the approach does not matter that much. CNNs are definitely the hardest topic in this lesson, so I would still insist on putting them last.

Regarding setup issues: indeed, if students don't have the right permissions to install stuff, Google Colab is a good alternative.

Let me know what you think!

By the way, did you teach the lesson another time? If not, why not? (out of curiosity).

NidhiGowdra commented 1 year ago

@svenvanderburg No worries and thanks for the reply.

  1. I haven't worked on genomic data, but you're right, we can show more examples of the DL projects we have worked on before the start of the lesson. We showcased a few projects, but I guess they were not broad enough. We will add more examples to the updated slides and discuss, highlight and show the real-world practical applications of DL for research/commercial projects.

  2. Agreed, and perfect. Yes, the penguins dataset cements the abstract concept of "features". It is harder to explain the same concept with image datasets.

  3. Fair enough, CNNs do need more time and require concepts from earlier lessons; we will keep the flow the same.

We are teaching the workshop again next month. We are finalizing the lesson plan and slide deck. We are also looking into maybe adding in LLMs (seeing their ever-growing popularity).

I will post back here after the workshop, once I have collated feedback from students. Thanks!

svenvanderburg commented 1 year ago

@NidhiGowdra thanks for your reply! Feel free to open issues if you want to contribute any of your own material or see improvements to the lesson. Indeed, LLMs are hard to ignore; I am curious to hear how teaching about them goes!

I am closing this issue now; you can open a new issue for feedback from the next edition.