datacamp / Applied-Machine-Learning-Ensemble-Modeling-live-training

Live Training Session: Applied Machine Learning: Ensemble Modeling

Notebook review #3

Open alexyarosh opened 4 years ago

alexyarosh commented 4 years ago

Hi @LisaStuart5678!

Thank you for really exceptional work!

I know it might seem like giving too much detail, but this will be invaluable for students who work through the notebook on their own, or who refer back to it later when using their new skills in their own projects!

We have some general requests before the session:

As you prepare for the session...

you'll learn how to create a layer of baseline models, and using packages designed for model stacking, 
another layer to produce a final model with much better-than-baseline performance

I noticed that the models in the notebook don't give better-than-baseline performance, and I realize that it might be too late to change that. Using a different example (or another example in addition to the existing one) would be ideal, but if finding a dataset/model that gives better performance is not possible, I suggest at least going into more detail about possible model improvements. The "Final observation" section is a good start, but I think the learners might find it a bit unsatisfactory. Of course, using

General comments on the notebook

I fixed a few formatting issues and typos, and saved the notebook as Applied_Machine_Learning_Ensemble_modelling-solution.ipynb. Please use that file from now on to make changes.

This will help you plan the session better and avoid going too long without Q&A. This also helps students because they will see that they will have their questions answered soon. Please feel free to look at examples in notebooks for our past trainings!

Just a couple of sentences would be enough. A good place to do this is either at the very beginning of the session, or before "Getting started with Stacking Classifier". For example, the first video in Ch4 of Ensemble Models in Python provides "intuition" behind stacking models. I think students would really love to hear your take on this! In addition to a brief explanation, this would be a good place to include a visualization. You have a great visualization in the "Double stacking" section. I'd love to see something similar for a one-layer model! E.g. here's a picture from the video I referenced: image
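To make the one-layer idea concrete, here is a minimal sketch of what such a stack might look like. It assumes scikit-learn's `StackingClassifier` and a synthetic dataset from `make_classification`; the notebook's actual data, base models, and helper functions are not shown in this thread, so every model choice and variable name below is illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data stands in for the notebook's dataset (an assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Layer 0: baseline models, each trained on the original features.
base_models = [
    ("knn", KNeighborsClassifier()),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
]

# Layer 1: a meta-model trained on the base models' out-of-fold predictions.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(),
    cv=5,
)

# Score every base model and the stack the same way, so the comparison is fair.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in base_models + [("stack", stack)]
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Printing the base-model scores next to the stacked score makes it easy to see whether the extra layer helps. A stack is not guaranteed to beat its best base model, which may be worth saying to learners explicitly.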

This includes parameter names, possible parameter values, etc. I think I got most of these when I was going through the notebook, but double-check!

Specific comments

Stacking classifier

When creating X and y...

It's never mentioned explicitly which variable of the dataset we're going to predict

In "Creating a Naive Classifier"...

In "Custom function # 1: get_stacking()":...

I think this could really be subsumed by the first general request: if you include a brief overview of the principles of stacking, this will be clear by the time we get to writing this custom function!

In "Custom function # 3: evaluate_model(model):"...

In "Evaluate the models and store results"

Stacking regressor

Questions as a learner

Finally, below are some questions that I, as a student in this training, would personally have. Whether you'd like to address them in the notebook, or verbally when conducting the training, or at all, is totally up to you!


Thank you again for excellent work! Please let me know if you have any questions!

LisaStuart5678 commented 4 years ago

Hi Alex,

I've been working on the components for this Live Training as much as time has allowed and want it to be really high quality. Unfortunately, I've not been feeling well the last couple of days and so haven't been able to apply your feedback to the degree that I feel it needs. I'm so sorry, but I really feel like it's best to postpone this training in order to give the student learners the best experience possible.

Also, the links in the slide deck template are super helpful as a guide, but I cannot seem to find the Live Trainings that go along with them, so I can't get a better idea of how other instructors handle going back and forth between the slides and the notebook during the session. I'm sure Adel's Cleaning Data in Python is fantastic, but no matter how I search for it, I cannot locate the Live Training that he did. I only see the actual course. I thought that Kelsey stated that, "All live courses can be found here: https://datacamp.com/courses"? Can you please point me in the right direction?

Thank you so much and I sincerely apologize for any inconvenience. However, I really and truly want to give the preparation for this Live Training as much time as it needs and I just don't think that I can between now and this Thursday given how crappy I'm feeling at the moment.

Warmest Regards, Lisa Stuart MIT Certified Professional Data Scientist lisa5678@uw.edu 206-399-2681

On Fri, Jul 10, 2020 at 3:47 PM Alex Yarosh notifications@github.com wrote:

Assigned #3 https://github.com/datacamp/Applied-Machine-Learning-Ensemble-Modeling-live-training/issues/3 to @LisaStuart5678 https://github.com/LisaStuart5678.


alexyarosh commented 4 years ago

Hi @LisaStuart5678, I'm very sorry to hear that you aren't feeling well! It's no inconvenience at all. I believe you just received an email from Kelsey about rescheduling the training. Please let me know if you have any questions!

Here are a few links to our previous live trainings. They contain the student and solution notebooks, and the recording of the session.

https://www.datacamp.com/resources/webinars/live-training-cleaning-data-in-python
https://www.datacamp.com/resources/webinars/machine-learning-with-scikit-learn
https://www.datacamp.com/resources/webinars/brand-analysis-using-social-media-data-in-r
https://www.datacamp.com/resources/webinars/time-series-analysis-in-python
https://www.datacamp.com/resources/webinars/machine-learning-with-xgboost

Please let me know if I can help with anything, and I hope you get better soon!