gimseng / 99-ML-Learning-Projects

A list of 99 machine learning projects for anyone interested in learning by coding and building projects
MIT License

[EXE] Learn regularizations #96

Open gimseng opened 3 years ago

gimseng commented 3 years ago

Learning Goals

Learn different methods of regularizing models. This could be as basic as L1/L2 (ridge/lasso) regularization, more sophisticated approaches such as the regularization hyperparameters in SVMs, or dropout in neural networks.

Feel free to use any data and models as you see fit.

gimseng commented 3 years ago

Perhaps it makes sense to split this into basic, advanced and neural network exercises as they might involve different techniques.

vishxm commented 3 years ago

So, basically, markdown files are required to help users understand the topics better?

gimseng commented 3 years ago

@vishxm What I meant was creating 1-3 projects, each with a different level of model complexity. Each should include Jupyter/Python code illustrating the use of various regularization techniques. Markdown files explaining the data/exercise/solution are, of course, always assumed. It might be clearer if you check out some of the previous projects.

If you have ideas to contribute to this, feel free to comment further here. Thanks!

nikunj-taneja commented 3 years ago

I would like to take up this exercise @gimseng. This is what I have in mind: a shallow neural network on MNIST/CIFAR10 trained with different kinds/combinations of regularisation using PyTorch, plus an analysis of the results. Alternatively, I could train a CNN on Google Colab and analyse the results of using dropout vs. not using dropout, alongside other regularisations. The final result would be a Jupyter notebook. Let me know what you think!
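Something like the proposed setup could start from a sketch along these lines, assuming PyTorch is available (random tensors stand in for MNIST here; `weight_decay` on the optimiser is L2 regularisation, and `nn.Dropout` handles dropout):

```python
import torch
from torch import nn

torch.manual_seed(0)

# Shallow network with an optional dropout layer after the hidden layer.
def make_model(p_dropout=0.0):
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 128), nn.ReLU(),
        nn.Dropout(p_dropout),            # no-op when p_dropout == 0
        nn.Linear(128, 10),
    )

# Random stand-in for a batch of MNIST images and labels.
X = torch.randn(64, 1, 28, 28)
y = torch.randint(0, 10, (64,))

model = make_model(p_dropout=0.5)
# weight_decay adds an L2 penalty on the weights at each optimiser step.
opt = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)
loss_fn = nn.CrossEntropyLoss()

for step in range(20):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
print(f"final training loss: {loss.item():.3f}")
```

Swapping `p_dropout` and `weight_decay` values (including zero) gives the "with vs. without" comparison described above.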

zchoy001 commented 3 years ago

@gimseng Hi, I would like to create a new project that builds on your existing linear regression code on housing data, adding regularised regression techniques such as ridge and lasso to compare and illustrate the differences in results. Thanks! Please assign me, and let me know if that is OK.
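For reference, a minimal sketch of the kind of comparison described here, assuming scikit-learn (synthetic data is used in place of a housing dataset; the classic Boston housing set has been removed from recent scikit-learn releases):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge, Lasso

# Synthetic regression data: 10 features, only 3 of them informative.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=10.0, random_state=0)

models = {
    "ols":   LinearRegression(),
    "ridge": Ridge(alpha=10.0),  # L2: shrinks all coefficients toward zero
    "lasso": Lasso(alpha=5.0),   # L1: drives weak coefficients to exactly zero
}

for name, model in models.items():
    model.fit(X, y)
    n_zero = np.sum(np.isclose(model.coef_, 0.0))
    print(f"{name:5s}  zero coefficients: {n_zero}/10")
```

The point the comparison would illustrate: lasso zeroes out the uninformative coefficients entirely (feature selection), while ridge only shrinks them.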

AnuravModak commented 10 months ago

Hello @gimseng ,

I'm eager to contribute to this task and help learners grasp the concepts of model regularization. I'm interested in creating an informative guide that covers various regularization techniques, ranging from the basic L1 and L2 (ridge/Lasso) regularization to more advanced methods like regularization hyperparameters in Support Vector Machines (SVM) and dropout regularization in neural networks.

My plan is to create a comprehensive tutorial that includes the following:

Introduction to Model Regularization: Providing an overview of why model regularization is crucial in preventing overfitting and improving generalization.

L1 and L2 Regularization: Explaining the concepts of L1 and L2 regularization, showcasing how they work to penalize model complexity, and their effects on feature selection and coefficient shrinkage. I'll include code examples using a linear regression model.

Regularization in SVM: Demonstrating how regularization is incorporated into SVM through the use of hyperparameters like C. I'll explain how adjusting these hyperparameters affects the trade-off between maximizing the margin and minimizing classification error. I'll provide code examples using the SVM classifier.
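A quick sketch of the C trade-off, assuming scikit-learn (C is the inverse of the regularisation strength: small C means strong regularisation, favouring a wide margin over fitting every training point):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic binary classification data.
X, y = make_classification(n_samples=300, n_features=4, random_state=0)

results = {}
for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    results[C] = clf
    print(f"C={C:>6}: training accuracy {clf.score(X, y):.3f}, "
          f"support vectors {clf.n_support_.sum()}")
```

As C grows, the penalty on margin violations increases, so training accuracy tends to rise and the number of support vectors tends to fall — at the risk of overfitting.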

Dropout Regularization in Neural Networks: Introducing dropout as a technique to prevent overfitting in neural networks. I'll explain the dropout process, how it helps improve network robustness, and how to implement it using a simple neural network architecture. Code examples using a popular deep learning framework will be included.
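The dropout mechanism itself is simple enough to sketch in plain NumPy ("inverted" dropout, as the common frameworks implement it: survivors are scaled at training time so no rescaling is needed at inference):

```python
import numpy as np

def dropout(activations, p=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p during training,
    scaling survivors by 1/(1-p) so the expected activation is unchanged."""
    if not training or p == 0.0:
        return activations            # inference: identity, no rescaling needed
    rng = rng or np.random.default_rng()
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)

rng = np.random.default_rng(0)
h = np.ones((4, 8))                   # a batch of hidden activations
out = dropout(h, p=0.5, rng=rng)
print(out)                            # entries are either 0.0 or 2.0
```

Because each forward pass samples a fresh mask, the network cannot rely on any single unit, which is what improves robustness.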

Comparative Analysis: Comparing the effects of different regularization methods on a common dataset, showcasing how they impact model performance, convergence, and generalization. I'll provide visualizations to help learners understand these differences.

Discussion: Engaging in a discussion on the trade-offs and scenarios where different regularization methods are most effective, considering factors such as dataset size, model complexity, and the nature of the data.

For datasets, I'm considering:

Boston Housing Dataset: A classic regression dataset, suitable for illustrating L1 and L2 regularization techniques.

Iris Dataset: To showcase regularization in SVM, using a classification task.

MNIST Dataset: For demonstrating dropout regularization in a neural network, using image classification.

I'm open to suggestions and feedback on this plan. My goal is to provide a clear and accessible resource that empowers learners to understand and apply various model regularization techniques effectively.

Best regards, Anurav Modak