Personal repository of small notebooks for Machine Learning projects.
These notebooks are my playground for various tools and techniques, so they may contain mistakes or incomplete code. I would be grateful for any comments, which you can submit via GitHub Issues.
Some notebooks need configuration. To provide it, copy the .env.template file to .env and adjust the parameters.
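How each notebook reads .env is not described here; the python-dotenv package is a common choice. As a minimal standard-library sketch of the idea, assuming the file holds plain KEY=VALUE lines, the values could be loaded like this. The helpers `parse_env` and `load_env` and the key names are hypothetical and not taken from .env.template.

```python
import os
from pathlib import Path


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_env(path=".env"):
    """Load values into os.environ without overriding variables already set."""
    env_file = Path(path)
    if not env_file.exists():
        return {}
    values = parse_env(env_file.read_text(encoding="utf-8"))
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


# EXAMPLE_API_KEY and EXAMPLE_REGION are made-up names for illustration only.
sample = "EXAMPLE_API_KEY='abc123'\n# a comment\nEXAMPLE_REGION=westeurope\n"
print(parse_env(sample))
```

Calling `load_env()` at the top of a notebook would then make the settings available through `os.environ`.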
Draft of a blog post on univariate feature selection.
Exercise for FastAI DL course: classification of bird images using CNN and transfer learning (executed on Azure).
Exercise for FastAI DL course: classification of Simpsons character images using CNN and transfer learning (executed on Azure).
Exercise for FastAI DL course: classification of Simpsons character images using CNN and transfer learning (executed on Colab).
Exercise for FastAI DL course: Quora insincere question classification using language model + LSTM (executed on Azure, didn't work well).
Exercise from book "Hands-On Machine Learning": basic data analysis of life satisfaction dataset.
Exercise from book "Hands-On Machine Learning": basic data analysis of housing dataset, plotting data on a map.
Exercise from book "Hands-On Machine Learning": basic experiments with TensorFlow API, TensorFlow graph.
Exercise from book "Hands-On Machine Learning": basic MNIST with DNN via TensorFlow.
Basic chatbot using LangChain.
Making a simple LLM call using LangChain.
Snippets showing basic usage of PySpark.
Basics of PyTorch: tensors, etc.
CIFAR-10 with a simple CNN in PyTorch.
MNIST with PyTorch using simple CNN: from scratch, then using higher-level APIs.
Learning several sentences using an Elman RNN.
Sentiment analysis (IMDB) using LSTM.
Small snippet to remove the toolbar from FastAI notebooks on Azure.
Quora insincere question classification: logistic regression with hashing trick.
Quora insincere question classification: splitting train/test/valid data using stratified sampling.
Quora insincere question classification: analysis of important words for logistic regression classification.
Quora insincere question classification: simplest RNN (unfinished).
Jigsaw toxic comments dataset: simple exploratory data analysis.
Spam classification using spaCy and various classifiers.
Classic Titanic dataset: data analysis, feature engineering, training models (logistic regression, SVM, decision tree, random forest, and gradient boosting with XGBoost, LightGBM, and CatBoost).
Maximal marginal relevance.
Examples of helpful matplotlib graphs (to be extended).
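The maximal marginal relevance entry above is terse, so here is a minimal sketch of the general technique, not necessarily what the notebook does. It greedily picks items that are relevant to a query but dissimilar to items already picked, scoring each candidate as lambda * relevance - (1 - lambda) * redundancy. Cosine similarity over plain Python lists and the names `cosine`, `mmr`, and `lambda_` are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mmr(query, candidates, k, lambda_=0.5):
    """Return indices of up to k candidates, chosen greedily by MMR score."""
    relevance = [cosine(query, c) for c in candidates]
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_index, best_score = None, -math.inf
        for i in remaining:
            # Redundancy: highest similarity to anything already selected.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_ * relevance[i] - (1 - lambda_) * redundancy
            if score > best_score:
                best_index, best_score = i, score
        selected.append(best_index)
        remaining.remove(best_index)
    return selected


# Toy vectors: items 0 and 1 are near-duplicates, item 2 points elsewhere.
query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
print(mmr(query, candidates, k=2, lambda_=0.3))
print(mmr(query, candidates, k=2, lambda_=1.0))
```

With `lambda_=1.0` the selection reduces to ranking by relevance alone; lower values trade relevance for diversity.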