koalaverse / homlr-2ed

Code and resources for the 2nd edition of "Hands-on Machine Learning with R: An applied book covering the fundamentals of machine learning with R" by Boehmke & Greenwell
https://koalaverse.github.io/homlr-2ed/
MIT License

Potential book layout #14

Open bradleyboehmke opened 8 months ago

bradleyboehmke commented 8 months ago

Fundamentals

  1. Introduction to ML
    • What is ML
    • Types of ML systems
    • ML in R
  2. Before the modeling process
    • Problem framing
    • Planning & scoping
    • Experimentation
    • Production
  3. The basic modeling process
    • Data splitting
    • Building models
    • Making predictions
    • Model evaluation
      • Understanding residuals
      • Aggregate residual metrics
      • Performance plots (e.g., ROC curve, lift chart)
  4. Data preprocessing
    • Target engineering
    • Missing values
    • Feature filtering
    • Numeric feature engineering
    • Categorical feature engineering
    • Data compression (PCA)
  5. A more robust modeling process
    • Bias-variance trade-off
    • Resampling
    • Hyperparameter tuning
  6. Model trust
    • Ethics
    • Interpretability vs. Explainability
    • Global explainability
    • Local explainability

Supervised Modeling

  1. Linear regression
  2. Logistic regression
    • Add section on Multinomial problems
  3. Regularized regression
  4. Transitioning to non-linearity
    • Polynomial
    • MARS
    • GAMs
  5. KNN
  6. Decision trees
  7. Bagging
  8. Random forests
  9. Gradient boosting
  10. Support vector machines
  11. Stacked models

Deep Learning

  1. Intro to DL
  2. The DL modeling process
  3. Transfer learning
  4. Computer vision
  5. Word embeddings
  6. Language models

bradleyboehmke commented 8 months ago

@bgreenwell, I thought a lot about our recent discussions and it made me go back and reconsider the layout. Above is a proposed new TOC layout. The middle section doesn't change a whole lot but the first section adds some new content that I think would help set the book apart.

For example, ch 2 would talk about framing and scoping ML problems, along with thinking about production concerns. This is where we can mention topics from the lifecycle of an ML project (e.g., drift) while noting that our book does not focus on them (we can point to other resources).

Also, notice that I removed the unsupervised section and added a DL section. This modernizes the book; plus, I already have a lot of DL notebooks built out that I can migrate, so this isn't starting from scratch.

What are your thoughts?

bgreenwell commented 7 months ago

Lots to discuss at our next catch-up, but here are some (very) high-level thoughts: