chiphuyen / dmls-book

Summaries and resources for Designing Machine Learning Systems book (Chip Huyen, O'Reilly 2022)
https://www.amazon.com/Designing-Machine-Learning-Systems-Production-Ready/dp/1098107969

Designing Machine Learning Systems (Chip Huyen 2022)

Machine learning systems are both complex and unique. Complex because they consist of many different components and involve many different stakeholders. Unique because they're data dependent, with data varying wildly from one use case to the next. In this book, you'll learn a holistic approach to designing ML systems that are reliable, scalable, maintainable, and adaptive to changing environments and business requirements.

The book has been translated into Spanish, Japanese, Korean, Polish, and Thai.

The book is available on Amazon and most places where technical books are sold.

Repo structure

This book focuses on the key design decisions when developing and deploying machine learning systems. This is NOT a tutorial book, so it doesn't have a lot of code snippets. In this repo, you won't find code examples, but you'll find:

Contributions

You're welcome to create issues or submit pull requests. Your feedback is much appreciated!

Who This Book Is For

This book is for anyone who wants to leverage ML to solve real-world problems. ML in this book refers to both deep learning and classical algorithms, with a leaning toward ML systems at scale, such as those seen at medium to large enterprises and fast-growing startups. Systems at a smaller scale tend to be less complex and might benefit less from the comprehensive approach laid out in this book.

Because my background is engineering, the language of this book is geared toward engineers, including ML engineers, data scientists, data engineers, ML platform engineers, and engineering managers.

You might be able to relate to one of the following scenarios:

  1. You have been given a business problem and a lot of raw data. You want to engineer this data and choose the right metrics to solve this problem.
  2. Your initial models perform well in offline experiments and you want to deploy them.
  3. You have little feedback on how your models are performing after your models are deployed, and you want to figure out a way to quickly detect, debug, and address any issue your models might run into in production.
  4. The process of developing, evaluating, deploying, and updating models for your team has been mostly manual, slow, and error-prone. You want to automate and improve this process.
  5. Each ML use case in your organization has been deployed using its own workflow, and you want to lay down the foundation (e.g., model store, feature store, monitoring tools) that can be shared and reused across use cases.
  6. You’re worried that there might be biases in your ML systems and you want to make your systems responsible!

You can also benefit from the book if you belong to one of the following groups:

Review

See what people are saying about the book on Twitter @designmlsys!




Chip Huyen, Designing Machine Learning Systems. O'Reilly Media, 2022.

@book{dmlsbook2022,  
    address = {USA},  
    author = {Chip Huyen},  
    isbn = {978-1098107963},  
    publisher = {O'Reilly Media},  
    title = {{Designing Machine Learning Systems}},  
    year = {2022}  
}