Closed by haoawesome 10 years ago
https://news.ycombinator.com/item?id=1055042
Mike Jordan at Berkeley sent me his list of what people should learn for ML. The list is definitely on the more rigorous side (i.e., aimed more at researchers than practitioners), but going through these books (along with the requisite programming experience) is a useful, if painful, exercise.
I personally think that everyone in machine learning should be (completely) familiar with essentially all of the material in the following intermediate-level statistics book:
1.) Casella, G. and Berger, R.L. (2001). "Statistical Inference" Duxbury Press. http://www.amazon.com/Statistical-Inference-George-Casella/dp/0534243126
For a slightly more advanced book that's quite clear on mathematical techniques, the following book is quite good:
2.) Ferguson, T. (1996). "A Course in Large Sample Theory" Chapman & Hall/CRC.
You'll need to learn something about asymptotics at some point, and a good starting place is:
3.) Lehmann, E. (2004). "Elements of Large-Sample Theory" Springer.
Those are all frequentist books. You should also read something Bayesian:
4.) Gelman, A. et al. (2003). "Bayesian Data Analysis" Chapman & Hall/CRC.
and you should start to read about Bayesian computation:
5.) Robert, C. and Casella, G. (2005). "Monte Carlo Statistical Methods" Springer.
On the probability front, a good intermediate text is:
6.) Grimmett, G. and Stirzaker, D. (2001). "Probability and Random Processes" Oxford.
At a more advanced level, a very good text is the following:
7.) Pollard, D. (2001). "A User's Guide to Measure Theoretic Probability" Cambridge.
The standard advanced textbook is Durrett, R. (2005). "Probability: Theory and Examples" Duxbury.
Machine learning research also reposes on optimization theory. A good starting book on linear optimization that will prepare you for convex optimization:
8.) Bertsimas, D. and Tsitsiklis, J. (1997). "Introduction to Linear Optimization" Athena.
And then you can graduate to:
9.) Boyd, S. and Vandenberghe, L. (2004). "Convex Optimization" Cambridge.
Getting a full understanding of algorithmic linear algebra is also important. At some point you should feel familiar with most of the material in
10.) Golub, G., and Van Loan, C. (1996). "Matrix Computations" Johns Hopkins.
It's good to know some information theory. The classic is:
11.) Cover, T. and Thomas, J. "Elements of Information Theory" Wiley.
Finally, if you want to start to learn some more abstract math, you might want to start to learn some functional analysis (if you haven't already). Functional analysis is essentially linear algebra in infinite dimensions, and it's necessary for kernel methods, for nonparametric Bayesian methods, and for various other topics. Here's a book that I find very readable:
12.) Kreyszig, E. (1989). "Introductory Functional Analysis with Applications" Wiley.
That's too much material.
@googya These books are fairly deep; they are all pitched at the level of a machine learning PhD. “I don't expect anyone to come to Berkeley having read any of these books in entirety, but I do hope that they've done some sampling and spent some quality time with at least some parts of most of them. Moreover, not only do I think that you should eventually read all of these books (or some similar list that reflects your own view of foundations), but I think that you should read all of them three times---the first time you barely understand, the second time you start to get it, and the third time it all seems obvious.”
Sorry, we made a mistake: these four books are also aimed at machine learning PhD students.
Old reading list: https://news.ycombinator.com/item?id=1055042
Source for the new books: http://www.reddit.com/r/MachineLearning/comments/2fxi6v/ama_michael_i_jordan/ckdqzph
"That list was aimed at entering PhD students at Berkeley, who I assume are going to devote many decades of their lives to the field, and who want to get to the research frontier fairly quickly. I would have prepared a rather different list if the target population was (say) someone in industry who needs enough basics so that they can get something working in a few months.
That particular version of the list seems to be one from a few years ago; I now tend to add some books that dig still further into foundational topics. In particular, I recommend A. Tsybakov's book "Introduction to Nonparametric Estimation" as a very readable source for the tools for obtaining lower bounds on estimators, and Y. Nesterov's very readable "Introductory Lectures on Convex Optimization" as a way to start to understand lower bounds in optimization. I also recommend A. van der Vaart's "Asymptotic Statistics", a book that we often teach from at Berkeley, as a book that shows how many ideas in inference (M estimation---which includes maximum likelihood and empirical risk minimization---the bootstrap, semiparametrics, etc.) repose on top of empirical process theory. I'd also include B. Efron's "Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction", as a thought-provoking book."
http://www.amazon.com/Introduction-Nonparametric-Estimation-Springer-Statistics/dp/1441927093 Introduction to Nonparametric Estimation
http://www.amazon.com/Introductory-Lectures-Convex-Optimization-Applied/dp/1402075537 Introductory Lectures on Convex Optimization
http://www.amazon.com/Asymptotic-Statistics-Statistical-Probabilistic-Mathematics/dp/0521784506 Asymptotic Statistics
http://www.amazon.com/Large-Scale-Inference-Estimation-Prediction-Mathematical/dp/110761967X Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction