We need to divide a big programming problem into smaller sub-problems before we can effectively implement it.
Top-down design is a systematic technique for breaking a problem down into small subtask functions.
In top-down design, we seek small functions that solve well-defined tasks and that can be used by one or more other functions.
Author identification is the process of guessing who wrote a book whose author is unknown (a "mystery book").
We can use features about words (e.g., average word length) and sentences (e.g., average number of words per sentence) to characterize how each known author writes.
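As a rough illustration of the two example features named above, the following sketch computes each one from a string. The function names, the punctuation set, and the rule that a sentence ends at `.`, `!`, or `?` are assumptions made for this example, not a definitive way to split text.

```python
import re


def average_word_length(text: str) -> float:
    """Return the mean number of characters per word in text."""
    # Assumption: words are whitespace-separated, and surrounding
    # punctuation does not count toward a word's length.
    words = [w.strip(".,;:!?\"'()") for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(len(w) for w in words) / len(words)


def average_words_per_sentence(text: str) -> float:
    """Return the mean number of words per sentence in text."""
    # Assumption: a sentence ends at one or more of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)
```

Each function returns a single number, so the results can be collected into a list that describes one text.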
Machine learning is an important area of computer science that investigates how machines can learn from data and make predictions.
In supervised learning, we have some training data in the form of objects (e.g., books) and their categories (who wrote each book). We can learn from that data to make predictions about new objects.
A signature is a list of feature values that characterizes one object; each object has its own signature.
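One way to picture how signatures support prediction is sketched below: each known author has a signature, and the unknown book is assigned to the author whose signature is closest. The distance measure (sum of absolute differences), the function names, and all of the numbers are assumptions invented for illustration.

```python
def signature_distance(sig_a: list[float], sig_b: list[float]) -> float:
    """Return the sum of absolute differences between two signatures."""
    # Assumption: both signatures list the same features in the same order.
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b))


def closest_author(known: dict[str, list[float]], unknown: list[float]) -> str:
    """Return the author whose signature is nearest to the unknown one."""
    return min(known, key=lambda author: signature_distance(known[author], unknown))


# Made-up training data: one signature per known author, in the form
# [average word length, average words per sentence].
known_signatures = {
    "Author A": [4.1, 18.0],
    "Author B": [4.6, 25.5],
}
mystery_signature = [4.5, 24.0]

print(closest_author(known_signatures, mystery_signature))
```

With these invented numbers, the mystery signature is closer to Author B's, so that is the prediction. A fuller version might weight some features more heavily than others.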
When we’re ready to implement our functions that arose from top-down design, we implement them from the bottom up; that is, we implement the leaf functions first, then functions that depend on those leaf functions, and so on until we implement the topmost function.
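The bottom-up order can be seen in a small sketch in which each function is written only after the functions it calls. The three function names and the feature they compute (the ratio of distinct words to total words) are hypothetical choices for this example.

```python
# Step 1: the leaf function, which calls no other function of ours.
def clean_word(word: str) -> str:
    """Return word in lowercase with surrounding punctuation removed."""
    return word.strip(".,;:!?\"'()").lower()


# Step 2: a function that depends only on the leaf.
def get_words(text: str) -> list[str]:
    """Return the cleaned, non-empty words in text."""
    words = [clean_word(w) for w in text.split()]
    return [w for w in words if w]


# Step 3: the topmost function, written last.
def different_to_total(text: str) -> float:
    """Return the number of distinct words divided by the total word count."""
    words = get_words(text)
    if not words:
        return 0.0
    return len(set(words)) / len(words)
```

Because `clean_word` exists and can be tested before `get_words` is written, a problem in the higher-level function is easier to trace to its own code rather than to its helpers.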
Refactoring code means improving its design (e.g., by reducing code repetition) without changing what the code does.
Summary