Contains edits to the Overview and M4.1, and the redrafting of M4.2 (currently in planning stage).
The plan for M4.2 redraft is:
We have learned about parameters and probability distributions.
Models are specified by parameters.
The simplest model is the average.
Models are fitted to data through a cost function.
The average is the value that minimises the cost (for a squared-error cost function).
(do a multi-panel graph showing this)
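A possible sketch for this graph (assuming Python with NumPy/Matplotlib; the sample is a made-up toy):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=50)  # made-up toy sample

# Squared-error cost of predicting a single constant value mu for every point
def cost(mu, x):
    return np.mean((x - mu) ** 2)

candidates = np.linspace(data.min(), data.max(), 200)
costs = [cost(mu, data) for mu in candidates]

fig, axes = plt.subplots(1, 2, figsize=(9, 3.5))
axes[0].hist(data, bins=15)
axes[0].axvline(data.mean(), color="k", linestyle="--", label="mean")
axes[0].set(title="Data", xlabel="x")
axes[0].legend()

axes[1].plot(candidates, costs)
axes[1].axvline(data.mean(), color="k", linestyle="--")
axes[1].set(title="Cost is minimised at the mean",
            xlabel="candidate value", ylabel="mean squared error")
plt.tight_layout()
plt.show()
```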
But this doesn't capture variability. We can fit a normal distribution instead.
Two common ways of fitting: maximum likelihood estimation (MLE) and minimising the root-mean-square error (RMSE). These cost functions have a global minimum; cost functions in general can get very complicated.
(Code and graph)
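A possible sketch for the code here (assuming Python with SciPy; the sample is a made-up toy):

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

rng = np.random.default_rng(1)
data = rng.normal(loc=5.0, scale=2.0, size=100)  # made-up toy sample

# Negative log-likelihood of a normal distribution: the MLE cost function
def nll(params, x):
    mu, log_sigma = params  # optimise log(sigma) so sigma stays positive
    return -norm.logpdf(x, loc=mu, scale=np.exp(log_sigma)).sum()

result = minimize(nll, x0=[0.0, 0.0], args=(data,))
mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])

# The optimiser should land on the closed-form MLE answers:
# the sample mean and the (ddof=0) sample standard deviation
print(mu_hat, data.mean())
print(sigma_hat, data.std())
```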
We won't cover the algorithms of how models get fitted.
Modelling relationships: Regression.
Most RQs are interested in modelling relationships between variables. Instead of modelling the parameters of a distribution directly, we can make the parameters dependent on other variables.
The simplest method of modelling relationships is to assign a model parameter for each input variable that you want to consider.
More complex models come from adding in more variables and mathematically specifying the nature of the relationship between variables (e.g. interactions).
Each input to your model and its associated parameter (called a coefficient) can be thought of as a mini-hypothesis: given that I know all the other inputs to the model, how much does this input contribute to the prediction? (Code sketch below.)
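A minimal sketch, assuming Python/NumPy and made-up true coefficients:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.5 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.5, size=n)  # made-up truth

# One parameter per input variable, plus an intercept
X = np.column_stack([np.ones(n), x1, x2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)

# Each coefficient answers: holding the other inputs fixed,
# how much does a one-unit change in this input move the prediction?
print(coefs)  # should be close to [1.5, 2.0, -0.5]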
How many parameters?
It depends what you are interested in and what you want to learn.
With enough parameters you can fit anything. But the model will not generalise well. We could just be fitting to noise.
(Bishop toy example).
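A possible sketch of the Bishop-style toy example (assuming Python/Matplotlib; noisy samples from a sine curve, with polynomial degrees chosen for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 1, size=10))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)  # noisy sine
grid = np.linspace(0, 1, 200)

# Low degree underfits; degree 9 with 10 points fits the noise exactly
fig, axes = plt.subplots(1, 3, figsize=(12, 3.5), sharey=True)
for ax, degree in zip(axes, [1, 3, 9]):
    coefs = np.polyfit(x, y, deg=degree)
    ax.scatter(x, y, color="k")
    ax.plot(grid, np.polyval(coefs, grid))
    ax.plot(grid, np.sin(2 * np.pi * grid), linestyle="--")  # true function
    ax.set(title=f"degree {degree}", ylim=(-1.8, 1.8))
plt.tight_layout()
plt.show()
```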
From linear regression to logistic regression.
The data considered so far have a continuous outcome variable, modelled with a normal distribution.
Now suppose we have dichotomised our outcome variable.
Logistic regression is conceptually similar in that we predict a parameter of a distribution, only this time it is a Bernoulli distribution.
We obtain class predictions by assuming that if the mean of the Bernoulli distribution is below some threshold X, we class the prediction as Y. (Code sketch below.)
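A minimal sketch, assuming Python with SciPy, a made-up toy dataset, and a 0.5 threshold:

```python
import numpy as np
from scipy.special import expit  # the logistic function
from scipy.optimize import minimize

rng = np.random.default_rng(4)
x = rng.normal(size=200)
y = rng.binomial(1, expit(3.0 * x - 0.5))  # made-up binary outcome

# Bernoulli negative log-likelihood: the coefficients set the mean p of a
# Bernoulli distribution via the logistic function
def nll(params, x, y):
    p = expit(params[0] + params[1] * x)
    p = np.clip(p, 1e-12, 1 - 1e-12)  # guard against log(0)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).sum()

result = minimize(nll, x0=[0.0, 0.0], args=(x, y))
b0, b1 = result.x

# Predicted Bernoulli mean, then threshold at 0.5 to get a class prediction
p_hat = expit(b0 + b1 * x)
y_hat = (p_hat >= 0.5).astype(int)
print("accuracy:", (y_hat == y).mean())
```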
We will step through the development at the beginning of the next section.
[qu: how much of the logistic regression dev in the next section do we include here?]