rgrohit opened 9 years ago
Would it be possible to use the bootstrapping technique to decrease bias within a data set? I know this is done in social sciences, but there are definitely some implications when doing this in a medical environment.
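One standard use of the bootstrap along these lines is estimating (and then subtracting off) the bias of a statistic, rather than changing the model. A minimal sketch, assuming synthetic normal data and using the classic example of the plug-in variance estimator, which is biased low:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=2.0, size=50)

def biased_var(x):
    # plug-in variance estimator: divides by n, so it underestimates variance
    return np.mean((x - np.mean(x)) ** 2)

theta_hat = biased_var(data)

# Bootstrap estimate of the estimator's bias: recompute the statistic on
# many resamples (with replacement) and compare the average to theta_hat.
B = 2000
boot_stats = np.array([
    biased_var(rng.choice(data, size=data.size, replace=True))
    for _ in range(B)
])
bias_est = boot_stats.mean() - theta_hat

# Bias-corrected estimate: subtract the estimated bias
theta_corrected = theta_hat - bias_est
```

Here `bias_est` comes out negative (the plug-in estimator is biased low), so the corrected estimate is pulled upward, toward the unbiased `n/(n-1)`-scaled value. Whether this kind of correction is appropriate in a medical setting is a separate question, since the bootstrap only corrects the estimator's bias, not bias baked into the data collection.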
One way to reduce error overall is simply to collect more data, i.e. to increase the number of training examples (though strictly speaking this mostly reduces variance rather than the model's bias). I suspect you mean the bias of the model itself, but bias can also be introduced during data collection. For example, dMRI tractography is often used to estimate structural connections between brain areas, but quantitative connectivity measures based on the streamline distribution, such as density, average length, and spatial extent (volume), are biased by erroneous streamlines produced by tractography algorithms. Filtering out those erroneous streamlines would be one way to reduce bias without changing the model itself.
I think the most popular example of decreasing bias is in regression. Linear regression draws the line that minimizes squared error, but that error is seldom zero. If instead you fit a polynomial of degree n − 1, where n is the number of data points, the polynomial can interpolate every point exactly, so the training error drops to zero. Thus, as you increase the degree of the polynomial, bias decreases (at the cost of increased variance).
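A quick sketch of this effect, assuming a small synthetic dataset of n = 8 points drawn from a noisy sine curve, with `numpy.polyfit`:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 8)   # n = 8 data points
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

def train_sse(degree):
    # fit a polynomial of the given degree, return training sum of squared errors
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    return float(np.sum(resid ** 2))

# training error shrinks as the polynomial degree grows;
# degree n - 1 = 7 interpolates all 8 points, so its error is ~0
errors = [train_sse(d) for d in (1, 3, 7)]
```

The degree-7 fit drives training error to (numerically) zero, but of course that flexibility is exactly what makes its variance explode on new data.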
For a vivid example, here is a video that supplements mrjiaruiwang's explanation. (Ignore any unfamiliar terms; just focus on how the speaker draws the plot to illustrate the relationship.) http://youtu.be/rW0B8o7JtFk?list=PLD0F06AA0D2E8FFBA and http://youtu.be/W0NLs-A6hhQ?list=PLD0F06AA0D2E8FFBA
On Tuesday we discussed decreasing bias by making the model more flexible. With K-means, we could increase K. Are there other ways or examples of decreasing bias besides changing the model being used?
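To make the K-means point concrete: inertia (the within-cluster sum of squares on the training data) necessarily falls as K grows, which is the same trade-off as raising polynomial degree. A minimal sketch, assuming scikit-learn is available and using two synthetic Gaussian blobs:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# two well-separated 2-D blobs of 50 points each
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])

# within-cluster sum of squares (inertia) shrinks monotonically as K grows:
# a more flexible clustering fits the training data more closely (lower bias)
inertias = [
    KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in (1, 2, 4, 8)
]
```

The big drop happens going from K = 1 to K = 2 (matching the true structure); further increases keep lowering inertia but are just carving up real clusters, i.e. trading bias for variance.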