cimentadaj / ml_socsci

A work-in-progress of the notes/book 'Machine Learning for Social Science'
https://cimentadaj.github.io/ml_socsci/

Check that bagging is explained correctly from the manual approach #3

Open cimentadaj opened 4 years ago

cimentadaj commented 4 years ago

You're 99% sure that it's correct, but you need to get someone else's opinion. For the manual implementation, you're currently drawing 100 bootstraps of 60% of the data, training a small decision tree on each one, and predicting on the original data, so that you end up with 100 predictions for each row.
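To make the procedure concrete for whoever reviews it, here is a minimal sketch of the steps described above. It is not the code from the book. It assumes the bootstraps are drawn with replacement, uses a single-feature regression stump (a depth-1 tree) as a stand-in for the "small decision tree", and runs on made-up toy data. The names `fit_stump`, `predict_stump`, and `manual_bagging` are hypothetical.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree on one feature.

    Returns (threshold, left_mean, right_mean)."""
    best = None
    # Every unique value except the largest is a candidate split point,
    # so both sides of the split are always non-empty.
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: predict the overall mean
        m = mean(ys)
        return (xs[0], m, m)
    return best[1:]


def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm


def manual_bagging(xs, ys, n_boot=100, frac=0.6, seed=0):
    """Return one list of n_boot predictions per ORIGINAL row."""
    rng = random.Random(seed)
    n = len(xs)
    size = int(frac * n)
    preds = [[] for _ in range(n)]
    for _ in range(n_boot):
        # Assumption: sample 60% of the rows with replacement.
        idx = [rng.randrange(n) for _ in range(size)]
        stump = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        # Predict on every original row, not only on the sampled ones.
        for i, x in enumerate(xs):
            preds[i].append(predict_stump(stump, x))
    return preds


# Toy data (made up): a step function in x.
xs = [i / 10 for i in range(50)]
ys = [1.0 if x >= 2.5 else 0.0 for x in xs]

preds = manual_bagging(xs, ys)
bagged = [mean(p) for p in preds]  # bagged prediction = average over trees
```

Because every tree predicts on the full original data, each row gets exactly `n_boot` predictions regardless of which rows ended up in which bootstrap.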

Initially I thought the predictions were being made on the same bootstrapped data, but after a conversation with Jorge Lopez, this doesn't make sense, because a bagged tree with only a few bootstraps runs the risk of leaving some rows out of every bootstrap, and those rows would then get no predictions.
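The coverage risk can be checked with a quick simulation. The sketch below assumes bootstraps of 60% of the rows drawn with replacement, and counts how many rows never appear in any bootstrap, which is the number of rows that would get no prediction if each tree predicted only on its own sample. The function names are hypothetical.

```python
import random


def rows_never_sampled(n_rows, n_boot, frac=0.6, seed=0):
    """Count rows that appear in none of the n_boot bootstraps."""
    rng = random.Random(seed)
    size = int(frac * n_rows)
    seen = set()
    for _ in range(n_boot):
        # Assumption: each bootstrap samples with replacement.
        seen.update(rng.randrange(n_rows) for _ in range(size))
    return n_rows - len(seen)


def prob_never_sampled(n_rows, n_boot, frac=0.6):
    """Probability that one given row is missed by every bootstrap."""
    size = int(frac * n_rows)
    return (1 - 1 / n_rows) ** (size * n_boot)


few = rows_never_sampled(1000, n_boot=3)
many = rows_never_sampled(1000, n_boot=100)
print(few, many)
print(prob_never_sampled(1000, 3), prob_never_sampled(1000, 100))
```

Under these assumptions, the chance that a given row is missed by all `B` bootstraps is about `exp(-0.6 * B)`. That is roughly 17% of rows with 3 bootstraps, and effectively zero with 100. Predicting on the original data avoids the problem for any number of bootstraps.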