zachmayer / caretEnsemble

caret models all the way down :turtle:
Other
226 stars 75 forks source link

Fixing prediction methods to be more standardized #167

Closed jknowles closed 8 years ago

jknowles commented 8 years ago

This PR addresses #165 and #158 by making sure that predict.caretEnsemble and predict.caretStack provide similar output and have similar inputs. There are also large changes to the unit tests to make sure that the tests now test the correct structure for the output from these two functions.

predict.caretStack is particularly heavily modified here -- it now can produce prediction errors based on the model disagreement about each observation, mimicking the behavior of predict.caretEnsemble. It uses the variable importance of each models' predictions in the stacked model to define the weights. When these predictions are transformed before ensembling, no standard error is returned or calculated, but the variable importance can still be returned as a weights attribute to the outputted vector.

jknowles commented 8 years ago

@zachmayer This is a bit of a large PR, so just ping me with a comment when you have a chance to review it. I can answer any questions you have hopefully and am also willing to make whatever changes you think make sense.

jknowles commented 8 years ago

I'd also recommend changing the Git Check settings to only pick up git checks on the "NOT_CRAN" version of caretEnsemble because our testing infrastructure is too big to get full coverage within the CRAN test limits.

jknowles commented 8 years ago

@zachmayer Just a ping to see about moving this forward. I think it'll solve some of the recently reported issues like #172 .

zachmayer commented 8 years ago

This file has a couple untested lines: https://coveralls.io/builds/3539104/source?filename=R%2FcaretEnsemble.R

Could you add one or 2 more tests to hit the red lines? Meanwhile, I've gotta go figure out why our travis builds are failing.

zachmayer commented 8 years ago

Fixed the build issue, wooooooo

zachmayer commented 8 years ago

Also, instead of merging master, you can also rebase off master and then force push. It's a little bit harder to do, but makes for a slightly nicer commit history.

jknowles commented 8 years ago

OK, I think I did the rebase, but I'm not sure. At any rate we're even with master again. I think what needs to happen is I need to properly deprecate keepNA because I don't think it ever comes up anymore. Does that seem OK? Was going to use Hadley's trick here:

http://r-pkgs.had.co.nz/release.html

zachmayer commented 8 years ago

:+1: from me. I'm re-testing right now, and will merge when all the tests pass

jknowles commented 8 years ago

OK one last push to test my deprecation of keepNA.

jknowles commented 8 years ago

@zachmayer It builds, but the coverage check is a little wonky. I can't get those deprecated bits to be tested, but I don't want to junk them because legacy code may use them (I'm speaking selfishly here). If we merge this, I'm happy to figure out the merge conflicts it might cause with the other PR.

zachmayer commented 8 years ago

I edited the thresholds and re-ran

jknowles commented 8 years ago

@zachmayer Looks like we're good to go. Thanks! After this gets merged in, I'll re-base and start tackling the two issues that cropped up related to prediction #172 and #171. This closes #167.

zachmayer commented 8 years ago

Awesome!  I'm away for the weekend but will merge asap so we can move on!