joelowj closed 4 years ago
Dear Joel, in this case we are performing a regression task, that is, we are trying to predict a number, namely the value of the future one-month return. This is why we use R1M_Usd. Of course, you can also use R1M_Usd_C, in which case you can either run a regression tree (the binary values are again treated as numbers) or a classification tree (the values are treated as categories). You control this via the "method" argument: method = "anova" is for regression and method = "class" is for classification. If method is omitted, rpart guesses from the type of the label: "class" when the label is a factor, "anova" when it is a plain numeric vector.
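To make the difference concrete, here is a minimal sketch on synthetic data. The data frame `toy` and the features `feat_a` and `feat_b` are made up for illustration and are not the book's dataset; only the label names R1M_Usd and R1M_Usd_C and the rpart arguments come from the discussion above.

```r
library(rpart)

set.seed(42)
n <- 500

# Hypothetical stand-in for the book's sample: two features and a future return.
toy <- data.frame(feat_a = runif(n), feat_b = runif(n))
toy$R1M_Usd   <- 0.05 * (toy$feat_a - 0.5) + rnorm(n, sd = 0.02)  # numeric label
toy$R1M_Usd_C <- as.numeric(toy$R1M_Usd > median(toy$R1M_Usd))    # binary label coded 0/1

# Regression tree on the numeric return.
fit_reg <- rpart(R1M_Usd ~ feat_a + feat_b, data = toy, method = "anova")

# Regression tree on the 0/1 label: each leaf predicts the average of the
# 0/1 values it contains, i.e. a number between 0 and 1.
fit_bin_reg <- rpart(R1M_Usd_C ~ feat_a + feat_b, data = toy, method = "anova")

# Classification tree on the same label: each leaf predicts a category.
fit_class <- rpart(R1M_Usd_C ~ feat_a + feat_b, data = toy, method = "class")

# With method omitted, rpart infers it from the type of the label.
fit_auto_num <- rpart(R1M_Usd_C ~ feat_a + feat_b, data = toy)  # numeric label
toy$label_f  <- factor(toy$R1M_Usd_C)
fit_auto_fac <- rpart(label_f ~ feat_a + feat_b, data = toy)    # factor label

fit_auto_num$method  # "anova"
fit_auto_fac$method  # "class"

head(predict(fit_bin_reg))                 # numeric leaf averages
head(predict(fit_class, type = "class"))   # predicted categories
```

Note that a binary label stored as 0/1 numbers is treated as a regression target unless you either pass method = "class" or convert the column to a factor first.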
@shokru that certainly clarifies my doubt. I had mistaken rpart for a strictly discrete predictor. Thank you and have a good day!
Hi @shokru, thanks for this amazing piece of work. Just a question regarding the code. I am not an R person, so I may have missed a variable name change somewhere.
In Chapter 7.1.4: "We start with a simple tree and its interpretation. We use the package rpart and its plotting engine rpart.plot. The label is the future 1 month return and the features are all predictors available in the sample. The tree is trained on the full sample."
Judging by the code, shouldn't it be "R1M_Usd_C" instead of "R1M_Usd"? Thanks!