Closed sumny closed 4 years ago
Thank you very much, I like it a lot. Added some minor hints and improvements, nothing drastic.
@pfistfl @jakob-r Thank you very much for your review! I agree with all your suggestions and made the appropriate changes in the second revision. After proofreading it again, I noticed that 2 * pi * x / max_x
is somewhat incorrect. I guess it should be the supremum of x (or rather sup(x) + 1), because suppose you work with hours (0:23) then you would divide by 24 regardless of whether your vector x contains the hour 23. On the other hand I do not want to be that picky. Any suggestions here? Also I guess we should wait with merging until the PR of PipeOpDateFeatures is finally integrated into the master
do max_x +1 that seems good.
Using max
looks like a bad idea anyhow. What if in the training data range(x) = c(0,14)
. Then 14h would be quite close to 0h, right? We should always divide by 24!
Can you check what is done in the code?
Using
max
looks like a bad idea anyhow. What if in the training datarange(x) = c(0,14)
. Then 14h would be quite close to 0h, right? We should always divide by 24! Can you check what is done in the code?
In the code the maximum is always "hard-coded", so (for hours) I divide by 24. That's why I thought sup(x) + 1
or something like this would be more precise.
Please, in the future: Don't use automatic line breaks. It makes it hard to read diffs. A line break for a new sentence is usually what we are using.
Please, in the future: Don't use automatic line breaks. It makes it hard to read diffs. A line break for a new sentence is usually what we are using.
I will keep this in mind!
Forgot, that we wanted to wait until https://github.com/mlr-org/mlr3pipelines/pull/308 is merged. Maybe you can add a note in the beginning that you need to install mlr-org/mlr3pipelines@pipeop_datefeatures
or merge the PR quickly.
This tutorial shows how to do some simple feature engineering using the
PipeOpDateFeatures
pipeline. Happy to get some feedback!