Open agila5 opened 3 years ago
Will need to take a look this upcoming week. For some reason the date index column is getting passed to H2O AutoML, which is no good. We only want features derived from that date column to be passed.
Ok, I will wait for your response. Thank you very much!
Hi @mdancho84 ,
thanks for this package and all the documentation!
I am also trying to use {modeltime.h2o} with datetime data (hourly-specific).
Let me know if you would need a reprex :)
Hi @mdancho84 ,
I was doing some research on this issue, and I think this is an H2O issue, not specific to {modeltime.h2o} (I will submit a PR to H2O about this).
This is the reprex I built; it fails with both {modeltime.h2o} and {h2o}:
> library("dplyr")
> library("h2o")
> library("lubridate")
> hourly_calls <- tibble(
+ ds = seq.POSIXt(now() - days(7), now(), by = "hour"),
+ calls = rpois(7 * 24 + 1, lambda = 4)
+ )
> h2o.init()
> as.h2o(hourly_calls)
ERROR: Unexpected HTTP Status code: 412 Precondition Failed (url = http://localhost:54321/3/Parse)
water.exceptions.H2OIllegalArgumentException
[1] "water.exceptions.H2OIllegalArgumentException: Provided column type POSIXct is unknown. Cannot proceed with parse due to invalid argument."
...
[41] " java.base/java.lang.Thread.run(Thread.java:829)"
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, :
ERROR MESSAGE:
Provided column type POSIXct is unknown. Cannot proceed with parse due to invalid argument.
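As a possible stopgap until the parser accepts POSIXct, the datetime column could be converted to a plain numeric column before upload. This is only a sketch and was not proposed in the thread: the column name ds_epoch is made up for illustration, data.frame replaces tibble to keep the example dependency-free, and the as.h2o() call is left commented out because it needs a running H2O cluster.

```r
# Base R only; same shape as the reprex data (169 hourly rows).
set.seed(1)
start <- as.POSIXct("2021-05-15 00:00:00", tz = "UTC")
hourly_calls <- data.frame(
  ds = seq(start, by = "hour", length.out = 7 * 24 + 1),
  calls = rpois(7 * 24 + 1, lambda = 4)
)

# Store the timestamps as seconds since 1970-01-01 UTC
# (ds_epoch is a hypothetical column name).
h2o_ready <- data.frame(
  ds_epoch = as.numeric(hourly_calls$ds),
  calls = hourly_calls$calls
)

# Assumption: with only numeric columns the upload no longer reaches the
# POSIXct type check. Requires a running cluster, so it is not executed here.
# h2o::h2o.init(); h2o::as.h2o(h2o_ready)

# Convert back to POSIXct after pulling results out of H2O.
restored <- as.POSIXct(h2o_ready$ds_epoch, origin = "1970-01-01", tz = "UTC")
```

The obvious cost is that H2O then treats the timestamps as ordinary numbers, so this only helps when that is acceptable for the model.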
However, I found something curious while checking this out. In the automl_fit_impl function definition, on line 180, we use a variable named data, which is not an argument of the function and is not passed through ... by default either. The conditional therefore always evaluates to TRUE, because data resolves to the utils::data function. I guess this line should be:
if (!inherits(x, "H2OFrame")) {
Would you like me to fix this in a PR?
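The scoping behaviour described above can be reproduced without H2O. The sketch below assumes the original condition had the form !inherits(data, "H2OFrame"); the function names and the fake_frame object are made up for illustration and are not part of {modeltime.h2o}.

```r
# Buggy pattern: `data` is neither an argument nor a local variable, so R
# finds the function utils::data on the search path (utils is attached by
# default in a standard R session).
needs_conversion_buggy <- function(x) {
  !inherits(data, "H2OFrame")
}

# Suggested fix: test the argument that was actually passed in.
needs_conversion_fixed <- function(x) {
  !inherits(x, "H2OFrame")
}

# A stand-in object carrying the H2OFrame class; no H2O cluster is needed.
fake_frame <- structure(list(), class = "H2OFrame")

needs_conversion_buggy(fake_frame)         # TRUE: the check never looks at x
needs_conversion_fixed(fake_frame)         # FALSE
needs_conversion_fixed(data.frame(a = 1))  # TRUE
```

With the buggy form, an input that is already an H2OFrame would still be sent through the conversion branch.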
Oh wow, that's funny about the data bug. Yes, please feel free to submit a PR. We can fix that.
Regarding the POSIXct format, that is interesting. I'm surprised we are passing it as a feature, but it's been a while since I reviewed the code.
One solution is to simply remove the name of the datetime feature from the x_nms variable. This would prevent the "POSIXct is unknown" error.
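A minimal sketch of that idea follows, assuming x_nms is a character vector of predictor column names. The data, the derived hour feature and the y_nm name are illustrative only and are not taken from the package internals.

```r
# Illustrative data: a datetime index, one feature derived from it, and a target.
hourly_calls <- data.frame(
  ds = seq(as.POSIXct("2021-05-15 00:00:00", tz = "UTC"),
           by = "hour", length.out = 48),
  calls = rep(c(3L, 5L, 4L, 6L), times = 12)
)
hourly_calls$hour <- as.integer(format(hourly_calls$ds, "%H"))

y_nm <- "calls"  # hypothetical name for the response column

# Find the POSIXct columns and leave them (and the response) out of the predictors.
is_datetime <- vapply(hourly_calls, inherits, logical(1), what = "POSIXct")
x_nms <- setdiff(names(hourly_calls), c(y_nm, names(hourly_calls)[is_datetime]))
```

Here x_nms ends up holding only the derived hour feature, so the raw datetime column would not be offered to H2O as a predictor.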
Well, the POSIXct column is my response variable, so I need to get it passed to H2O.
I have submitted this PR to H2O so that the issue gets fixed upstream. Meanwhile, @agila5, you could try installing https://github.com/jcrodriguez1989/h2o-3/tree/ash2o_posixct/h2o-r/h2o-package , but it is not as easy as remotes::install_github(...). Let me know if you need further help installing it.
With this H2O fix, I am able to run modeltime.h2o with a datetime response variable 💃💃💃
Hi @jcrodriguez1989 and thank you very much for your message, I will test your solution/PR as soon as possible.
Dear @mdancho84 , first of all thank you very much for developing this amazing ecosystem for time series modelling.
I just started learning the basic ideas and packages, and I was wondering if it's possible to use h2o models with date-time (or POSIXct) objects. For example, I tried to replicate the introductory vignette, changing the field Date from Date to POSIXct, but then the examples fail.
Created on 2021-05-22 by the reprex package (v2.0.0)
Can you help me?