blookot closed this issue 4 years ago
Pinging @elastic/ml-core (:ml)
@blookot Could you please explain how you're indexing the data?
I'm loading the csv file using data visualizer @dimitris-athanasiou
Thank you @blookot. I have reproduced the issue. You have uncovered a bug that is triggered because this dataset contains no features: there is only the dependent_variable.
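To illustrate the situation described above, here is a minimal sketch of a pre-flight check that lists which columns would remain as features once the dependent variable and any unsupported fields are set aside. The helper `usable_features` and the column names `timestamp` and `disk_usage` are assumptions made for illustration; this is not how Elasticsearch itself selects fields.

```python
def usable_features(columns, dependent_variable, excluded=()):
    """Return the columns that remain as candidate features."""
    skip = {dependent_variable, *excluded}
    return [c for c in columns if c not in skip]


# Hypothetical columns for the reported dataset: a date field plus the target.
columns = ["timestamp", "disk_usage"]

# The date field is excluded here because, per this thread, date features
# are not yet handled by data frame analytics jobs.
features = usable_features(columns, "disk_usage", excluded=["timestamp"])

if not features:
    print("No features left: the regression has nothing to learn from.")
```

With only a date field and the target, the list of features comes back empty, which matches the condition that triggers the bug.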
I think there are 2 issues to fix here:
We'll proceed to fix them both.
Once again, thank you for reporting this. It helps us make the feature better!
Hi @dimitris-athanasiou, why can't we use timestamp as a feature? In my case it's a disk slowly filling, and I'd like to use regression and inference to predict when my disk is going to be full. I can plot timestamp on x and disk usage on y and get a nice dot chart... I guess this falls under single metric ML (temporal) with forecast...
PS. CPU is running at 100% (on my ML node) until I stop the job!
Indeed, your use case is a time series analysis. You can use an anomaly detection job to model the data and then use the forecast feature in order to predict when the disk will be full.
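As a rough sketch of that suggestion, the snippet below builds the request bodies for creating an anomaly detection job and requesting a forecast. It only prints the requests rather than sending them. The job id `disk-usage`, the field names `disk_usage` and `timestamp`, the `mean` function, the 15 minute bucket span and the 7 day forecast duration are all assumptions chosen for illustration.

```python
import json


def anomaly_job_body(metric_field, time_field, bucket_span="15m"):
    """Body for PUT _ml/anomaly_detectors/<job_id> with one mean detector."""
    return {
        "analysis_config": {
            "bucket_span": bucket_span,
            "detectors": [{"function": "mean", "field_name": metric_field}],
        },
        "data_description": {"time_field": time_field},
    }


def forecast_body(duration="7d"):
    """Body for POST _ml/anomaly_detectors/<job_id>/_forecast."""
    return {"duration": duration}


job_id = "disk-usage"  # hypothetical job id

print(f"PUT _ml/anomaly_detectors/{job_id}")
print(json.dumps(anomaly_job_body("disk_usage", "timestamp"), indent=2))

print(f"POST _ml/anomaly_detectors/{job_id}/_forecast")
print(json.dumps(forecast_body("7d")))
```

The job also needs a datafeed, and it has to be open and have processed data before a forecast can be requested.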
Having said that, we're planning to revisit date features for data frame analytics jobs. We have not addressed them yet as they require special handling that we decided to defer until later in the project. This is not a promise that we'll support them though.
PS. CPU is running at 100% (on my ML node) until I stop the job!
Thanks for the note! I noticed that too. We'll make sure to fix this issue.
Yes, I've been playing (successfully) with single metric & forecast. I thought dates are stored as long (like unix epoch), so I imagined a 2D dot plot with the regression based on my timestamp and disk usage... But I'll wait for it :-) Thanks again @dimitris-athanasiou
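The idea in this comment, treating the timestamp as a number and fitting a line through the (timestamp, disk usage) points, can be sketched with ordinary least squares. This is plain linear regression written out by hand, not the model used by data frame analytics. The sample values and the 100% capacity threshold are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def time_when_full(xs, ys, capacity=100.0):
    """Extrapolate the fitted line to the time at which usage hits capacity."""
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None  # usage is flat or shrinking, so the line never reaches capacity
    return (capacity - intercept) / slope


# Made-up samples: epoch seconds one day apart, disk usage in percent.
timestamps = [0, 86400, 172800, 259200]
usage = [40.0, 42.0, 44.0, 46.0]

full_at = time_when_full(timestamps, usage)
print(f"Estimated full at epoch second {full_at:.0f}")
```

For these samples the usage grows by 2 points per day from 40%, so the estimate lands 30 days after the first sample. A straight line is only reasonable while the disk fills at a steady rate.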
We have not addressed them yet as they require special handling that we decided to defer until later in the project.
Just to add to this, the regression model we use isn't immediately well suited to extrapolation, which is what forecasting requires. Getting it to work in this fashion needs some explicit handling in inference and also judicious feature creation. As @dimitris-athanasiou says, using this functionality to enhance our forecasting capabilities (particularly to include additional explanatory variables) is definitely something on the roadmap.
Elasticsearch version (bin/elasticsearch --version): 7.6.2
JVM version (java -version): running on ESS
Description of the problem including expected versus actual behavior:
I'm running a regression data frame analytics job and it stops at 50% (loading data is at 100% and analyzing is at 0%). I can't understand why...
Steps to reproduce:
Here is an example of ml job:
logs don't tell anything:
disk_usage.txt