The R and Python tests need to do the same end-of-test leak checking that the JUnit tests do, to catch leak bugs in the bindings. This will require two big steps:
Add leak detection to the test runners, similar to TestUtil.java, by calling the REST API.
Currently there is no generic GET /3/DKV which returns all objects, so the check will initially use GET /3/Models and GET /3/Frames. This means that it won't initially detect leaked Vecs which aren't in any Frame. Once we add the generic GETs it will be able to detect those too.
Note that leak detection can only be done in a cluster which is idle, not while other tests are running concurrently.
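A minimal sketch of what the runner-side check could look like, assuming the cluster is idle. The helper names (snapshot_keys, find_leaks) are hypothetical, and the JSON field names ("frames", "frame_id", "models", "model_id", "name") are assumptions about the response shape that should be checked against the actual REST schemas:

```python
import json
from urllib.request import urlopen

# Endpoints named in the text, paired with the JSON fields ASSUMED to hold
# the object list and each object's key (not confirmed by this ticket).
ENDPOINTS = (
    ("/3/Frames", "frames", "frame_id"),
    ("/3/Models", "models", "model_id"),
)


def http_get_json(url):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def snapshot_keys(base_url, fetch=http_get_json):
    """Return the set of (kind, key) pairs currently listed by the cluster."""
    keys = set()
    for path, list_field, id_field in ENDPOINTS:
        payload = fetch(base_url + path)
        for entry in payload.get(list_field, []):
            keys.add((list_field, entry[id_field]["name"]))
    return keys


def find_leaks(before, after):
    """Keys present after the test that were not present before it."""
    return sorted(after - before)
```

A runner would take one snapshot before the test and one after it, then fail the test if find_leaks returns anything. Because this only sees Frames and Models, leaked Vecs which aren't in any Frame would still go unnoticed.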
Modify all the tests to individually DELETE the user-level objects that they create. E.g., if they parse into Frames these need to be DELETEd via DELETE /3/Frames/{key}; similarly, Models need to get cleaned up.
The tests must not clean up any objects which are created internally in the binding or the back end. The whole point of this exercise is that the platform needs to clean these up automagically.
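The per-test cleanup could be sketched roughly as follows. The helper name delete_user_objects is hypothetical, and DELETE /3/Models/{key} is assumed here by analogy with DELETE /3/Frames/{key}. The caller passes only the keys the test itself created, so anything made internally by the binding or the back end is left for the platform to clean up (and for the leak check to catch if it doesn't):

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def http_delete(url):
    """Issue an HTTP DELETE and return the response status code."""
    with urlopen(Request(url, method="DELETE")) as response:
        return response.status


def delete_user_objects(base_url, frame_keys=(), model_keys=(), send=http_delete):
    """DELETE only the keys the test itself created; return the URLs hit."""
    urls = []
    # /3/Models/{key} is an assumption; /3/Frames/{key} comes from the text.
    for path, keys in (("/3/Models/", model_keys), ("/3/Frames/", frame_keys)):
        for key in keys:
            # Percent-encode the key so it is safe as a single path segment.
            url = base_url + path + quote(key, safe="")
            send(url)
            urls.append(url)
    return urls
```

A test would typically call this from its teardown with the Frame keys it parsed into and the Model keys it trained.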
The JUnit tests detect backend DKV leaks by checking for leaks at the end of each test. See TestUtil.java and this commit:
https://github.com/h2oai/h2o-3/commit/78302991ca882c76357b4ffce4219f1d86708aee