h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

add DKV leak detection to the R tests #15083

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

The JUnit tests detect backend DKV leaks by checking for leaks at the end of each test. See TestUtil.java and this commit:

https://github.com/h2oai/h2o-3/commit/78302991ca882c76357b4ffce4219f1d86708aee

The R and Python tests need to do the same, to catch leak bugs in the bindings. This will require two big steps:

  1. Add leak detection to the test runners, similar to TestUtil.java, by calling the REST API.

Currently there is no generic GET /3/DKV which will get all objects, so it will initially GET /3/Models and GET /3/Frames. This means that it won't initially detect leaked Vecs which aren't in any Frame. Once we add the generic GETs it will be able to do that.

Note that leak detection can only be done in a cluster which is idle, not while other tests are running concurrently.

  2. Modify all the tests to individually DELETE the user-level objects that they create. E.g., if they parse into Frames these need to be DELETEd via DELETE /3/Frames/{key}; similarly, Models need to get cleaned up.

The tests must not clean up any objects which are created internally in the binding or the back end. The whole point of this exercise is that the platform needs to clean these up automagically.
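
A rough sketch of how the two steps could fit together in a Python test runner. This is not existing H2O test code. The helper names (`http_json`, `list_keys`, `delete_frame`, `delete_model`, `run_with_leak_check`, `LeakError`) are hypothetical, and the base URL, the JSON field names (`frames[].frame_id.name`, `models[].model_id.name`), and the `DELETE /3/Models/{key}` path are assumptions to be checked against the actual REST schemas. The `fetch` parameter is injectable so the runner logic can be exercised without a live cluster.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: a single idle H2O node listening on this address.
BASE_URL = "http://localhost:54321"


class LeakError(Exception):
    """Raised when a test leaves Frames or Models behind."""


def http_json(method, path, base_url=BASE_URL):
    """Send one REST request and decode the JSON response body."""
    req = Request(base_url + path, method=method)
    with urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def list_keys(fetch=http_json):
    """Collect the keys visible via GET /3/Frames and GET /3/Models.

    Vecs that are not part of any Frame are not visible this way.
    The response field names used here are assumptions.
    """
    frames = fetch("GET", "/3/Frames").get("frames", [])
    models = fetch("GET", "/3/Models").get("models", [])
    keys = {("frame", f["frame_id"]["name"]) for f in frames}
    keys |= {("model", m["model_id"]["name"]) for m in models}
    return keys


def delete_frame(key, fetch=http_json):
    """Step 2: a test removes a user-level Frame it created."""
    return fetch("DELETE", "/3/Frames/" + quote(key, safe=""))


def delete_model(key, fetch=http_json):
    """Step 2 for Models; this path is assumed by analogy with Frames."""
    return fetch("DELETE", "/3/Models/" + quote(key, safe=""))


def run_with_leak_check(test_fn, fetch=http_json):
    """Step 1: snapshot keys, run one test, then fail on any new keys.

    Only meaningful on an idle cluster with no concurrent tests.
    """
    before = list_keys(fetch)
    test_fn()
    leaked = list_keys(fetch) - before
    if leaked:
        raise LeakError("leaked keys: %s" % sorted(leaked))
```

A test that parses a file would call `delete_frame` on its own parse result before returning. Anything else that shows up in the diff was created internally by the binding or the back end and counts as a real leak.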

DinukaH2O commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-2171
Assignee: New H2O Bugs
Reporter: Raymond Peck
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A