h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

GH-16360: Fix R package for Windows #16369

Closed tomasfryda closed 2 months ago

tomasfryda commented 3 months ago

#16360

Windows usually doesn't allow multiple processes to open the same file, and that seems to cause this issue.

In Python, it seems we are using the same process group, which IIRC could mitigate the issue we're seeing in R.

Last time I programmed on Windows, Delphi was a big thing and Windows XP was the latest version, so my knowledge of that "operating system" is pretty limited. So I'm trying to keep it simple and multiplatform without delving into Windows' and R's internals.

My solution is to add the web_ip to the JSON we send when H2O connects, and then check on the R side whether it's null and, if so, print the warning. This way we print it every time on h2o.init and also on h2o.connect (which I think is a good thing), but it makes the behavior of R and Python diverge a bit.
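A minimal sketch of the R-side check described above, assuming the connection response has already been parsed into a list. The function name `warn_if_web_ip_missing`, the `cloud_info` argument, and the warning text are hypothetical and not taken from the actual patch:

```r
# Hypothetical helper, not the code from this PR.
# `cloud_info` stands for the parsed JSON response (an R list);
# `web_ip` is the field this PR adds to that response.
warn_if_web_ip_missing <- function(cloud_info) {
  if (is.null(cloud_info$web_ip)) {
    # Assumed wording; the real warning text may differ.
    warning(
      "web_ip is not set; the H2O backend may be listening on all network interfaces.",
      call. = FALSE
    )
    return(invisible(TRUE))
  }
  invisible(FALSE)
}

# The same check would run after both h2o.init() and h2o.connect().
warn_if_web_ip_missing(list(web_ip = "127.0.0.1"))  # no warning
```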

@mmalohlava JFYI this is related to https://github.com/h2oai/h2o-3/issues/15683 which you were involved in.

tomasfryda commented 3 months ago

I'm rerunning the tests since I didn't have the milestone assigned.

We could make this more general by adding a "security_warnings" field and sending it on initialization. On Windows with the current implementation (before this fix), R starts h2o but doesn't connect to it. That is bad because the user can't use h2o, and we might end up with an unmanaged process listening on all interfaces without anyone knowing that it is running.
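To illustrate the more general "security_warnings" idea, here is a rough sketch of how the R client might surface such a field. The field does not exist yet; its shape (a character vector or list of strings) and the helper name `emit_security_warnings` are assumptions:

```r
# Hypothetical helper for a proposed `security_warnings` field.
# Emits one R warning per entry and returns how many were emitted.
emit_security_warnings <- function(cloud_info) {
  msgs <- cloud_info$security_warnings
  if (is.null(msgs)) {
    return(invisible(0L))
  }
  # JSON arrays may be parsed as a list or a vector; normalize to character.
  msgs <- as.character(unlist(msgs))
  for (msg in msgs) {
    warning(msg, call. = FALSE)
  }
  invisible(length(msgs))
}
```

With this shape, the backend could add new warnings without requiring any client-side change beyond printing them.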

wendycwong commented 2 months ago

@tomasfryda: I do have a Windows machine. Let me know what tests you want to run on a Windows machine to make sure everything works.

tomasfryda commented 2 months ago

@wendycwong all you have to do is to install the package and call:

library(h2o)
h2o.init()

It should not fail with access denied.

It's probably not necessary to install the package, but you have to call h2o.init() while there is no running instance of the h2o backend; otherwise it would try to connect to that instance instead of starting a new one.

I tested it like a month ago and actually thought this was already merged and released in 3.46.0.5.