h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

Connection Refused Error with H2O on Windows Subsystem for Linux (WSL) after Windows Update #16424

Open lvfmc85 opened 1 month ago

lvfmc85 commented 1 month ago

H2O version, Operating System, and Environment

H2O Versions: 3.42.0.2 up to 3.46.0.5

Operating System: Windows with WSL (Linux kernel version 5.15.153.1-microsoft-standard-WSL2)

WSL Distribution: Ubuntu 24.04

Java Version: Java 11.0.24

Python Version: Tested with Python 3.7, 3.9, 3.10, 3.11, and 3.12

Anaconda Version: Anaconda3-2024.06-1-Linux-x86_64

Memory available (.wslconfig): 42 GB

Actual behavior

When H2O is started on WSL with h2o.init(), it tries to form a cloud of size 2 that includes a node at 10.255.255.254:54321, and the connection to that node fails with a "Connection refused" error. The issue does not occur on a dedicated PC running Ubuntu.

Expected behavior

H2O should start successfully on WSL, forming a cloud and connecting to the node without any connection errors, as it does on a native Ubuntu environment.

Steps to reproduce

Start H2O using h2o.init() in Python on WSL.

H2O attempts to form a cloud that includes a node at 10.255.255.254:54321.

Observe the "Connection refused" error.
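To check the "Connection refused" symptom independently of H2O, a plain TCP probe can be run from inside WSL. This is a minimal sketch using only the Python standard library; the helper name probe_tcp and the three result strings are made up for illustration and are not part of H2O.

```python
import socket


def probe_tcp(host, port, timeout=2.0):
    """Try a TCP connection and classify the outcome.

    Returns "open" if something accepted the connection, "refused" if the
    host actively rejected it, and "unreachable" for timeouts and other
    network errors. (Hypothetical helper, not an H2O API.)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Must come before OSError, which is its parent class.
        return "refused"
    except OSError:
        # Covers timeouts, "no route to host", and similar failures.
        return "unreachable"
```

Calling probe_tcp("10.255.255.254", 54321) while H2O is starting would show whether anything is listening at the address from the error message, and probe_tcp("127.0.0.1", 54321) would show whether the local node itself is reachable.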

Upload logs

I have reviewed various H2O logs, including .err, .out, .trace, .debug, .info, .warn, .error, and httpd.log, all indicating the same connection error. Logs can be provided upon request.

Screenshots

Screenshots of the error messages can be provided on request.

Additional context

The connection issue seems to have started after a Windows update.

The issue is specific to WSL as H2O works perfectly on a native Ubuntu installation.

Steps taken to resolve the issue include forcing a local connection, checking network configurations, updating WSL, and analyzing logs, all without success.

Last successful use was on 22/05/2024 with H2O version 3.42.0.2. The issue was detected around 20/09/2024.

Seeking assistance for potential solutions or insights into whether a recent update might have affected network configurations on WSL.
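One of the attempted fixes listed above is forcing a local connection. The thread does not say how this was done, so the following is only a sketch of one plausible way, using the ip, port, and bind_to_localhost parameters of h2o.init(). The helper names local_init_kwargs and start_local_h2o are made up for illustration, and the author reports that forcing a local connection did not resolve the problem.

```python
def local_init_kwargs(port=54321):
    """Keyword arguments that pin h2o.init() to the loopback address.

    Hypothetical helper: the original report does not show the exact call.
    """
    return {"ip": "127.0.0.1", "port": port, "bind_to_localhost": True}


def start_local_h2o(port=54321):
    """Start (or connect to) a local H2O node on 127.0.0.1."""
    # Imported lazily so local_init_kwargs() can be used without h2o installed.
    import h2o

    h2o.init(**local_init_kwargs(port))
    return h2o.cluster()
```

If the cluster returned by start_local_h2o() still reports a cloud larger than one node, the extra node is being discovered by the H2O process itself rather than requested by the Python client.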

lvfmc85 commented 1 month ago

Hello everyone,

After further investigation, I've identified the root cause of the issue with running H2O in WSL. The problem arises specifically after updating from WSL version 2.1.5 to 2.2.1.

What I Found:

H2O runs correctly in WSL version 2.1.5.

The issue appears immediately after updating to WSL version 2.2.1.

Changes in version 2.2.1, such as enabling DNS tunneling by default and other kernel updates, might be contributing factors.
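If DNS tunneling is indeed a contributing factor, one untested alternative to downgrading would be to switch it off in the Windows-side .wslconfig file. The sketch below assumes the dnsTunneling key lives in the [wsl2] section, as described in Microsoft's WSL configuration documentation. Nobody in this thread has confirmed that this fixes the H2O error, and the helper name disable_dns_tunneling is made up for illustration.

```python
import configparser
import io


def disable_dns_tunneling(wslconfig_text):
    """Return .wslconfig text with dnsTunneling=false set under [wsl2].

    Assumption: the setting is read from the [wsl2] section. Note that
    configparser drops any comments present in the original file.
    """
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case, e.g. "dnsTunneling"
    parser.read_string(wslconfig_text)
    if not parser.has_section("wsl2"):
        parser.add_section("wsl2")
    parser.set("wsl2", "dnsTunneling", "false")
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()
```

The returned text would be written back to .wslconfig in the Windows user profile folder, followed by wsl --shutdown so that the setting takes effect on the next start.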

Solution:

For now, the solution is to revert to WSL version 2.1.5, where H2O runs without any problems. Here are the steps I took to resolve the issue:

Uninstall WSL: Removed the current WSL version using PowerShell: wsl --uninstall

Note: Although I haven't done it, I think it would be a good safety measure to back up the Linux distro before uninstalling WSL.

Install WSL version 2.1.5: Downloaded and installed WSL 2.1.5 from https://github.com/microsoft/WSL/releases/tag/2.1.5

This workaround allows me to use H2O effectively until a permanent fix is provided in future updates of WSL.

I hope this helps anyone facing similar issues. I think this is more of a WSL issue than an h2o issue, but if the h2o team needs more information or further testing to help resolve this, please let me know.