h2oai / h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
http://h2o.ai
Apache License 2.0

h2o cluster server automatically shut off for too long #12037

Open exalate-issue-sync[bot] opened 1 year ago

exalate-issue-sync[bot] commented 1 year ago

fatal.log
12-14 06:35:04.660 192.168.184.26:54321 38826 #P-Accept FATAL: missing eom sentinel when opening new tcp channel
12-14 06:35:04.661 192.168.184.26:54321 38826 #P-Accept FATAL: Stacktrace:
12-14 06:35:04.662 192.168.184.26:54321 38826 #P-Accept FATAL: [java.lang.Thread.getStackTrace(Thread.java:1552), water.H2O.fail(H2O.java:1016), water.H2O.fail(H2O.java:1034), water.TCPReceiverThread.run(TCPReceiverThread.java:92)]

I tried setting the parameter from https://groups.google.com/forum/#!topic/h2ostream/YDyuRx-aEvg, but it didn't work.

exalate-issue-sync[bot] commented 1 year ago

Michal Malohlava commented: Can you please provide a few more details about the reported problem, e.g. cluster size and deployment environment?

exalate-issue-sync[bot] commented 1 year ago

sid verstrum commented: Had this problem on Ubuntu 18.04 LTS (GNU/Linux 4.15.0-20-generic x86_64), h2o deployed with Docker. The cluster would shut down at the first connection attempt from lynx (and Chrome): #P-Accept FATAL: missing eom sentinel when opening new tcp channel

The jar was being launched from an absolute path:
java -jar /opt/h2o.jar

Changed to this:
cd /opt
java -jar h2o.jar

That solved the problem. Now monitoring whether it stays alive 24/7.
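For anyone launching H2O from a script rather than a shell, the workaround above (change into the jar's directory, then start it by its relative name) could be sketched roughly as follows. The helper names `build_launch` and `launch` are hypothetical, and the thread does not explain why the working directory matters, so treat this only as a way to reproduce the reported workaround, not as a confirmed fix.

```python
import subprocess
from pathlib import PurePosixPath


def build_launch(jar_path):
    """Split an absolute jar path into (command, working_directory).

    Hypothetical helper: mirrors `cd <dir>` followed by `java -jar <name>`.
    """
    jar = PurePosixPath(jar_path)
    return ["java", "-jar", jar.name], str(jar.parent)


def launch(jar_path):
    """Start the jar from its own directory and return the process handle."""
    command, workdir = build_launch(jar_path)
    # cwd=... plays the role of the `cd /opt` step from the comment above.
    return subprocess.Popen(command, cwd=workdir)
```

Calling `launch("/opt/h2o.jar")` would then run `java -jar h2o.jar` with `/opt` as the working directory, which matches the two commands in the comment.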

UPD: I have a feeling you may also hit this problem when the networking stack is misconfigured. E.g. you run H2O in Docker with its own network, Docker runs in an Ubuntu VM, and you access H2O from the host OS. There is plenty of room for misconfiguration here.
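One way to narrow down a suspected networking problem is to check, from each layer (host OS, VM, container), whether a plain TCP connection to the H2O address can be opened at all. Below is a minimal sketch using only the Python standard library; the function name `can_connect` is hypothetical, and the address in the usage comment is simply the one that appears in the log above. Since this thread reports the node dying on a connection attempt, a probe like this is probably best tried against a disposable instance first.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call (address taken from the log in this issue):
# can_connect("192.168.184.26", 54321)
```

If the check succeeds inside the container but fails from the host OS, the problem is more likely in the Docker or VM port forwarding than in H2O itself.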

hasithjp commented 1 year ago

JIRA Issue Migration Info

Jira Issue: PUBDEV-5165
Assignee: New H2O Bugs
Reporter: Liang Wang
State: Open
Fix Version: N/A
Attachments: N/A
Development PRs: N/A