mc2-project / federated-xgboost

Federated gradient boosted decision tree learning

One machine simulation #23

Open Hiramdu opened 2 years ago

Hiramdu commented 2 years ago

Hi team, could you help me look into this error?

failed in ReconnectLink [20:28:59] /home/ubuntu/gdu/federated-xgboost/rabit/include/rabit/internal/ssl_socket.h:26: PK - Read/write of file failed

Traceback (most recent call last):
  File "/home/ubuntu/gdu/federated-xgboost/demo/basic/demo.py", line 31, in <module>
    bst = fxgb.train(params, dtrain, num_rounds, evals=[(dtrain, "dtrain"), (dval, "dval")])
  File "/home/ubuntu/gdu/federated-xgboost/python-package/federatedxgboost/training.py", line 216, in train
    xgb_model=xgb_model, callbacks=callbacks)
  File "/home/ubuntu/gdu/federated-xgboost/python-package/federatedxgboost/training.py", line 74, in _train_internal
    bst.update(dtrain, i, obj)
  File "/home/ubuntu/gdu/federated-xgboost/python-package/federatedxgboost/core.py", line 1109, in update
    dtrain.handle))
  File "/home/ubuntu/gdu/federated-xgboost/python-package/federatedxgboost/core.py", line 176, in _check_call
    raise XGBoostError(py_str(_LIB.XGBGetLastError()))
federatedxgboost.core.XGBoostError: [20:29:04] /home/ubuntu/gdu/federated-xgboost/include/xgboost/./tree_model.h:295: Check failed: fi->Read(&param, sizeof(TreeParam)) == sizeof(TreeParam) (0 vs. 148) :

rctp commented 2 years ago

Is there any solution for this? I am running into the same problem.

alx338 commented 1 year ago

Hi, same problem here. In my case I am running 2 workers from 2 different folders on 2 different ports. Does anyone have an idea how to solve this?