NVIDIA / NVFlare

NVIDIA Federated Learning Application Runtime Environment
https://nvidia.github.io/NVFlare/
Apache License 2.0
[BUG] Assertion Error while running xgboost example #2791

Closed hwpang closed 2 weeks ago

hwpang commented 1 month ago

Describe the bug

I am new to NVFlare. I tried running the example at ~/NVFlare/examples/advanced/xgboost/histogram-based, but I encountered the following error:

  File "/home/haowei/miniforge3/envs/nvflare_env/lib/python3.10/site-packages/nvflare/app_opt/xgboost/histogram_based/executor.py", line 277, in train
    with xgb.collective.CommunicatorContext(**communicator_env):
  File "/home/haowei/miniforge3/envs/nvflare_env/lib/python3.10/site-packages/xgboost/collective.py", line 280, in __enter__
    assert is_distributed()
AssertionError

To Reproduce

Steps to reproduce the behavior:

  1. Go to ~/NVFlare/examples/advanced/xgboost and run prepare_data.sh and prepare_job_config.sh
  2. Go into the histogram-based folder and run run_experiment_simulator.sh
  3. See the error trace shown above

Expected behavior

The example should complete without error.

This seems to be related to https://github.com/dmlc/xgboost/pull/10503. However, note that I am using xgboost 2.1.1, which should already contain that fix.

YuanTingHsieh commented 1 month ago

@hwpang hello, thanks for raising the issue.

Yes, the federated API in the xgboost repo has been changing in their newer releases. We have been working closely with them, and we are updating our examples to match.

For now, please use one of the following:

  1. NVFlare main branch + xgboost from the nightly builds here: https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/list.html?prefix=federated-secure/
  2. NVFlare 2.4.0 with xgboost 2.0.3

YuanTingHsieh commented 3 weeks ago

OK, after our fixes, please use one of the following:

  1. NVFlare main branch + xgboost from the nightly builds here: https://s3-us-west-2.amazonaws.com/xgboost-nightly-builds/list.html?prefix=federated-secure/
  2. NVFlare 2.4 branch with xgboost 2.1.1
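
For anyone who wants to check their environment against the second option before running the example, here is a minimal sketch. It is not part of NVFlare: the helper names `installed_version` and `is_suggested_release_combo` are made up for illustration, and it assumes that an install from the 2.4 branch reports a version string beginning with `2.4`. It cannot detect the first option, since a main-branch install plus a nightly xgboost build is not identifiable from version strings alone.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_suggested_release_combo(nvflare_version: str, xgboost_version: str) -> bool:
    """Check for the 'NVFlare 2.4 branch with xgboost 2.1.1' pairing from this thread.

    Assumption: a 2.4-branch install reports a version whose first two
    components are 2 and 4 (for example "2.4.1").
    """
    is_nvflare_24 = nvflare_version.split(".")[:2] == ["2", "4"]
    return is_nvflare_24 and xgboost_version == "2.1.1"


if __name__ == "__main__":
    nvflare_v = installed_version("nvflare")
    xgboost_v = installed_version("xgboost")
    print(f"nvflare: {nvflare_v}, xgboost: {xgboost_v}")
    if nvflare_v is None or xgboost_v is None:
        print("One or both packages are not installed in this environment.")
    elif is_suggested_release_combo(nvflare_v, xgboost_v):
        print("Versions match the suggested NVFlare 2.4 + xgboost 2.1.1 pairing.")
    else:
        print("Versions differ from the suggested pairing; the example may fail.")
```

Run it inside the same environment that runs the example (nvflare_env in the traceback above) so that it reads the packages the simulator actually uses.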

hwpang commented 2 weeks ago

Thanks, this works!