ray-project / xgboost_ray

Distributed XGBoost on Ray
Apache License 2.0

How to use bst.eval_set() and bst.update() with xgboost_ray #248

Open Jeffwan opened 1 year ago

Jeffwan commented 1 year ago

I am trying to adopt xgboost_ray for an XGBoost project and have run into a problem. The original code exercises fine-grained control over the training process. On every iteration it does the following:

    eval_results = self.bst.eval_set(
        evals=[(self.dmat_train, "train"), (self.dmat_valid, "valid")],
        iteration=self.bst.num_boosted_rounds() - 1,
    )
    self.log_info(fl_ctx, eval_results)
    auc = float(eval_results.split("\t")[2].split(":")[1])
    for i in range(self.trees_per_round):
        self.bst.update(self.dmat_train, self.bst.num_boosted_rounds())

    # extract newly added self.trees_per_round using xgboost slicing api
    bst = self.bst[self.bst.num_boosted_rounds() - self.trees_per_round : self.bst.num_boosted_rounds()]

code source: https://github.com/NVIDIA/NVFlare/blob/dev/nvflare/app_opt/xgboost/tree_based/executor.py#L153-L174
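For readers unfamiliar with the string that the snippet above picks apart, here is a minimal sketch of the same parsing step written as a small helper. It assumes the evaluation result is a single tab-separated line of the form "[iteration]", then "name:value" fields, which is what the split("\t") and split(":") calls above imply. The helper name parse_eval_line and the sample string are illustrative only and are not part of xgboost or xgboost_ray.

```python
from typing import Dict


def parse_eval_line(line: str) -> Dict[str, float]:
    """Parse a line such as '[3]\ttrain-auc:0.95\tvalid-auc:0.91' into a dict.

    Hypothetical helper: the input format is an assumption inferred from the
    split("\t") / split(":") logic in the snippet above.
    """
    metrics: Dict[str, float] = {}
    # The first tab-separated field is the iteration tag (e.g. '[3]'); skip it.
    for field in line.strip().split("\t")[1:]:
        # Split on the last ':' so the full metric name is kept intact.
        name, _, value = field.rpartition(":")
        if not name:
            continue  # field had no 'name:value' shape
        metrics[name] = float(value)
    return metrics


# Illustrative sample string, not real training output.
sample = "[3]\ttrain-auc:0.95\tvalid-auc:0.91"
parsed = parse_eval_line(sample)
valid_auc = parsed["valid-auc"]
```

Looking a metric up by name avoids depending on the position of the validation set in the evals list, which the index-based split("\t")[2] version relies on.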

Note: I already obtain the bst object from xgboost_ray.train().

There are two blockers: bst.eval_set() and bst.update(). Because bst is a plain xgboost Booster, it does not accept a RayDMatrix and raises the following error:

  File "/usr/local/lib/python3.8/site-packages/xgboost/core.py", line 1980, in eval_set
    raise TypeError(f"expected DMatrix, got {type(d[0]).__name__}")
TypeError: expected DMatrix, got RayDMatrix

I looked at the documentation and could not find a replacement for these calls, the way there is one for predict. How can I make this work?

/cc @Yard1

Yard1 commented 1 year ago

It looks like you are implementing your own training loop. This goes beyond what xgboost-ray provides out of the box.

You'd most likely need to subclass the internal RayXGBoostActor (xgboost_ray/main.py) and replace the logic inside the predict method, which is run on every worker using normal xgboost (which is configured to communicate with other workers through the rabit tracker). We do not provide an API to pass your own Actor class, so you'll most likely have to monkey-patch it.
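As a rough illustration of the subclass-and-monkey-patch pattern described above, here is a toy sketch that uses stand-in classes only. It does not import xgboost_ray, and every name in it (lib, WorkerActor, create_actors, EvalActor) is hypothetical. Whether the same steps apply unchanged to RayXGBoostActor is not confirmed here; Ray actor classes may need extra handling.

```python
import types

# Hypothetical stand-in for a library module; this is NOT xgboost_ray.
lib = types.SimpleNamespace()


class WorkerActor:
    """Stand-in for a per-worker class whose methods run on each worker."""

    def __init__(self, rank: int):
        self.rank = rank

    def predict(self, data):
        return [x * 2 for x in data]


def create_actors(num_workers: int):
    # The class is looked up on the module at call time, so rebinding
    # lib.WorkerActor before this call changes which class is instantiated.
    return [lib.WorkerActor(rank) for rank in range(num_workers)]


lib.WorkerActor = WorkerActor
lib.create_actors = create_actors


class EvalActor(WorkerActor):
    """Subclass that replaces the body of predict() with custom logic."""

    def predict(self, data):
        # Custom per-worker logic would go here; this toy version sums the data.
        return {"rank": self.rank, "sum": sum(data)}


# Monkey-patch: rebind the name the library resolves when creating workers.
lib.WorkerActor = EvalActor

actors = lib.create_actors(2)
results = [actor.predict([1, 2, 3]) for actor in actors]
```

The patch only takes effect because create_actors resolves the class through the module attribute at call time; code that captured a direct reference to the original class beforehand would keep using it.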

I would be happy to look into making this process smoother by providing developer APIs.

Jeffwan commented 1 year ago

This goes beyond what xgboost-ray provides out of the box.

Thanks. I know this is beyond the current scope. Does xgboost_ray plan to support it later?

We do not provide an API to pass your own Actor class, so you'll have to most likely monkey-patch it.

It seems I would need to write functions similar to train() or predict() that use a custom RayXGBoostActor. That requires me to fully understand the xgboost_ray code. Do you think there is an easier way to support my use case?

Yard1 commented 1 year ago

I think the train() and predict() methods of RayXGBoostActor are relatively straightforward and do not require knowledge of the entire xgboost-ray codebase. I do not believe there's an easier way.

We can add some extra developer APIs to make modifying the training/prediction behavior easier.

I'd be happy to schedule a chat to talk about this, if you think that'll be helpful! Please email me at antoni [at] anyscale.com