Closed · luoguohao closed this issue 2 years ago
My mistake, I just learned that Ray already supports DDP and XGBoost. But since we are already using the Kubeflow training operators for PyTorch and XGBoost, is it possible to integrate with Kubeflow directly instead of Ray?
Yes. You can modify https://github.com/microsoft/FLAML/blob/46f80dfa16a1396027a2f8fea2460f5795e3ee20/test/tune_example.py#L21. For example, you can replace https://github.com/microsoft/FLAML/blob/46f80dfa16a1396027a2f8fea2460f5795e3ee20/test/tune_example.py#L30 with your own distributed training code.
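To make that concrete, here is a minimal sketch assuming FLAML's `flaml.tune.run` API with a user-defined evaluation function. The helper `launch_kubeflow_training_job` and its parameters are hypothetical placeholders (not part of FLAML or Kubeflow) standing in for whatever you use to submit a PyTorchJob/XGBoostJob via the Kubeflow training operators and collect the validation metric.

```python
from flaml import tune


def launch_kubeflow_training_job(learning_rate: float, num_workers: int) -> float:
    """Placeholder for your distributed training step.

    Replace the body with code that submits a job to the Kubeflow training
    operator, waits for it to finish, and returns the validation metric.
    The dummy value below only keeps this sketch runnable.
    """
    return (learning_rate - 0.01) ** 2 + 0.1 / num_workers


def evaluate_config(config: dict) -> dict:
    """Evaluation function called by flaml.tune for each sampled config.

    Instead of training locally (or via Ray), it delegates the actual
    distributed training and returns the observed metric as a dict,
    which flaml.tune records as the trial result.
    """
    val_loss = launch_kubeflow_training_job(
        learning_rate=config["learning_rate"],
        num_workers=config["num_workers"],
    )
    return {"val_loss": val_loss}


analysis = tune.run(
    evaluate_config,
    config={
        "learning_rate": tune.loguniform(1e-4, 1e-1),
        "num_workers": tune.randint(1, 5),
    },
    metric="val_loss",
    mode="min",
    num_samples=20,       # number of hyperparameter configurations to try
    time_budget_s=3600,   # overall tuning time budget in seconds
)
print(analysis.best_config)
```

With this pattern, FLAML only drives the hyperparameter search loop; each trial's training can run wherever your evaluation function launches it, so a Kubeflow-managed job can take the place of Ray-based distributed training.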
I noticed that FLAML integrates Ray to support distributed hyperparameter tuning, but what if the model itself needs to be trained in a distributed way? How does that work in FLAML, or is it not supported yet?