OpenLLMAI / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
https://openrlhf.readthedocs.io/
Apache License 2.0

Avoid monkey patching vLLM #297

Open Atry opened 1 month ago

Atry commented 1 month ago

Currently, vLLM's vllm.worker.worker.Worker is replaced on the fly with openrlhf.trainer.ray.vllm_worker_wrap.WorkerWrap as a monkey patch.

The monkey patch could be avoided by making init_process_group and update_weight global functions and invoking them via __ray_call__.

__ray_call__ is not yet documented, but it is expected to be documented soon since it is marked as P1 in https://github.com/ray-project/ray/issues/45068.

Ray already uses __ray_call__ internally to initialize an NCCL group, which is similar to the OpenRLHF use case:

https://github.com/ray-project/ray/blob/4c1519be13087f4ccb47431f9ebc3dc446182775/python/ray/experimental/channel/torch_tensor_nccl_channel.py#L275-L282
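For illustration, here is a minimal pure-Python sketch of the pattern being proposed. It does not depend on Ray: the Worker class and its __ray_call__ method below are stand-ins that mimic the hidden actor method Ray provides (actor.__ray_call__.remote(fn, *args) runs fn with the actor instance as its first argument), and update_weight is a hypothetical global function standing in for the method currently monkey-patched onto the vLLM worker.

```python
class Worker:
    """Stand-in for vllm.worker.worker.Worker (no monkey patching needed)."""

    def __init__(self):
        self.weights = {}

    def __ray_call__(self, fn, *args, **kwargs):
        # Mimics Ray's hidden actor method: execute an arbitrary
        # function with the actor instance as the first argument.
        return fn(self, *args, **kwargs)


def update_weight(self: Worker, name, value):
    # Global function replacing the WorkerWrap method: it receives the
    # worker instance explicitly instead of being patched onto the class.
    self.weights[name] = value
    return name


worker = Worker()
worker.__ray_call__(update_weight, "layer0", 0.5)
```

With real Ray actors the call site would be worker_handle.__ray_call__.remote(update_weight, ...), so no subclass or class replacement is required on the vLLM side.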

hijkzzz commented 1 month ago

Thanks~