Enhance SDG to Support Multiple OpenAI Endpoints for Improved Performance

npalaska commented 1 month ago

Currently, SDG only supports a single OpenAI endpoint. However, adding support for multiple OpenAI endpoints could significantly improve overall SDG performance. We have observed nearly a 50% improvement in total SDG timing by running two replicas of the vLLM server instead of one and load balancing them internally.

Consider the following scenarios:

Scenario 1

Teacher model sharded across 2 GPUs -> endpoint A
Teacher model sharded across 2 GPUs -> endpoint B

Scenario 2

Teacher model sharded across 4 GPUs -> endpoint A

Running SDG with Scenario 1 showed nearly 50% improvement over Scenario 2. If SDG can work with multiple replicas of vLLM, we can incorporate Scenario 1 for better performance.

shivchander commented 1 month ago

@njhill would be good to have your thoughts on this

njhill commented 1 month ago

I think it's a good option to have in the toolbox for throughput-maximization experimentation. A wrapper client could be used which just wraps two different clients configured with different endpoints.
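A minimal sketch of what such a wrapper client could look like, assuming simple round-robin selection is acceptable. `RoundRobinClient` is a hypothetical name, not part of SDG or the `openai` package. It makes no assumptions about the wrapped objects beyond attribute access, so it works with any client objects that expose the same interface.

```python
import itertools
import threading


class RoundRobinClient:
    """Hypothetical wrapper that spreads calls across several clients.

    Each top-level attribute access (for example `.completions`) is
    forwarded to the next wrapped client in round-robin order.
    """

    def __init__(self, clients):
        clients = list(clients)
        if not clients:
            raise ValueError("at least one client is required")
        self._clients = clients
        self._cycle = itertools.cycle(clients)
        # Guard the shared iterator so concurrent callers do not race on it.
        self._lock = threading.Lock()

    def _next_client(self):
        with self._lock:
            return next(self._cycle)

    def __getattr__(self, name):
        # __getattr__ runs only for attributes missing on the wrapper itself.
        # Private names are never forwarded, which avoids infinite recursion
        # if the wrapper is inspected before __init__ has finished.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._next_client(), name)
```

With the `openai` Python package, usage might look like `client = RoundRobinClient([OpenAI(base_url=url_a, api_key=key), OpenAI(base_url=url_b, api_key=key)])` followed by the usual `client.completions.create(...)` calls, where `url_a`, `url_b`, and `key` are placeholders for the two vLLM endpoints and their API key. Round-robin ignores how busy each replica is, so a least-outstanding-requests policy may balance uneven workloads better.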

russellb commented 1 month ago

This seems like a pretty normal load balancer use case?

instructlab / sdg

Enhance SDG to Support Multiple OpenAI Endpoints for Improved Performance #216