sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.
Apache License 2.0

`model_override_args` with server #591

Open ValeKnappich opened 2 months ago

ValeKnappich commented 2 months ago

When launching a server, there is currently no way to pass `model_overide_args`, which would be very useful, e.g. for RoPE scaling.

This is the current content of `sglang/launch_server.py`:

```python
import argparse

from sglang.srt.server import launch_server
from sglang.srt.server_args import ServerArgs

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    ServerArgs.add_cli_args(parser)
    args = parser.parse_args()
    server_args = ServerArgs.from_cli_args(args)

    launch_server(server_args, None)
```

`model_overide_args` is the third argument to `launch_server` and defaults to `None`. Adding a small CLI parser that accepts arbitrary model args would be great, e.g.

```
python -m sglang.launch_server --model_overide_args.rope_scaling.factor 2 --model_overide_args.rope_scaling.type linear
```
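
A minimal sketch of how such dotted flags could be turned into a nested dict, to illustrate the idea rather than prescribe an implementation. The helper name `parse_override_args` is hypothetical and not part of sglang, and the sketch assumes every flag is followed by exactly one separate value (the `--flag=value` form is not handled).

```python
import ast


def parse_override_args(extra_args, prefix="--model_overide_args."):
    """Hypothetical helper: ['--model_overide_args.a.b', '2'] -> {'a': {'b': 2}}."""
    overrides = {}
    it = iter(extra_args)
    for flag in it:
        if not flag.startswith(prefix):
            raise ValueError(f"unrecognized argument: {flag}")
        try:
            raw = next(it)
        except StopIteration:
            raise ValueError(f"missing value for {flag}") from None
        # Interpret numbers/booleans as Python literals; keep anything else as a string.
        try:
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw
        # Walk (and create) the nested dicts named by the dotted path.
        *parents, leaf = flag[len(prefix):].split(".")
        node = overrides
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return overrides


argv = [
    "--model_overide_args.rope_scaling.factor", "2",
    "--model_overide_args.rope_scaling.type", "linear",
]
overrides = parse_override_args(argv)
print(overrides)
```

In `launch_server.py`, the leftover arguments could presumably be collected with `parser.parse_known_args()` instead of `parser.parse_args()`, and the resulting dict (or `None` when it is empty) passed as the third argument to `launch_server`.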

merrymercy commented 1 month ago

@ValeKnappich Agree. Could you send a pull request to support this feature?