sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.
https://sgl-project.github.io/
Apache License 2.0

Cannot Execute Runtime Directly in Docker, with local install #274

Closed lucasavila00 closed 7 months ago

lucasavila00 commented 7 months ago

I'm running the runtime directly, like so:

SGLANG_PORT, additional_ports = handle_port_init(30000, None, 1)
RUNTIME = sgl.Runtime(
    model_path=model_path,
    port=SGLANG_PORT,
    additional_ports=additional_ports,
    model_mode=[] if os.environ.get("DISABLE_FLASH_INFER") == "yes" else ["flashinfer"],
)
print(f"Initialized SGLang runtime: {RUNTIME.url}")

But after upgrading from 0.1.12 to the latest commit, I get this error:

Process Process-1:1:
router init state: Traceback (most recent call last):
  File "/sglang/python/sglang/srt/managers/router/manager.py", line 68, in start_router_process
    model_client = ModelRpcClient(server_args, port_args)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sglang/python/sglang/srt/managers/router/model_rpc.py", line 612, in __init__
    self.model_server.exposed_init_model(0, server_args, port_args)
  File "/sglang/python/sglang/srt/managers/router/model_rpc.py", line 62, in exposed_init_model
    self.model_runner = ModelRunner(
                        ^^^^^^^^^^^^
  File "/sglang/python/sglang/srt/managers/router/model_runner.py", line 275, in __init__
    self.load_model()
  File "/sglang/python/sglang/srt/managers/router/model_runner.py", line 284, in load_model
    model_class = get_model_cls_by_arch_name(architectures)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sglang/python/sglang/srt/managers/router/model_runner.py", line 41, in get_model_cls_by_arch_name
    model_arch_name_to_cls = import_model_classes()
                             ^^^^^^^^^^^^^^^^^^^^^^
  File "/sglang/python/sglang/srt/managers/router/model_runner.py", line 33, in import_model_classes
    for module_path in (Path(sglang.__file__).parent / "srt" / "models").glob("*.py"):
                        ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/pathlib.py", line 871, in __new__
    self = cls._from_parts(args)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/pathlib.py", line 509, in _from_parts
    drv, root, parts = self._parse_args(args)
                       ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/pathlib.py", line 493, in _parse_args
    a = os.fspath(a)
        ^^^^^^^^^^^^
TypeError: expected str, bytes or os.PathLike object, not NoneType

detoken init state: init ok

Traceback (most recent call last):
  File "/handler.py", line 32, in <module>
    RUNTIME = sgl.Runtime(
              ^^^^^^^^^^^^
  File "/sglang/python/sglang/api.py", line 40, in Runtime
    return Runtime(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/sglang/python/sglang/srt/server.py", line 598, in __init__
    raise RuntimeError("Launch failed. Please see the error messages above.")
RuntimeError: Launch failed. Please see the error messages above.

I fixed it by monkey-patching sglang.__file__ with the required path:

import sglang

sglang.__file__ = "/sglang/python/sglang/srt"

For context, this code runs within the RunPod serverless runtime, and the full Docker image is available here: https://github.com/lucasavila00/LmScript/tree/main/runpod-serverless
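The TypeError in the traceback is what Path() raises when it is handed None. One well-known way for a package's __file__ to be None is for the import system to resolve it as a namespace package (a directory with no __init__.py). Whether that is what happens in this Docker setup is an assumption, not something the thread confirms. The sketch below reproduces the failure mode in isolation, using a throwaway package name (nsdemo_pkg, hypothetical) rather than sglang:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_namespace_package_file():
    """Import a throwaway namespace package and try to build a Path from its __file__."""
    with tempfile.TemporaryDirectory() as tmp:
        # A directory without __init__.py is importable as a namespace package.
        (Path(tmp) / "nsdemo_pkg").mkdir()
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            pkg = importlib.import_module("nsdemo_pkg")
            file_attr = getattr(pkg, "__file__", None)
            try:
                Path(file_attr)
                error = None
            except TypeError as exc:
                # Same exception type as in the reported traceback.
                error = exc
            return file_attr, error
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("nsdemo_pkg", None)


file_attr, error = demo_namespace_package_file()
print(file_attr)
print(type(error).__name__)
```

Printing sglang.__file__ and sglang.__path__ inside the container is a quick way to check whether the same thing is happening there.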

Qubitium commented 7 months ago

Check your Runtime args. Pretty sure model_mode is fully deprecated.

lucasavila00 commented 7 months ago

@Qubitium thanks. I have already updated https://github.com/lucasavila00/LmScript/tree/main/runpod-serverless to use a recent commit, b2eb080501b4b4a0d72eb5a0e6be30d43811dcbd, and fixed the model_mode usage.

I haven't tried running sglang without the workaround on the recent commit. I'll do it later and report the results here.

lucasavila00 commented 7 months ago

I investigated, and the issue only happens when running in Docker with a local, editable installation (pip install -e).

I have a minimal reproduction here: https://github.com/lucasavila00/sglang-repro-docker. Clone it and run docker-compose up --build to reproduce the error.

I think __file__ usage should be avoided; other projects are moving away from it: https://github.com/scikit-learn/scikit-learn/issues/20081
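A minimal sketch of one way to enumerate a package's modules without touching __file__: iterate over the package's __path__ with pkgutil.iter_modules. This is an illustration under assumptions, not necessarily what sglang ended up doing. The helper name list_submodules is hypothetical, and the standard library's json package stands in for sglang.srt.models so the snippet runs anywhere:

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the dotted names of the direct submodules of a package."""
    package = importlib.import_module(package_name)
    # __path__ is set for regular and namespace packages alike,
    # so this does not depend on __file__ being a real path.
    return [
        info.name
        for info in pkgutil.iter_modules(package.__path__, package_name + ".")
    ]


# Demonstrated on the standard library's json package; for the code in the
# traceback the analogous (assumed) argument would be "sglang.srt.models".
names = list_submodules("json")
print(names)
```

Each returned name can then be passed to importlib.import_module, which replaces the Path(...).glob("*.py") loop from the traceback with a lookup that goes through the import system instead of the filesystem.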

Also, I wouldn't be surprised if the issue reported here https://github.com/sgl-project/sglang/pull/242#issuecomment-1968250092 was this same issue, given that both Dockerfiles install it with pip install -e.

Qubitium commented 7 months ago

@lucasavila00 Please try this PR https://github.com/sgl-project/sglang/pull/288 and confirm whether it fixes your Docker env issue caused by the __file__ usage.

lucasavila00 commented 7 months ago

#288 fixes it on my end.