sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.
https://sglang.readthedocs.io/en/latest/
Apache License 2.0

ImportError: cannot import name 'get_cuda_stream' from 'triton.runtime.jit' in triton-nightly (V100) #383

Closed nenomigami closed 3 weeks ago

nenomigami commented 5 months ago

Description

Encountered an ImportError when starting the server with triton-nightly on a V100 GPU. While loading Command-R, get_cuda_stream cannot be imported from triton.runtime.jit; the traceback below shows the failure is raised while importing sglang/srt/models/commandr.py. The error aborts server initialization. It does not occur after reverting to a version from before Command-R support was added.
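
A quick way to confirm the symptom described above is to probe the installed triton for the name directly, without going through torch. This is only a diagnostic sketch: module_exports is a hypothetical helper name, not part of sglang, torch, or triton, and it only reports whether the attribute exists in the environment it runs in.

```python
import importlib


def module_exports(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports cleanly and defines `attr`.

    Hypothetical helper for diagnosis only; not part of any library.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or fails to import.
        return False
    return hasattr(module, attr)


# Example call for the name from this issue (result depends on the
# triton build installed in the active environment):
#   module_exports("triton.runtime.jit", "get_cuda_stream")
```

If the call returns False for get_cuda_stream in the same environment where torch is installed, the installed torch and triton builds disagree about that name, which matches the ImportError reported here.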

Environment

  File "/mnt/usr/study/sglang2/python/sglang/srt/models/commandr.py", line 54, in <module>
    def layer_norm_func(hidden_states, weight, variance_epsilon):
  File "/mnt/tmp/sglang/lib/python3.10/site-packages/torch/__init__.py", line 1723, in compile
    return torch._dynamo.optimize(backend=backend, nopython=fullgraph, dynamic=dynamic, disable=disable)(model)
  File "/mnt/tmp/sglang/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 610, in optimize
    compiler_config=backend.get_compiler_config()
  File "/mnt/tmp/sglang/lib/python3.10/site-packages/torch/__init__.py", line 1571, in get_compiler_config
    from torch._inductor.compile_fx import get_patched_config_dict
  File "/mnt/tmp/sglang/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 38, in <module>
    from .fx_passes.joint_graph import joint_graph_passes
  File "/mnt/tmp/sglang/lib/python3.10/site-packages/torch/_inductor/fx_passes/joint_graph.py", line 8, in <module>
    from ..pattern_matcher import (
  File "/mnt/tmp/sglang/lib/python3.10/site-packages/torch/_inductor/pattern_matcher.py", line 28, in <module>
    from .lowering import fallback_node_due_to_unsupported_type
  File "/mnt/tmp/sglang/lib/python3.10/site-packages/torch/_inductor/lowering.py", line 4768, in <module>
    import_submodule(kernel)
  File "/mnt/tmp/sglang/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 1492, in import_submodule
    importlib.import_module(f"{mod.__name__}.{filename[:-3]}")
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/mnt/tmp/sglang/lib/python3.10/site-packages/torch/_inductor/kernel/bmm.py", line 4, in <module>
    from ..select_algorithm import (
  File "/mnt/tmp/sglang/lib/python3.10/site-packages/torch/_inductor/select_algorithm.py", line 25, in <module>
    from .codegen.triton import texpr, TritonKernel, TritonPrinter, TritonScheduling
  File "/mnt/tmp/sglang/lib/python3.10/site-packages/torch/_inductor/codegen/triton.py", line 26, in <module>
    from ..triton_heuristics import AutotuneHint
  File "/mnt/tmp/sglang/lib/python3.10/site-packages/torch/_inductor/triton_heuristics.py", line 43, in <module>
    from triton.runtime.jit import get_cuda_stream, KernelInterface
ImportError: cannot import name 'get_cuda_stream' from 'triton.runtime.jit' (/mnt/tmp/sglang/lib/python3.10/site-packages/triton/runtime/jit.py)
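
In the traceback above, the ImportError is raised at the moment torch.compile wraps layer_norm_func, that is, while commandr.py is being imported. One possible workaround, sketched below under that assumption, is to fall back to the uncompiled function when the compile step cannot be set up. compile_or_eager and its compile_fn parameter are hypothetical names introduced for illustration; this is not how sglang handles it, and other torch versions may defer the failing import to the first call, where this wrapper would not catch it.

```python
import warnings
from typing import Callable, Optional


def compile_or_eager(fn: Callable, compile_fn: Optional[Callable] = None) -> Callable:
    """Try to compile `fn`; return `fn` unchanged if compilation cannot be set up.

    `compile_fn` defaults to torch.compile. It is injectable (an assumption
    made for this sketch) so the fallback path can be exercised without torch.
    """
    try:
        if compile_fn is None:
            import torch  # deferred so a missing torch also falls back

            compile_fn = torch.compile
        return compile_fn(fn)
    except ImportError as exc:
        # Covers the case in this issue: wrapping the function pulls in
        # torch._inductor, which fails to import a name from triton.
        warnings.warn(f"compile unavailable ({exc}); running {fn.__name__} eagerly")
        return fn
```

Used in place of a bare decorator, this would let the module import succeed and run the function eagerly, at the cost of losing whatever speedup compilation was meant to provide. It hides the version mismatch rather than fixing it.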

github-actions[bot] commented 3 weeks ago

This issue has been automatically closed due to inactivity. Please feel free to reopen it if needed.