sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.
https://sglang.readthedocs.io/en/latest/
Apache License 2.0

Pydantic>2 causes issues loading models #126

Closed nivibilla closed 2 months ago

nivibilla commented 8 months ago

Hi,

Related to this issue,

Because the pydantic version is not pinned, a version greater than 2 gets installed by default, which causes issues when loading models.

However, pip install pydantic==1.10.14 resolves this issue, and I can load models normally now.

comaniac commented 8 months ago

I'm aware of this issue and plan to upgrade to pydantic 2.0 in the next few days.

comaniac commented 8 months ago

btw could you post the error messages here to help locate the issue? Thanks.

nivibilla commented 8 months ago

Failed to import transformers.generation.utils because of the following error (look up to see its traceback): 'FieldInfo' object has no attribute 'required'

comaniac commented 8 months ago

I just found that this is an issue in transformers: https://github.com/huggingface/transformers/issues/27273. It means transformers depended on Pydantic 1.x. However, a recent PR, https://github.com/huggingface/transformers/pull/28728, removes the Pydantic 1.x requirement and has been merged without issues. So at some point transformers started supporting Pydantic 2.x, although I cannot locate the exact PR that fixed this. Could you check which version of transformers causes the problem, and try upgrading it to see if that helps?

cc @merrymercy @Ying1123 @hnyls2002 @rlouf you may be interested in this issue as well.

nivibilla commented 8 months ago

Hmm, I didn't manually install transformers. It was installed in the process of installing sglang. Maybe the dependency on pydantic<2 hasn't made it to pip yet.

comaniac commented 8 months ago

transformers is not specified in sglang's installation dependency, so it might be installed due to one of its other dependencies. You could still look at the version and try to manually upgrade it just for testing.
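For anyone following along, a minimal sketch of the version check suggested above. It relies only on the standard library's importlib.metadata; the helper names installed_versions and major_version are made up for illustration, and the package list is simply the set of libraries mentioned in this thread.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def major_version(version):
    """Return the leading integer of a version string such as '2.6.0'."""
    return int(version.split(".")[0])


# Libraries mentioned in this thread; adjust the list as needed.
report = installed_versions(["pydantic", "transformers", "outlines", "vllm", "sglang"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

If pydantic reports a major version of 2 alongside an older transformers, that combination is the one suspected in this thread.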

rlouf commented 8 months ago

It's strange, I have never had a problem with Outlines.

fozziethebeat commented 8 months ago

I found another issue regarding pydantic 2.0 that happens unpredictably.

I built a docker image and when trying to fetch streaming results with the OpenAI endpoint I got this error:

    |   File "/usr/local/lib/python3.9/dist-packages/sglang/srt/server.py", line 265, in gnerate_stream_resp
    |     yield f"data: {chunk.json(exclude_unset=True, ensure_ascii=False)}\n\n"
    |   File "/usr/local/lib/python3.9/dist-packages/pydantic/main.py", line 1056, in json
    |     raise TypeError('`dumps_kwargs` keyword arguments are no longer supported.')
    | TypeError: `dumps_kwargs` keyword arguments are no longer supported.

This is due to features that are no longer supported in Pydantic 2.0 and later. After some checking, it seems I installed incompatible versions of a few key libraries:

vllm==0.3.0
pydantic==2.6.0
pydantic_core==2.16.1
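The chunk.json(exclude_unset=True, ensure_ascii=False) call in the traceback forwards ensure_ascii to json.dumps, which Pydantic 2.x rejects. Below is a hedged sketch of one way to serialize a chunk that should work under either major version. It is not the fix that sglang adopted; chunk_to_json and sse_line are hypothetical helper names.

```python
import json


def chunk_to_json(chunk):
    """Serialize a Pydantic model without passing json.dumps kwargs to .json()."""
    if hasattr(chunk, "model_dump"):
        # Pydantic 2.x: dump to a plain dict first.
        data = chunk.model_dump(exclude_unset=True)
    else:
        # Pydantic 1.x: .dict() is the equivalent call.
        data = chunk.dict(exclude_unset=True)
    # ensure_ascii is applied here rather than inside the model method.
    return json.dumps(data, ensure_ascii=False)


def sse_line(chunk):
    """Format one server-sent-events line, as in the failing yield statement."""
    return f"data: {chunk_to_json(chunk)}\n\n"
```

One caveat: this assumes every field dumps to a JSON-serializable Python value. For models with fields such as datetimes, Pydantic 2.x offers model_dump(mode="json"), which converts them first.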

I built the image via this Dockerfile setup:

FROM nvidia/cuda:12.2.0-devel-ubuntu20.04

ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -y && apt-get install -y python3.9 python3.9-distutils python3.9-dev curl git
RUN curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
RUN python3.9 get-pip.py

WORKDIR /app

RUN --mount=type=cache,target=/root/.cache pip3 install "sglang[all]"

I did confirm that pydantic==1.10.13 resolves these issues; it just happens not to be what got installed with my Docker setup.

comaniac commented 8 months ago

Yes, I got this issue too. I should probably upgrade to Pydantic v2 first (I was waiting for the outlines integration...)

hnyls2002 commented 8 months ago

Hi, outlines is now imported as a dependency (#168). We now specify outlines>=0.0.27 so that pydantic>2 is compatible with both outlines and vllm. Please let me know if there are any problems with outlines or the dependency versions.

fozziethebeat commented 7 months ago

I tried using the outlines feature and I think someone (either outlines themselves or sglang) needs to add a datasets dependency. Just doing a simple

pip install "sglang[all]"

and then trying regex-constrained decoding triggered a failure. Installing datasets fixed it.

rlouf commented 7 months ago

Could you paste the stack trace here?

fozziethebeat commented 7 months ago

Well, seems I can't re-create it anymore.

I am however seeing a separate problem. When running LLaVa 1.6 via

python -m sglang.launch_server --model-path liuhaotian/llava-v1.6-vicuna-7b --tokenizer-path SurfaceData/llava-v1.6-vicuna-7b-hf --chat-template vicuna_v1.1 --port 30000

And then running the driver_pydantic_wizard_gen method in sglang/examples/usage/json_decode.py, I get the following horrific stack trace:

../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [35,0,0], thread: [102,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
Exception in ModelRpcClient:
Traceback (most recent call last):
  File "/home/fozziethebeat/devel/sglang/python/sglang/srt/managers/router/model_rpc.py", line 176, in exposed_step
    self.forward_step()
  File "/home/fozziethebeat/anaconda3/envs/sglang/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/fozziethebeat/devel/sglang/python/sglang/srt/managers/router/model_rpc.py", line 203, in forward_step
    self.forward_decode_batch(self.running_batch)
  File "/home/fozziethebeat/devel/sglang/python/sglang/srt/managers/router/model_rpc.py", line 491, in forward_decode_batch
    next_token_ids, _ = batch.sample(logits)
  File "/home/fozziethebeat/devel/sglang/python/sglang/srt/managers/router/infer_batch.py", line 472, in sample
    sampled_index = torch.multinomial(probs_sort, num_samples=1)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

This repeats infinitely until I kill the server.
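The trace itself suggests passing CUDA_LAUNCH_BLOCKING=1 so the error is reported at the call that actually triggers it. A small sketch of relaunching the same server command with that variable set follows; debug_launch is a hypothetical helper, and the flags are only the ones from the command above.

```python
import os


def debug_launch(model_path, tokenizer_path, chat_template, port):
    """Build the launch command plus an environment with synchronous CUDA launches."""
    cmd = [
        "python", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--tokenizer-path", tokenizer_path,
        "--chat-template", chat_template,
        "--port", str(port),
    ]
    # Copy the current environment and add the variable named in the trace.
    env = dict(os.environ, CUDA_LAUNCH_BLOCKING="1")
    return cmd, env


cmd, env = debug_launch(
    "liuhaotian/llava-v1.6-vicuna-7b",
    "SurfaceData/llava-v1.6-vicuna-7b-hf",
    "vicuna_v1.1",
    30000,
)
# To actually start the server:
#   import subprocess
#   subprocess.run(cmd, env=env, check=True)
```

Synchronous launches slow the server down noticeably, so this is only meant for capturing a more accurate stack trace.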

nivibilla commented 7 months ago

When installing from source, I'm still getting this error:

Failed to import transformers.generation.utils because of the following error (look up to see its traceback): 'FieldInfo' object has no attribute 'required'

Downgrading pydantic fixes it again.

github-actions[bot] commented 2 months ago

This issue has been automatically closed due to inactivity. Please feel free to reopen it if needed.