vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Bug] params Type is not right? #9835

Open cqray1990 opened 1 month ago

cqray1990 commented 1 month ago

Your current environment

vllm 0.6.3

🐛 Describe the bug

here, params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.25, temperature=0.7, top_p=0.8, top_k=-1, min_p=0.0, seed=None, stop=[], stop_token_ids=[151645], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2048, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None), guided_decoding=None

But when the following code executes, it raises the error "Either SamplingParams or PoolingParams must be provided.":

    # Create a SequenceGroup based on SamplingParams or PoolingParams
    if isinstance(params, SamplingParams):
        seq_group = self._create_sequence_group_with_sampling(
            request_id,
            seq,
            params,
            arrival_time=arrival_time,
            lora_request=lora_request,
            trace_headers=trace_headers,
            prompt_adapter_request=prompt_adapter_request,
            encoder_seq=encoder_seq,
            priority=priority)
    elif isinstance(params, PoolingParams):
        seq_group = self._create_sequence_group_with_pooling(
            request_id,
            seq,
            params,
            arrival_time=arrival_time,
            lora_request=lora_request,
            prompt_adapter_request=prompt_adapter_request,
            encoder_seq=encoder_seq,
            priority=priority)
    else:
        raise ValueError(
            "Either SamplingParams or PoolingParams must be provided.")

DarkLight1337 commented 1 month ago

Can you show the code which you used to pass sampling params?
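
The thread does not establish a cause, so the following is only a sketch of one hypothetical way an isinstance check can fail while the printed value still reads SamplingParams(...): the object was built from a different class object that happens to share the name (for example, the same package imported under two paths). Everything here (make_params_class, ExpectedParams, OtherParams, describe_type) is made up for illustration and is not vLLM code.

```python
from dataclasses import dataclass


def make_params_class():
    """Return a brand-new class named SamplingParams on every call.

    Hypothetical stand-in for one class being defined twice, e.g. when a
    package ends up imported under two different module paths.
    """
    @dataclass
    class SamplingParams:
        temperature: float = 0.7
        top_p: float = 0.8

    return SamplingParams


# Assumption: ExpectedParams plays the role of the class the engine checks
# against; OtherParams is a look-alike with the same name and fields.
ExpectedParams = make_params_class()
OtherParams = make_params_class()

params = OtherParams()


def describe_type(obj):
    """Show which class object an instance was really built from."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__} (id={id(cls)})"


# The printed form is identical, but the classes are distinct objects.
same_repr = repr(params) == repr(ExpectedParams())
passes_check = isinstance(params, ExpectedParams)

print(same_repr)      # True
print(passes_check)   # False
print(describe_type(params))
print(describe_type(ExpectedParams()))
```

In the real setup, printing type(params) and type(params).__module__ right before the failing branch, and comparing the class with the SamplingParams imported from vllm, would show whether this is what is happening or whether the cause lies elsewhere.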