Token accuracy not as expected for starcoderbase-15b model with rolling batch type vllm

Description

With rolling batch type vllm, the starcoderbase model returns junk output and generates fewer tokens than expected. The accuracy of the generated text is also low.

Expected Behavior

The expected number of tokens should be generated, and the accuracy of the generated text should be good.

How to Reproduce?


Steps to reproduce


serving.properties

engine=Python
option.entryPoint=djl_python.huggingface
option.rolling_batch=vllm
option.max_rolling_batch_size=32
option.output_formatter=json
option.tensor_parallel_degree=1

Dataset

Tested with the code-contests dataset.
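
A minimal sketch of a client that could be used to reproduce the problem with a single prompt. It is not the script used for the original test. The endpoint URL, the prompt, the `max_new_tokens` value, and the helper names (`build_payload`, `extract_text`, `generate`) are all assumptions for illustration. It also assumes the server accepts an `inputs`/`parameters` JSON body and that the `json` output formatter returns a `generated_text` field.

```python
import json
import urllib.request

# Assumption: DJL Serving is running locally on port 8080.
ENDPOINT = "http://localhost:8080/invocations"


def build_payload(prompt, max_new_tokens=256):
    """Build the request body (assumed inputs/parameters shape)."""
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}


def extract_text(body):
    """Return the generated text from a JSON response body.

    Accepts either a single object or a list with one object,
    since the exact response shape is an assumption here.
    """
    data = json.loads(body)
    if isinstance(data, list):
        data = data[0]
    return data.get("generated_text", "")


def generate(prompt, max_new_tokens=256, endpoint=ENDPOINT):
    """POST one prompt to the endpoint and return the generated text."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(build_payload(prompt, max_new_tokens)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return extract_text(response.read().decode("utf-8"))


# Example usage (requires a running server, so it is left commented out):
# text = generate("def fibonacci(n):", max_new_tokens=256)
# print(text)
```

Running the same prompt and `max_new_tokens` value with and without `option.rolling_batch=vllm` should make it easier to compare output length and quality between the two configurations.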

deepjavalibrary / djl-serving