Could you explain your motivation for setting it to 0 instead of 1 << 30? 1 << 30 should work equivalently to 0, and it is friendlier to our kernel implementation.
System Info
GPU: A100; TensorRT-LLM version: tensorrt_llm-0.13.0.dev2024090300; OS: Ubuntu.
Who can help?
Hi @ncomly-nvidia, @byshiue,
I want to set no_repeat_ngram_size=0 for a Mistral model, but I get the following assertion error:
RuntimeError: [TensorRT-LLM][ERROR] Assertion failed: noRepeatNgramSize.value() > 0 (/home/jenkins/agent/workspace/LLM/main/L0_PostMerge/llm/cpp/tensorrt_llm/executor/samplingConfig.cpp:332)
As per the documentation, the default value is 1 << 30. Is there a way to set the value to 0? If not, can this feature be added?
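For context on why 0 and 1 << 30 behave the same, here is a minimal sketch of no-repeat-ngram banning (this is an illustrative Python model of the constraint, not TensorRT-LLM's kernel): a size of 0 conventionally means "disabled", and a huge size like 1 << 30 is also effectively disabled, because a finite sequence contains no n-gram that long.

```python
def banned_next_tokens(tokens, no_repeat_ngram_size):
    """Return the set of next tokens that would complete a repeated n-gram.

    Sketch of the usual no-repeat-ngram convention: size 0 disables the
    constraint, and any size longer than the sequence can never trigger,
    so 1 << 30 is equivalent to 0 in practice.
    """
    n = no_repeat_ngram_size
    if n <= 0 or len(tokens) + 1 < n:
        return set()  # constraint disabled, or sequence too short to repeat
    # The last n-1 generated tokens form the prefix of a would-be n-gram.
    prefix = tuple(tokens[-(n - 1):]) if n > 1 else tuple()
    banned = set()
    # Scan every n-gram already present; ban its final token if the
    # preceding n-1 tokens match the current prefix.
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n - 1]) == prefix:
            banned.add(tokens[i + n - 1])
    return banned
```

With tokens [1, 2, 3, 1, 2] and size 3, token 3 is banned (it would repeat the trigram 1-2-3), while sizes 0 and 1 << 30 both ban nothing.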
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
Set no_repeat_ngram_size=0 under SamplingParams for a Mistral model.
Expected behavior
The user should be allowed to set this value to 0.
actual behavior
The assertion error shown above is raised.
additional notes
We want to set it to 0, as we do with PyTorch eager-mode inference, where 0 disables the no-repeat-ngram constraint.
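Until 0 is accepted by the backend, one user-side workaround (a hypothetical helper sketched under the assumption, per the maintainer comment above, that the backend accepts 1 << 30 as "effectively disabled") is to translate the familiar 0-means-disabled convention before passing the value through:

```python
# Sentinel matching the documented default; assumed to be accepted by the
# backend, whose assertion only requires noRepeatNgramSize > 0.
DISABLED_NGRAM_SIZE = 1 << 30

def normalize_no_repeat_ngram_size(size: int) -> int:
    """Hypothetical adapter: map 0 ("disabled") to the positive sentinel.

    This lets callers keep the 0-means-disabled convention while
    satisfying a backend that rejects non-positive sizes.
    """
    if size < 0:
        raise ValueError("no_repeat_ngram_size must be >= 0")
    return DISABLED_NGRAM_SIZE if size == 0 else size
```

The result would then be passed as the no_repeat_ngram_size value in SamplingParams instead of the raw 0.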