vvagias opened this issue 2 months ago

Just wanted to point this out: occasionally you get the following error when using the inference benchmark. It seems to be the random generation creating a negative number sometimes, and it truly is random, because just running the command again works perfectly. Not sure if this is a bug or what; just wanted to bring it up.

Can you provide the config file that you use? It seems like a normal distribution with high variance.
```json
{
  "backend": "cserve-debug",
  "base_url": "http://a100-llama8b.user-1404.gcp.centml.org",
  "endpoint": "/cserve/v1/generate",
  "dataset_name": "random",
  "input_token_distribution": ["normal", 20, 10],
  "output_token_distribution": ["uniform", 200, 201],
  "request_distribution": ["poisson", 5],
  "num_of_req": 100,
  "model": "meta-llama/Meta-Llama-3-8B-Instruct",
  "tokenizer": "meta-llama/Meta-Llama-3-8B-Instruct",
  "https_ssl": false,
  "no_prefix": true
}
```
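(Aside for readers unfamiliar with the format: each `*_distribution` entry is a `[name, param, ...]` list. A minimal sketch of how such a spec might be turned into a sampler, assuming numpy-style random generation; `make_sampler` is a hypothetical helper for illustration, not the benchmark's actual code:)

```python
import numpy as np

rng = np.random.default_rng()

# Hypothetical helper: turn a ["normal", 20, 10]-style spec into a sampler.
def make_sampler(spec):
    name, *params = spec
    if name == "normal":    # params = [mean, std]
        return lambda: rng.normal(*params)
    if name == "uniform":   # params = [low, high]
        return lambda: rng.uniform(*params)
    if name == "poisson":   # params = [rate]
        return lambda: rng.poisson(*params)
    raise ValueError(f"unknown distribution: {name}")

input_len = make_sampler(["normal", 20, 10])  # draws can occasionally be <= 0
```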
I see. The input token count has a normal distribution with mean 20 and std 10, so most samples fall between 10 and 30 (one standard deviation), but the left tail can reach zero or below, which is where the issue arises. I'll take a look later; for now, maybe try lower std values like 2 or 5.
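To make the failure mode concrete: with mean 20 and std 10, the probability of a single draw being non-positive is about 2.3%, so across 100 requests a run fails more often than not, which matches the "rerun and it works" behaviour. A minimal sketch of the usual guard, assuming numpy-style sampling (the clamp shown here is a suggested workaround, not the benchmark's actual fix):

```python
import numpy as np

rng = np.random.default_rng()

# P(N(20, 10) <= 0) = P(Z <= -2) ~= 2.3%; over 100 requests the chance of
# at least one bad draw is 1 - 0.977**100 ~= 90%.
draws = rng.normal(20, 10, size=100_000)
print((draws <= 0).mean())  # ~0.023

# Typical guard: clamp each sampled length to at least 1 token.
def sample_input_tokens(mean=20.0, std=10.0):
    return max(1, int(round(rng.normal(mean, std))))
```

Resampling until the draw is positive would preserve the mean slightly better than clamping, which skews it upward a little; either avoids the negative-count error.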