PygmalionAI / aphrodite-engine

PygmalionAI's large-scale inference engine
https://pygmalion.chat
GNU Affero General Public License v3.0

Error when `top_logprobs` value is `-inf` #183

Closed: miku448 closed this issue 3 months ago

miku448 commented 5 months ago

So, in some cases, some values in the top_logprobs are -inf.

The request itself completes fine, but when the server tries to build the output JSON it produces the following error:

INFO 12-22 15:32:55 async_aphrodite.py:110] Finished request cmpl-b98ecf403c8848a4aad44d4f537cfcf8.
INFO:     127.0.0.1:32890 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
//...
ValueError: Out of range float values are not JSON compliant
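
A minimal sketch that reproduces the same ValueError with Python's standard json module, for anyone who wants to confirm the cause locally. It assumes the response serializer runs json.dumps in strict mode (allow_nan=False); the traceback above is elided, so that part is an assumption, and strict_dump_fails is a hypothetical helper name used only for illustration.

```python
import json

# One entry shaped like the top_logprobs printed further down in this issue.
top_logprobs = [{"▁anno": 0.0, "<s>": float("-inf")}]

# Default mode does not raise, but it emits the token -Infinity,
# which is not valid JSON and is rejected by strict parsers.
lenient = json.dumps(top_logprobs, ensure_ascii=False)


def strict_dump_fails(payload) -> bool:
    """Return True if strict JSON serialization rejects the payload."""
    try:
        json.dumps(payload, allow_nan=False)
    except ValueError:
        # Raised for inf, -inf and NaN when allow_nan is False.
        return True
    return False


print(lenient)
print(strict_dump_fails(top_logprobs))
```

If the strict call fails here but succeeds once the -inf entries are replaced by finite numbers, the non-finite logprobs are the likely cause of the 500 response.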

So, I went to this line: https://github.com/PygmalionAI/aphrodite-engine/blob/main/aphrodite/endpoints/openai/api_server.py#L234

And added this:

print(logprobs)
# clamp every top_logprobs value to a minimum of -1000
logprobs.top_logprobs = [
  {k: v if v > -1000 else -1000 for k, v in top_logprob.items()}
  for top_logprob in logprobs.top_logprobs
]
return logprobs

This makes it work, and that print(logprobs) returns:

text_offset=[0] token_logprobs=[0.0] tokens=['▁anno'] top_logprobs=[{'▁anno': 0.0, '<s>': -inf, '<0x00>': -inf, '<0x03>': -inf, '<0x04>': -inf, '<unk>': -inf, '</s>': -inf, '<0x02>': -inf, '<0x05>': -inf, '<0x01>': -inf}]

Are these -inf values expected? Are they the reason the JSON output fails to serialize?

@AlpinDale if you want I can open a PR with that small hotfix, but for sure there's a more elegant solution.
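
One possible shape for a tidier version of the hotfix, as a sketch only. The names LOGPROB_FLOOR, clamp_logprob and sanitize_top_logprobs are hypothetical and are not part of aphrodite-engine; the -1000 floor is the same value used in the hotfix above. Unlike the hotfix, it returns a new list instead of mutating the logprobs object, and it maps NaN to the floor explicitly.

```python
import json
import math

LOGPROB_FLOOR = -1000.0  # same floor as the hotfix; hypothetical constant name


def clamp_logprob(value: float, floor: float = LOGPROB_FLOOR) -> float:
    """Map -inf, NaN and anything below the floor to the floor.

    Other values pass through unchanged.
    """
    if math.isnan(value) or value < floor:
        return floor
    return value


def sanitize_top_logprobs(top_logprobs):
    """Return a new list of dicts with every log probability clamped."""
    return [
        {token: clamp_logprob(lp) for token, lp in entry.items()}
        for entry in top_logprobs
    ]


raw = [{"▁anno": 0.0, "<s>": float("-inf"), "<unk>": float("-inf")}]
clean = sanitize_top_logprobs(raw)

# Strict serialization now succeeds because every value is finite.
print(json.dumps(clean, allow_nan=False, ensure_ascii=False))
```

The trade-off is that a client can no longer tell a truly impossible token from one with a very low probability, so the floor should sit well below any logprob the model can realistically produce.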

AlpinDale commented 5 months ago

In what scenario are you getting -inf as the top logprob? That's really weird

miku448 commented 5 months ago

I was sending logit_bias and thought it might be the issue, but I removed it and the error persists

{'301': 100.0, '528': 100.0, '626': 100.0, '766': 100.0, '885': 100.0, '1153': 100.0, '1424': 100.0, '1472': 100.0, '1999': 100.0, '4966': 100.0, '9613': 100.0, '9796': 100.0, '11158': 100.0, '12327': 100.0, '12758': 100.0, '14610': 100.0, '15377': 100.0, '18782': 100.0, '19253': 100.0, '21104': 100.0, '21620': 100.0, '22314': 100.0, '23407': 100.0, '23451': 100.0, '24173': 100.0, '25945': 100.0, '26230': 100.0, '27719': 100.0}

These are my SamplingParams:

SamplingParams(n=1, best_of=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.17, temperature=1.31, top_p=0.14, top_k=49, top_a=0.52, min_p=0.0, tfs=1.0, eta_cutoff=10.42, epsilon_cutoff=1.49, typical_p=1.0, mirostat_mode=0, mirostat_tau=5.0, mirostat_eta=0.1, use_beam_search=False, length_penalty=1.0, early_stopping=False, stop=['\n###', '</s>', '<|', '\n#', '\n\n\n'], stop_token_ids=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=1, custom_token_bans=[], logprobs=10, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True)

I'm using TheBloke/MythoMax-L2-Kimiko-v2-13B-GPTQ for this case. I can debug further until I find out why this happens.

ewof commented 5 months ago

I get this with miku and tsukasa; miku's hotfix fixed it for me.

AlpinDale commented 4 months ago

Sorry I forgot to get back to you on this. Yes, AFAIK, JSON doesn't support -inf as a numeric value. If you can open a PR to fix that, it'd be great.

And yes, a -inf value is expected; it happens when the probability of a token is effectively zero. The log of zero is -inf. Your workaround of setting -inf to -1000 should be fine.
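
To illustrate the "log of zero" point, here is a small standalone sketch of a log-softmax over logits where all but one token have been masked out. It assumes the filtered tokens are masked by setting their logits to -inf, which is a common approach but is not confirmed in this thread; log_softmax is a plain-Python helper written for this example, not engine code.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    top = max(logits)
    # exp(-inf) is exactly 0.0, so masked tokens add nothing to the sum.
    log_sum = top + math.log(sum(math.exp(x - top) for x in logits))
    return [x - log_sum for x in logits]


# One surviving token and two masked ones.
logits = [2.0, float("-inf"), float("-inf")]
logprobs = log_softmax(logits)

# The survivor gets probability 1 (logprob 0.0); the masked tokens
# get probability 0, whose log is -inf.
print(logprobs)
```

This matches the pattern in the printed top_logprobs above: one token at 0.0 and every other entry at -inf.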