Closed · dsikka closed this 5 months ago
While doing some concurrent work between deepsparse and lm-eval to test the OpenAI server, I ran into an issue: top_logprobs isn't implemented.
Here is the vLLM create_logprobs function that I tried hacking into your diff, but I got stuck without having the top_logprobs input: https://github.com/vllm-project/vllm/blob/827cbcd37c464452b79956fa4a564199e6c0ab6a/vllm/entrypoints/openai/api_server.py#L208-L240C20
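For context, here is a rough sketch of the kind of payload a create_logprobs-style helper has to assemble (an illustrative adaptation, not the actual vLLM or deepsparse code; `top_candidates` is the per-position top-k input that is currently missing):

```python
from typing import Dict, List, Optional


def build_logprobs_payload(
    tokens: List[str],
    token_logprobs: List[float],
    top_candidates: Optional[List[Dict[str, float]]] = None,
) -> dict:
    """Assemble an OpenAI-style logprobs object for one completion choice.

    top_candidates holds one dict per token position, mapping candidate
    token strings to their logprobs. Without it, top_logprobs stays None,
    which is exactly what lm-eval trips over below.
    """
    text_offset, pos = [], 0
    for tok in tokens:
        text_offset.append(pos)  # character offset of each token in the text
        pos += len(tok)
    return {
        "tokens": tokens,
        "token_logprobs": token_logprobs,
        "text_offset": text_offset,
        "top_logprobs": top_candidates,
    }
```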
Here is the request, response, and traceback I got within lm-eval-harness. Commands:
deepsparse.server --integration openai --task text-generation --model_path hf:mgoin/TinyStories-1M-ds
lm_eval --model local-completions --model_args base_url=http://localhost:5543/v1,model=hf:mgoin/TinyStories-1M-ds,tokenizer_backend=huggingface,tokenizer=mgoin/TinyStories-1M-ds --tasks hellaswag --num_fewshot 0
REQUEST = {'model': 'hf:mgoin/TinyStories-1M-ds', 'prompt': [[41183, 290, 14620, 25, 1374, 284, 910, 23748, 287, 410, 1155, 22678, 13, 13816, 366, 2124, 259, 442, 24247, 78, 366, 355, 257, 2276, 31933, 13, 1002, 345, 691, 2193, 530, 410, 1155, 22678, 31933, 11, 366, 2124, 259, 442, 24247, 78, 366, 561, 1884, 307, 262, 1266, 31933, 284, 3853, 13, 350, 1313, 8652, 366, 2124, 259, 442, 24247, 78, 366, 355, 25, 7813, 474, 322, 262, 1573, 366, 442, 24247, 78, 366, 1724, 366, 23748, 366, 287, 46932, 11, 475, 345, 561, 8365, 779, 340, 3436, 13, 220, 17106, 11, 12581, 6428, 366, 627, 2634, 366, 1724, 366, 23748, 366, 287, 410, 1155, 22678, 11, 475, 345, 743, 635, 779, 340, 355, 366, 627, 2634, 13, 366, 329, 46932, 11636, 11, 340, 318, 16293, 366, 1575, 64, 2356, 64, 442, 24247, 78, 12, 421, 2634, 13]], 'echo': True, 'max_tokens': 0, 'temperature': 0.0, 'logprobs': 10, 'seed': 1234}
2024-01-12:21:52:50,368 INFO [_client.py:1027] HTTP Request: POST http://localhost:5543/v1/completions "HTTP/1.1 200 OK"
RESPONSE = CompletionChoice(finish_reason='stop', index=None, logprobs=Logprobs(text_offset=[], token_logprobs=[-1.3718655109405518, -1.6813075542449951, -0.5835205912590027, -2.3857421875, -1.1850147247314453, -1.4327609539031982, -1.0862425565719604, -1.3265399932861328, -0.01401725597679615, -0.7671912908554077, -1.8408502340316772, -0.02035493403673172, -0.805853545665741, -1.3963234424591064, -0.6728101372718811, -0.28462377190589905, -0.18176312744617462, -1.6125370264053345, -0.8710795640945435, -1.1399714946746826, -0.4823414087295532, -1.6222418546676636, -1.6073075532913208, -0.9253568649291992, -0.28968146443367004, -2.509056568145752, -1.207929015159607, -1.9377094507217407, -0.39531823992729187, -0.5303062200546265, -1.6694589853286743, -0.24369390308856964, -1.382930040359497, -0.42184001207351685, -2.2233667373657227, -0.8875962495803833, -1.6202012300491333, -1.9755644798278809, -2.335638999938965, -1.3909152746200562, -2.493443489074707, -0.8141875267028809, -1.8022541999816895, -1.3976320028305054, -2.528510570526123, -0.4395048916339874, -0.8453584313392639, -0.008289567194879055, -0.012714286334812641, -1.3391444683074951, -0.009850701317191124, -1.1681102514266968, -1.4828898906707764, -1.0213128328323364, -0.8255200982093811, -1.1556187868118286, -2.5673587322235107, -0.07395735383033752, -2.212928295135498, -0.856827974319458, -1.0892446041107178, -0.6976301074028015, -0.41025859117507935, -2.5871167182922363, -0.6402055025100708, -1.857731580734253, -0.3513781726360321, -0.28716686367988586, -0.880348801612854, -2.105755090713501, -0.255537211894989, -2.720140218734741, -0.19674457609653473, -1.0769606828689575, -1.0204941034317017, -0.6762701869010925, -0.1618189811706543, -1.2297614812850952, -1.4207383394241333, -1.2299003601074219, -0.49282366037368774, -1.246532917022705, -1.6099696159362793, -2.2601776123046875, -1.465539813041687, -0.001021907082758844, -2.3649849891662598, -0.3639565706253052, -0.93106609582901, -0.9325934052467346, -1.520829439163208, -0.8304876089096069, -0.01390259712934494, -0.010546673089265823, -1.4068495035171509, -0.8410376310348511, -0.032891422510147095, -0.011724268086254597], tokens=[' You', ' can', "'t", ' take', ' it', ' away', '".', ' ', '\n', '\n', 'Sam', 'my', ' was', ' sad', ',', ' but', ' he', ' knew', ' he', ' had', ' to', ' be', ' careful', '.', ' He', ' said', ' "', 'No', ',', ' I', ' can', "'t", ' do', ' it', '.', ' I', ' will', ' be', ' careful', ' and', ' try', ' to', ' make', ' it', ' better', '."', ' ', '\n', '\n', 'Sam', 'my', ' was', ' very', ' sad', ' and', ' he', ' decided', ' to', ' take', ' a', ' break', '.', ' He', ' put', ' the', ' band', 'age', ' on', ' the', ' floor', ' and', ' put', ' it', ' in', ' his', ' pocket', '.', ' He', ' was', ' so', ' happy', ' and', ' he', ' was', ' able', ' to', ' play', ' with', ' the', ' band', '.', ' ', '\n', '\n', 'The', ' end', '.', '\n'], top_logprobs=None), text=' You can\'t take it away". \n\nSammy was sad, but he knew he had to be careful. He said "No, I can\'t do it. I will be careful and try to make it better." \n\nSammy was very sad and he decided to take a break. He put the bandage on the floor and put it in his pocket. He was so happy and he was able to play with the band. \n\nThe end.\n')
Traceback (most recent call last):
  File "/home/mgoin/venvs/clip-ret/bin/lm_eval", line 8, in <module>
    sys.exit(cli_evaluate())
  File "/home/mgoin/code/lm-evaluation-harness-mgoin/lm_eval/__main__.py", line 231, in cli_evaluate
    results = evaluator.simple_evaluate(
  File "/home/mgoin/code/lm-evaluation-harness-mgoin/lm_eval/utils.py", line 415, in _wrapper
    return fn(*args, **kwargs)
  File "/home/mgoin/code/lm-evaluation-harness-mgoin/lm_eval/evaluator.py", line 150, in simple_evaluate
    results = evaluate(
  File "/home/mgoin/code/lm-evaluation-harness-mgoin/lm_eval/utils.py", line 415, in _wrapper
    return fn(*args, **kwargs)
  File "/home/mgoin/code/lm-evaluation-harness-mgoin/lm_eval/evaluator.py", line 325, in evaluate
    resps = getattr(lm, reqtype)(cloned_reqs)
  File "/home/mgoin/code/lm-evaluation-harness-mgoin/lm_eval/models/openai_completions.py", line 201, in loglikelihood
    return self._loglikelihood_tokens(new_reqs)
  File "/home/mgoin/code/lm-evaluation-harness-mgoin/lm_eval/models/openai_completions.py", line 248, in _loglikelihood_tokens
    answer = get_result(resp, ctxlen)
  File "/home/mgoin/code/lm-evaluation-harness-mgoin/lm_eval/models/openai_completions.py", line 35, in get_result
    top_tokens = response.logprobs.top_logprobs[i]
TypeError: 'NoneType' object is not subscriptable
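The failure mode is visible in the traceback: lm-eval's get_result indexes response.logprobs.top_logprobs[i] at every token position, but the server returns top_logprobs=None. For reference, a minimal sketch of the shape the harness expects (illustrative tokens and values, not from a real run):

```python
# One dict per token position, mapping candidate token strings to logprobs.
# lm-eval takes the argmax of each dict to decide whether the actual token
# was the greedy choice.
top_logprobs = [
    {" You": -1.37, " He": -2.01, " The": -2.34},      # position 0
    {" can": -1.68, " could": -2.10, " will": -2.55},  # position 1
]
tokens = [" You", " can"]
is_greedy = all(
    tok == max(cands, key=cands.get)
    for tok, cands in zip(tokens, top_logprobs)
)
print(is_greedy)  # True for this toy example
```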
This is ready for merge apart from support for the top_logprobs field. Need clarification on what we're actually returning for this field.
@mgoin When you're happy with this and it passes any tests you have, let me know and we can merge this in. Also, if the tests you're running locally are worth adding to the openai_server tests, I can do that as well as part of this PR.
Summary

Adds logprob support through the LogProbs pydantic model, which can now be returned for the completions endpoint.

Testing / Question?

The pipeline output does not include the token_ids, just the final sequence. So this ends up going through the steps of re-tokenizing the output to get the token_ids (as needed by LogProbs), which seems redundant since we already have them from the pipeline, but they are not part of the output --> should we make them part of the output to clean this up, if there are no strong opinions against it? A sketch of the redundant step follows.
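A minimal sketch of that re-tokenization step, assuming a Hugging Face tokenizer (the identifiers here are illustrative, not the PR's actual code):

```python
from transformers import AutoTokenizer

# The pipeline output carries only the generated text, so the server has to
# re-tokenize it to recover the token ids/strings that LogProbs needs, even
# though the pipeline already produced those ids during generation.
tokenizer = AutoTokenizer.from_pretrained("mgoin/TinyStories-1M-ds")
generated_text = "Sammy was sad, but he knew he had to be careful."
token_ids = tokenizer(generated_text, add_special_tokens=False)["input_ids"]
tokens = tokenizer.convert_ids_to_tokens(token_ids)
```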
Local Testing
Client Code:
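A minimal client sketch (assuming the openai Python package, v1+, pointed at the deepsparse server started with the command above):

```python
from openai import OpenAI

# Point the OpenAI client at the local deepsparse server
# (deepsparse.server --integration openai ... on port 5543).
client = OpenAI(base_url="http://localhost:5543/v1", api_key="EMPTY")

response = client.completions.create(
    model="hf:mgoin/TinyStories-1M-ds",
    prompt="Once upon a time there was a girl",
    temperature=0.0,
    max_tokens=16,
    logprobs=10,  # per-token logprobs; top_logprobs is the still-missing piece
)

choice = response.choices[0]
print(choice.text)
print(choice.logprobs.token_logprobs)
print(choice.logprobs.top_logprobs)  # currently None from the server
```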
Output: