microsoft / LLMLingua

[EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
https://llmlingua.com/
MIT License

[Bug]: Promptflow tool fails with Run failed: KeyError: 0 #178

Open chris-chatsi opened 2 months ago

chris-chatsi commented 2 months ago

Describe the bug

I installed the custom tool into Azure Promptflow. I am using a llama-7b-text-generation MaaS running on Azure.

When testing my Promptflow, the first problem was that the torch library was not installed in the runtime environment. Once I installed it, I received the error Run failed: KeyError: 0.

I grabbed the following requirements from the example in the Promptflow GitHub repo, but still had no luck.

transformers>=4.26.0
accelerate
torch
tiktoken
nltk
numpy
llmlingua-promptflow

For now I am going to try to use the … Please let me know any other information I can provide.

Steps to reproduce

  1. Start a MaaS model on Azure
  2. Install the tool as a custom tool in your compute instance.
  3. Install the requirements needed to run a simple example.
  4. Run an example with required inputs (see the local sketch after this list).
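
As a cross-check that does not depend on the MaaS endpoint, the same compression can be exercised locally with the open-source llmlingua package. This is a minimal sketch: the context string is a placeholder, and the default model is downloaded from Hugging Face on first use.

import llmlingua
from llmlingua import PromptCompressor

# Local (non-MaaS) compressor using the package's default model.
compressor = PromptCompressor()

result = compressor.compress_prompt(
    context=["Some long context paragraph to compress ..."],  # placeholder text
    rate=0.5,  # keep roughly half of the tokens
)
print(result["compressed_prompt"])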

Expected Behavior

The prompt should be compressed successfully; instead, the run fails with the error and traceback below.

Logs

Run failed: KeyError: 0

Traceback (most recent call last):
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/promptflow/tracing/_trace.py", line 556, in wrapped
    output = func(*args, **kwargs)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/llmlingua_promptflow/tools/llmlingua.py", line 1613, in prompt_compress
    res = llm_lingua.compress_prompt(context=prompt, rate=rate, use_sentence_level_filter=False, use_context_level_filter=False)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/llmlingua_promptflow/tools/llmlingua.py", line 574, in compress_prompt
    context = self.trunk_token_compress(
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/llmlingua_promptflow/tools/llmlingua.py", line 1278, in trunk_token_compress
    compressed_input_ids = np.concatenate([self.api_results[id][0] for id in range(trunk_num)], axis=1)
  File "/azureml-envs/prompt-flow/runtime/lib/python3.9/site-packages/llmlingua_promptflow/tools/llmlingua.py", line 1278, in <listcomp>
    compressed_input_ids = np.concatenate([self.api_results[id][0] for id in range(trunk_num)], axis=1)
KeyError: 0
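
For what it's worth, the failing line indexes self.api_results by chunk id, so if none of the API calls stored a result (e.g. because no logits came back), the dict is empty and the very first lookup raises KeyError: 0. A minimal illustration with made-up values:

import numpy as np

trunk_num = 2     # number of prompt chunks the tool sent to the API
api_results = {}  # stays empty when no call returns logits

# Mirrors the failing list comprehension in the traceback above.
compressed_input_ids = np.concatenate(
    [api_results[i][0] for i in range(trunk_num)], axis=1
)  # raises KeyError: 0, since chunk 0 has no stored result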

Additional Information

No response

SiyunZhao commented 1 month ago

Hi @chris-chatsi, thanks for your feedback. The KeyError occurred because the API did not return logits. Can you test your MaaS model with the following code to check whether you can print the logits of the prompt? You only need to replace the values of your_api_endpoint and your_api_key.

import urllib.request
import json

import numpy as np

your_api_endpoint = ''
your_api_key = ''

def test_custom_connection(api_url, api_key, prompt='hello hello hello world.'):
    # Ask the endpoint to echo the prompt back with per-token logprobs,
    # generating no new tokens.
    data = {
        "prompt": prompt,
        "temperature": 0,
        "max_tokens": 0,
        "echo": True,   # echo the prompt so its token logprobs are returned
        "logprobs": 0,  # logprobs of the chosen tokens, no alternatives
    }
    body = json.dumps(data).encode()
    req = urllib.request.Request(
        api_url,
        body,
        {"Content-Type": "application/json", "Authorization": "Bearer " + api_key},
    )
    res = json.loads(urllib.request.urlopen(req).read())

    # Negated per-token logprobs of the echoed prompt; these are the
    # values the compressor needs from the API.
    logits = -np.array(res['choices'][0]['logprobs']['token_logprobs'])
    print(logits.shape)
    return logits

print(test_custom_connection(your_api_endpoint, your_api_key))
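
If the endpoint supports echoed prompt logprobs, this should print a shape with one entry per prompt token, e.g. (N,), followed by the array of negated logprobs. If the response contains no logprobs field, the res['choices'][0]['logprobs'] lookup will fail instead, which would confirm that the MaaS endpoint is why the tool ends up with KeyError: 0.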