vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Usage]: Wait for the response for each prediction #7741

Open savi8sant8s opened 3 months ago

savi8sant8s commented 3 months ago

How would you like to use vllm

I would like to wait for each response from vllm before issuing the next request, because I use the previous predictions to complement the next ones. However, I don't know how to do this with vllm.
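For reference, vllm's offline `LLM.generate` call is synchronous: it returns only after the requested completions are finished, so a plain loop already gives "wait for each prediction" behavior. A minimal sketch of the chaining pattern (the `generate_fn` parameter and `chain_predictions` helper are illustrative names, not vllm API; with vllm, `generate_fn` would wrap `llm.generate`):

```python
# Sequential chaining: each call must finish before the next prompt is
# built. `generate_fn` stands in for a blocking call such as
#   lambda p: llm.generate(p, sampling_params)[0].outputs[0].text

def chain_predictions(lines, generate_fn, n_tail=5):
    """Run generate_fn on each line, feeding the last n_tail words of
    the previous prediction into the next prompt."""
    prev_words = ""
    predictions = []
    for line in lines:
        prompt = f"Previous words: {prev_words}\nLine to correct: {line}\nAnswer:"
        text = generate_fn(prompt)  # blocks until this prediction is done
        predictions.append(text)
        prev_words = " ".join(text.split()[-n_tail:])
    return predictions

if __name__ == "__main__":
    # Stub generator for demonstration; it just echoes the input line.
    echo = lambda p: "corrected " + p.splitlines()[1].split(": ", 1)[1]
    print(chain_predictions(["lne one", "lne two"], echo))
```

The same pattern applies to the script below; the only vllm-specific detail is that `generate` returns a list of `RequestOutput` objects, one per prompt.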

Code

import pandas as pd
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="maritaca-ai/sabia-7b",
    enable_lora=True,             # allow per-request LoRA adapters
    max_model_len=256,
    gpu_memory_utilization=0.95,
    enforce_eager=True,           # skip CUDA graph capture to save memory
)

sampling_params = SamplingParams(
    temperature=0.001,
    max_tokens=256
)

df = pd.read_csv("prompts_bluche_test.csv")

prev_5_words = ''
next_5_words = ''
last_filename_prefix = ''

predicted_lines = []

for index, row in df.iterrows():
    filename_prefix = row['filename'][:8]
    next_filename_prefix = df.iloc[index + 1]['filename'][:8] if index < len(df) - 1 else ''
    # Only pass trailing context when the next line belongs to the same document.
    if last_filename_prefix == '' or filename_prefix == next_filename_prefix:
        next_5_words = row['next_5_words']
    else:
        next_5_words = ''

    # Build the prompt with explicit newlines, so the model does not see
    # the stray indentation a triple-quoted f-string would embed.
    # (Instruction, in Portuguese: "Fix the post-OCR errors in the line.")
    prompt = (
        "### Instrução: Corrija os erros pós-OCR presentes na linha.\n"
        f"### Palavras anteriores: {prev_5_words}\n"
        f"### Linha a corrigir: {row['input']}\n"
        f"### Palavras seguintes: {next_5_words}\n"
        "### Resposta:"
    )

    # llm.generate() is synchronous and returns a list of RequestOutput
    # objects (one per prompt), so index into the list before reading .outputs.
    output = llm.generate(
        prompt,
        sampling_params,
        lora_request=LoRARequest("spelling", 1, "results/api_experiment_run/model/model_weights")
    )
    generated_text = output[0].outputs[0].text

    # Carry the tail of this prediction into the next prompt, but only
    # while we stay within the same document.
    if last_filename_prefix == '' or filename_prefix == last_filename_prefix:
        prev_5_words = " ".join(generated_text.split()[-5:])
    else:
        prev_5_words = ''

    last_filename_prefix = filename_prefix
    predicted_lines.append(generated_text)

predictions = pd.DataFrame(predicted_lines, columns=['prediction'])

predictions.to_csv("predictions_sabia_bluche.csv", index=False)
github-actions[bot] commented 1 week ago

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!