vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Usage]: extractive question answering using VLLM #5126

Closed suryavan11 closed 2 days ago

suryavan11 commented 6 months ago

Your current environment

vllm==0.2.7

How would you like to use vllm

Is extractive question answering possible with vLLM batched inference? Here is an example: https://yonigottesman.github.io/2023/08/10/extractive-generative.html . I have seen the logits_processors field in SamplingParams in vllm 0.2.7, but I am not sure how to set it up so that, for each prompt, generation is restricted to tokens that appear in that prompt's context.
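For reference, a minimal sketch of one way this could be wired up with `logits_processors`. This is not an official vLLM recipe: the model name is a hypothetical choice, and the masking logic (allow only token IDs that occur in the context, plus EOS) is illustrative. In vllm 0.2.7 a logits processor is a callable taking the generated token IDs and the logits tensor and returning modified logits.

```python
# Sketch: constrain generation to tokens from the prompt's context
# via SamplingParams(logits_processors=[...]). Assumes vllm==0.2.7.
import torch
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # hypothetical model choice
tokenizer = AutoTokenizer.from_pretrained(model_id)
llm = LLM(model=model_id)

context = "vLLM is a high-throughput and memory-efficient inference engine."
question = "What kind of engine is vLLM?"
prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"

# Tokens the model may emit: everything tokenized from the context, plus EOS
# so it can stop after copying a span.
allowed_ids = set(tokenizer(context, add_special_tokens=False).input_ids)
allowed_ids.add(tokenizer.eos_token_id)
allowed_idx = torch.tensor(sorted(allowed_ids))

def restrict_to_context(generated_token_ids, logits):
    """Mask every vocabulary entry that does not occur in the context."""
    mask = torch.full_like(logits, float("-inf"))
    mask[allowed_idx.to(logits.device)] = 0.0
    return logits + mask

params = SamplingParams(temperature=0.0, max_tokens=32,
                        logits_processors=[restrict_to_context])
outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```

Since each prompt has a different context, each prompt would need its own processor (closing over that prompt's allowed IDs); depending on the vLLM version, that may mean passing a separate SamplingParams per prompt or issuing the requests individually.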

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

github-actions[bot] commented 2 days ago

This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant. Thank you!