triton-inference-server / client

Triton Python, C++ and Java client libraries, and gRPC-generated client examples for Go, Java and Scala.
BSD 3-Clause "New" or "Revised" License
527 stars, 225 forks

add --input-file support with new prompt source inference logic #628

Closed. debermudez closed this 2 months ago

debermudez commented 2 months ago

Sample prompt file: [image: Screenshot 2024-05-03 at 16 03 05]

llm_inputs.json when using an input file: [image: Screenshot 2024-05-03 at 16 03 20]

nv-hwoo commented 2 months ago

Great work 👍 This looks like a useful feature for users. One thing I would add is a small test that checks that the prompt from the input file is set correctly in the input JSON.
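A rough sketch of what such a test could look like. The screenshots are not available here, so everything concrete is an assumption: the JSONL prompt file with a `text_input` key, the `data`/`payload` layout of the generated JSON, and the helper names `load_prompts` and `build_llm_inputs` are all hypothetical stand-ins, not the code in this PR.

```python
import json
import tempfile
from pathlib import Path


def load_prompts(path):
    """Hypothetical helper: read one JSON object per line, return the prompts."""
    prompts = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                prompts.append(json.loads(line)["text_input"])
    return prompts


def build_llm_inputs(prompts):
    """Hypothetical helper: wrap each prompt in an assumed payload layout."""
    return {"data": [{"payload": [{"text_input": p}]} for p in prompts]}


def test_prompt_from_input_file_reaches_json():
    # Write a small prompt file, run it through the pipeline, and check
    # that every prompt appears unchanged and in order in the result.
    with tempfile.TemporaryDirectory() as tmp:
        prompt_file = Path(tmp) / "prompts.jsonl"
        prompt_file.write_text(
            '{"text_input": "What is Triton?"}\n'
            '{"text_input": "Summarize gRPC."}\n',
            encoding="utf-8",
        )
        inputs = build_llm_inputs(load_prompts(prompt_file))

    written = [entry["payload"][0]["text_input"] for entry in inputs["data"]]
    assert written == ["What is Triton?", "Summarize gRPC."]
```

In the real test, the two helpers would be replaced by whatever entry point the PR uses to turn `--input-file` into `llm_inputs.json`; the assertion on the prompt text is the part that matters.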