Closed ksrinivs64 closed 2 weeks ago
Yes, it supports batched inference
Can you point to how? The examples seem to show a single prompt and there seems to be a call to reset for each prompt. Thanks again.
To clarify, the reset for each prompt seems to happen inside the per-prompt generation code.
You can run something like the following for batched inference:
from syncode import Syncode

# model_name: a HuggingFace model id chosen earlier.
# num_return_sequences=5 samples 5 grammar-constrained outputs for the prompt in one batch.
syn_llm = Syncode(model=model_name, grammar='json', parse_output_only=True,
                  max_new_tokens=50, num_return_sequences=5, do_sample=True, temperature=0.7)
prompt = "Please return a json object to represent country India with name, capital and population?"
output = syn_llm.infer(prompt)
for i, out in enumerate(output):
    out = out.strip()
    print(f"SynCode output {i+1}:\n{out}\n")
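If you also want to run several distinct prompts, a minimal sketch (not from this thread; it only reuses the syn_llm.infer call shown above, and the second prompt is made up for illustration) would be:

# Reuse the same Syncode instance; each infer call returns the sampled outputs for that prompt.
prompts = [
    "Please return a json object to represent country India with name, capital and population?",
    "Please return a json object to represent country France with name, capital and population?",  # hypothetical second prompt
]
for p in prompts:
    for i, out in enumerate(syn_llm.infer(p)):
        print(f"Prompt: {p}\nSynCode output {i+1}:\n{out.strip()}\n")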
Hi, thanks for a very nice library. Do you support batched inference? Thanks