modal-labs / llm-finetuning

Guide for fine-tuning Llama/Mistral/CodeLlama models and more
MIT License

How do you call inference from Python? #77

Closed tonghuikang closed 3 months ago

tonghuikang commented 3 months ago

Is something like

inference_main("axo-2024-08-11-19-48-32-b6b4", "[INST]This is the prompt[/INST]")

enough?

Or do I need to take the roundabout route of making a REST request?

mwaskom commented 3 months ago

See our docs on invoking deployed functions here: https://modal.com/docs/guide/trigger-deployed-functions#function-lookup-and-invocation-basics

PS you may want to reach out for help on our Slack — you'll likely get a faster response as it is monitored much more closely than this issue tracker.