I'm going to implement this as a flag you pass to `llm replicate add`:

    llm replicate add a16z-infra/llama13b-v2-chat --alias llama2 --chat
The `--chat` option will be recorded in `models.json` and will cause it to use the `User: ...\nAssistant:` prompt format.
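As a rough illustration of what that format means for a single prompt (a hypothetical helper, not the plugin's actual code):

```python
def format_chat_prompt(user_prompt: str) -> str:
    # Hypothetical sketch: wrap the user's prompt in the
    # "User: ...\nAssistant:" structure so the chat-tuned model
    # knows where its own reply should begin.
    return f"User: {user_prompt}\nAssistant:"


# format_chat_prompt("say hi") -> 'User: say hi\nAssistant:'
```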
I'll add a way to set a custom prompt format too, but not for the first release of this.
I can do a variant on this test, using this trick:
    (Pdb) mock_client.run.call_args_list
    [call('replicate/flan-t5-xl:7a216605843d87f5426a10d2cc6940485a232336ed04d655ef86b91e020e9210', input={'prompt': 'say hi'})]
    (Pdb) mock_client.run.call_args_list[0].args, mock_client.run.call_args_list[0].kwargs
    (('replicate/flan-t5-xl:7a216605843d87f5426a10d2cc6940485a232336ed04d655ef86b91e020e9210',), {'input': {'prompt': 'say hi'}})
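The same trick works inside a test assertion - a sketch of how the recorded call could be checked (the mock setup here is simplified, and the assertions assume the same model ID and prompt shown above):

```python
from unittest.mock import MagicMock


def test_run_receives_prompt():
    # Simplified stand-in for the mocked Replicate client used by the tests
    mock_client = MagicMock()
    mock_client.run(
        "replicate/flan-t5-xl:7a216605843d87f5426a10d2cc6940485a232336ed04d655ef86b91e020e9210",
        input={"prompt": "say hi"},
    )

    # call_args_list records every call; .args and .kwargs expose the
    # positional and keyword arguments of each recorded call
    call = mock_client.run.call_args_list[0]
    assert call.args == (
        "replicate/flan-t5-xl:7a216605843d87f5426a10d2cc6940485a232336ed04d655ef86b91e020e9210",
    )
    assert call.kwargs == {"input": {"prompt": "say hi"}}
```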
https://replicate.com/a16z-infra/llama13b-v2-chat is LLaMA v2.
Outputs:
The Replicate docs say you should structure the prompt like this to get proper chat behaviour:
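Roughly, that's the `User: ...\nAssistant:` style mentioned above (the exact wording in the Replicate docs may differ):

```
User: <user message>
Assistant: <model reply>
User: <next user message>
Assistant:
```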
Need a way to do this with `llm-replicate` - similar to how `llm-gpt4all` does it: https://github.com/simonw/llm-gpt4all/blob/01d8ccf0dadeb934fbee9f3d647d4bcd8bb0ad1f/llm_gpt4all.py#L85-L91
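For reference, the general shape of that approach - folding previous conversation turns into a single prompt string - might look something like this sketch (hypothetical code using the `User:`/`Assistant:` format from above, not the actual llm-gpt4all implementation; see the linked lines for the real thing):

```python
def build_chat_prompt(prompt, conversation=None):
    # Sketch: replay earlier turns of the conversation in the
    # User:/Assistant: format, then append the new prompt so the
    # model continues as the assistant.
    parts = []
    if conversation is not None:
        for response in conversation.responses:
            parts.append(f"User: {response.prompt.prompt}")
            parts.append(f"Assistant: {response.text()}")
    parts.append(f"User: {prompt}")
    parts.append("Assistant:")
    return "\n".join(parts)
```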