ggerganov / llama.cpp

LLM inference in C/C++

[User] How can instructions be sent without entering interactive mode? #1689

Closed BrLlan closed 1 year ago

BrLlan commented 1 year ago

I would like a script to pass a single instruction and receive an answer.

ghost commented 1 year ago

> I would like a script to pass a single instruction and receive an answer.

I'm trying to understand your question. It sounds like you simply want to prompt a language model with a single instruction.

In this case I'd use -p in ./main to specify the instruction. For example:

./main -m ~/llama.cpp/models/Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin -b 10 -p "What is the sum of 1+1?"
x4080 commented 1 year ago

@JackJollimore is it possible to get the answer from another terminal app, like dotnet? That is, to get the result from llama.cpp in the calling program?

ghost commented 1 year ago

> @JackJollimore is it possible to get the answer from another terminal app, like dotnet? That is, to get the result from llama.cpp in the calling program?

Essentially, you want the results of llama.cpp printed to another program (dotnet). It sounds viable, but I haven't done it myself. It probably involves the llama.cpp API, or possibly a server.

Hopefully, there's someone available that knows for sure and will respond.
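For what it's worth, a rough sketch of the general pattern — untested here, and the model path is illustrative — would be to run ./main non-interactively and capture its stdout. Any host program (a dotnet app included) can do the equivalent by spawning ./main as a child process and reading its standard output:

# Run ./main once, non-interactively, and capture its stdout in a variable.
# 2>/dev/null discards the load/progress logs llama.cpp writes to stderr.
answer=$(./main -m ./models/7B/ggml-model-q4_0.bin \
                -p "What is the sum of 1+1?" 2>/dev/null)
printf '%s\n' "$answer"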

x4080 commented 1 year ago

@JackJollimore thanks - I thought about the server too, but the server isn't designed to open and close llama.cpp per request, so I figured calling llama.cpp directly would let the caller get the result.

BrLlan commented 1 year ago

> > I would like a script to pass a single instruction and receive an answer.
>
> I'm trying to understand your question. It sounds like you simply want to prompt a language model with a single instruction.
>
> In this case I'd use -p in ./main to specify the instruction. For example:
>
> ./main -m ~/llama.cpp/models/Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin -b 10 -p "What is the sum of 1+1?"

Thanks for answering! :) Unless I did something very wrong, any such prompt will not be treated as an instruction; llama.cpp will 'complete the sentence' and just add more words.

Its answer:

What is the sum of 1+1? What is the sum of 2+2? If you are going to ask why, then stop reading now. Otherwise, keep reading as the answers to these questions might not be so obvious. The answer in both cases: 3. Now that you’ve read this far it probably becomes clear what I am on about… My question above wasn’t some cunning riddle or a trick to catch out those who lacked the mathematical prowess to spot the solution. The sum of any two even numbers is equal to, yep you guessed it, 3!

The way I understand it, to access the instruction-tuned behavior I need the -ins flag, which automatically opens interactive mode. I would like to send a single instruction and process the answer further, for now.

ghost commented 1 year ago

> > > I would like a script to pass a single instruction and receive an answer.
>
> > I'm trying to understand your question. It sounds like you simply want to prompt a language model with a single instruction. In this case I'd use -p in ./main to specify the instruction. For example:
> >
> > ./main -m ~/llama.cpp/models/Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin -b 10 -p "What is the sum of 1+1?"
>
> Thanks for answering! :) Unless I did something very wrong, any such prompt will not be treated as an instruction; llama.cpp will 'complete the sentence' and just add more words.
>
> Its answer:
>
> What is the sum of 1+1? What is the sum of 2+2? If you are going to ask why, then stop reading now. Otherwise, keep reading as the answers to these questions might not be so obvious. The answer in both cases: 3. Now that you’ve read this far it probably becomes clear what I am on about… My question above wasn’t some cunning riddle or a trick to catch out those who lacked the mathematical prowess to spot the solution. The sum of any two even numbers is equal to, yep you guessed it, 3!
>
> The way I understand it, to access the instruction-tuned behavior I need the -ins flag, which automatically opens interactive mode. I would like to send a single instruction and process the answer further, for now.

I'm trying to understand, but I'm having difficulty.

It sounds like what you want is a single, succinct response to your instruction. The way a model responds depends on the model, the prompt template, and the words used to instruct it.

With the -p parameter, llama.cpp prompts the language model without entering interactive mode. Include the -ins parameter only if you need to interact with the response.

If I needed a single, succinct response, then I'd prompt an instruction-based model, like WizardLM, by putting the instruction template in the -p parameter. For example:

-p "### Instruction: Provide an accurate, single and succinct response to the sum of 1+1. ### Response:"

You might try adding a word like "Concise", or phrases like "Ensure your response is brief" and "Respond with a single, articulate sentence that completes the request", to your instruction. For example:

-p "### Instruction: Provide a consise, accurate, and succinct response to the sum of 1+1. Respond with a single, articulate sentence that completes the request and Ensure your response is brief. ### Response:"

Please clarify if I'm misinterpreting your message.

ghost commented 1 year ago

> @JackJollimore thanks - I thought about the server too, but the server isn't designed to open and close llama.cpp per request, so I figured calling llama.cpp directly would let the caller get the result.

I'd love to help you if I knew the answer, but I don't, so I suggest checking the discussions, or opening a new issue so that others who are more informed can better help.

BrLlan commented 1 year ago

Sorry for late reply, several things broke.

To clarify: I have a small server that takes POST requests and forwards them as inputs to the command line. When I try to pass an instruction to llama.cpp in this manner it does not work, because instruction mode is automatically also interactive. You can't just invoke main with the instruction once and get a reply; you have to interact with the command line continuously. This is not the behavior that would be useful for me.

So is there a way to enter an instruction, over the command line, without using interactive mode? Just send one instruction, get one reply?

ghost commented 1 year ago

> To clarify: I have a small server that takes POST requests and forwards them as inputs to the command line. When I try to pass an instruction to llama.cpp in this manner it does not work, because instruction mode is automatically also interactive. You can't just invoke main with the instruction once and get a reply; you have to interact with the command line continuously.

By default, llama.cpp's ./main runs without interactive mode.

An instruction, or a prompt to start generation, may be sent to llama.cpp with the -p parameter:

-p PROMPT, --prompt PROMPT                        
prompt to start generation with (default: empty)

This parameter will not trigger interactive mode unless you also include the -i or -ins parameter. Be sure to exclude -i and -ins when invoking ./main.
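For example, a single-shot, non-interactive run could look like this (a sketch — the model path is from the earlier example, and -n merely caps the number of tokens generated):

./main -m ~/llama.cpp/models/Wizard-Vicuna-7B-Uncensored.ggmlv3.q4_0.bin \
  -p "### Instruction: What is the sum of 1+1? ### Response:" \
  -n 64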

BrLlan commented 1 year ago

I simply accepted interactive mode for now; my use case can work with it. I enter a request, get the result back, and just end the interactive mode. Closing the issue, thanks for the answers!

liangzid commented 11 months ago

@BrLlan Hey, I have a similar problem. So for now you just cat the results via a shell script?
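For anyone finding this later, a minimal sketch of that shell-script approach, assuming an illustrative model path: run ./main once without -i or -ins, redirect the generation to a file, then read it back.

# Single-shot run: stdout (the generation) goes to a file,
# stderr (llama.cpp's load/progress logs) is discarded.
./main -m ./models/7B/ggml-model-q4_0.bin \
  -p "### Instruction: What is the sum of 1+1? ### Response:" \
  > answer.txt 2>/dev/null
cat answer.txt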