ggerganov / llama.cpp

LLM inference in C/C++

Final trailing LF stripped off prompt when using --file #1444

Closed · rmc135 closed this 6 months ago

rmc135 commented 1 year ago

Hello! When passing a prompt via --file, a trailing LF at the very end of the file appears to be ignored: generation continues on the same line rather than starting on a new one.

Expected Behavior

Given the following prompt:

A: One banana
B: Two apples
C: Three oranges
D: Four

Hex dump:

00000000  41 3a 20 4f 6e 65 20 62  61 6e 61 6e 61 0a 42 3a  |A: One banana.B:|
00000010  20 54 77 6f 20 61 70 70  6c 65 73 0a 43 3a 20 54  | Two apples.C: T|
00000020  68 72 65 65 20 6f 72 61  6e 67 65 73 0a 44 3a 20  |hree oranges.D: |
00000030  46 6f 75 72 0a                                    |Four.|

If there's a single trailing LF at the end of the file - as there is with every other line - you would expect generation to continue with "E:" on a new line, something like:

 A: One banana
B: Two apples
C: Three oranges
D: Four
E: Five grapes

Current Behavior

Instead, the generation ignores the LF and continues the last line as if it had not been terminated:

 A: One banana
B: Two apples
C: Three oranges
D: Four plums. [end of text]

To get the expected output beginning on a new line, you have to append a second LF (i.e. a blank line) to the end of the prompt file.
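For example, assuming a POSIX shell and a prompt file that already ends with one LF, the extra blank line can be appended with:

printf '\n' >> promptfile.txt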

(Also of note: the very first character of the displayed prompt/output is always a space. This happens regardless of whether the prompt is passed via the command line, via a file, or not at all. I don't know whether this is a model thing or a bug.)

Environment and Context

AMD Ryzen 5 5600G, 128 GB RAM
Ubuntu 22.04.2 LTS (jammy)
GNU Make 4.3
gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04)

Steps to Reproduce

Sample command using LLaMA 7B; the same behavior occurs with LLaMA 30B. promptfile.txt contains the prompt text above.

./main -m ~/llama/models/7B/ggml-model-f16.bin --temp 0.40 --file promptfile.txt
ejones commented 1 year ago

I agree this is surprising. I believe the newline stripping happens here when handling the --file argument. My impression is this was done to simplify chat-style prompts stored in files, where the last line is e.g. "User:", presumably because it's difficult to avoid a trailing newline when saving files with various tools/editors.
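For context, the file is read in whole and a single trailing newline is then popped unconditionally. A paraphrased, self-contained sketch of that logic (the wrapper function name is mine; in llama.cpp the equivalent code lives inline in the argument parser):

#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>

// paraphrased sketch -- not verbatim llama.cpp source
std::string read_prompt_file(const char * path) {
    std::string prompt;
    std::ifstream file(path);
    std::copy(std::istreambuf_iterator<char>(file),
              std::istreambuf_iterator<char>(),
              std::back_inserter(prompt));
    // one trailing newline is dropped if present -- this is what eats
    // the final LF reported in this issue
    if (!prompt.empty() && prompt.back() == '\n') {
        prompt.pop_back();
    }
    return prompt;
}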

The space is prepended here, and my understanding is that this is done to match LLaMA's training. It is then simply left in when the prompt is printed back out.
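Concretely, the prepend amounts to a one-liner along these lines, applied before tokenization (paraphrased; params.prompt is the prompt string assembled from --prompt/--file):

// prepend a single space so the first word is tokenized the way the
// original LLaMA tokenizer saw text during training (paraphrased)
params.prompt.insert(0, 1, ' ');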

github-actions[bot] commented 6 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.