ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Phi provides empty response #4527

Closed jmorganca closed 9 months ago

jmorganca commented 9 months ago

I'm not sure if this is a problem with the model weights, but when running Phi on master, converted from https://huggingface.co/microsoft/phi-2, the model returns an empty response.

Current Behavior

./main -m ../phi-2/fp16.bin -i -ngl 1
...

== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.

Instruct: Why is the sky blue? 
Output: 

Environment and Context

jmorganca commented 9 months ago

Also, just wanted to say a huge thanks for adding Phi in https://github.com/ggerganov/llama.cpp/pull/4490 ! Excited to get a smaller parameter count model into folks hands 😃

ggerganov commented 9 months ago

I can't reproduce the issue:

make -j main && ./main -m models/phi-2/ggml-model-f16.gguf -i -ngl 1

...

Instruct: Why is the sky blue?  
Output:
The sky appears blue because of a phenomenon called Rayleigh scattering. The Earth's atmosphere scatters short-wavelength light (blue and violet) more than long-wavelength light (red and orange). Therefore, most of the sunlight that reaches our eyes from all directions gets scattered in all directions by the molecules and dust particles in the air. However, our eyes are more sensitive to blue light than red light, so we perceive the sky as blue.

llama_print_timings:        load time =     271.83 ms
llama_print_timings:      sample time =      10.16 ms /    93 runs   (    0.11 ms per token,  9157.15 tokens per second)
llama_print_timings: prompt eval time =     270.49 ms /    12 tokens (   22.54 ms per token,    44.36 tokens per second)
llama_print_timings:        eval time =    1235.21 ms /    93 runs   (   13.28 ms per token,    75.29 tokens per second)
llama_print_timings:       total time =   15470.38 ms
jmorganca commented 9 months ago

Thanks for taking a look at this so quickly. I think it's because I was using the wrong prompt template (I was adding a space after the : in Output:). Reminder of how much each character in the prompt matters 😊.

I'll close this for now and re-open if it comes up again. Thanks so much @ggerganov

ggerganov commented 9 months ago

Ah yes, adding a space after Output: makes it terminate immediately. Interesting.
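As a minimal sketch of the workaround discussed above: the fix is simply to build the Phi-2 instruct prompt so that nothing (not even a space) follows `Output:`, since the trailing space changes tokenization and can cause the model to emit an end-of-sequence token immediately. The helper name below is hypothetical, not part of llama.cpp.

```python
def build_phi2_prompt(instruction: str) -> str:
    # Phi-2's instruct template as used in this thread:
    #   Instruct: <question>\nOutput:
    # Note: no space after "Output:" -- a trailing space here was what
    # produced the empty responses reported in this issue.
    return f"Instruct: {instruction.strip()}\nOutput:"

prompt = build_phi2_prompt("Why is the sky blue?")
print(repr(prompt))
```

The key invariant is that the prompt ends exactly with `Output:`; passing this string to `./main -p "$PROMPT"` (or any other runner) avoids the immediate-termination behavior described above.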

jmorganca commented 9 months ago

FWIW I noticed a similar thing with other tools, e.g. https://github.com/ml-explore/mlx, which makes me believe it's probably an "issue" with the model itself:

% python phi2.py --prompt 'Instruct: Why is the sky blue? 
quote> Output: '
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO] Generating with Phi-2...
Instruct: Why is the sky blue?
Output: