danielmiessler / fabric

fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
https://danielmiessler.com/p/fabric-origin-story
MIT License

[Bug]: System prompts are not being passed to local ollama instance #485

Open kristophercrawford opened 1 month ago

kristophercrawford commented 1 month ago

What happened?

I am running ollama locally on my M1 Mac and have the model llama3:8b specified as the default for fabric to use. When running the command pbpaste | fabric -sp label_and_rate after copying the contents of the webpage https://iximiuz.com/en/posts/thick-container-vulnerabilities/ to my clipboard, the response is not JSON-formatted as I would expect.

kris@m1pro:~/fabric (main)$ pbpaste | fabric -sp label_and_rate
This is an article about container security, specifically the importance...

If I instead run pbpaste | ollama run llama3:8b $(cat patterns/label_and_rate/system.md) with the same webpage copied to my clipboard, I get the expected results in JSON format.

kris@m1pro:~/fabric (main)$ pbpaste | ollama run llama3:8b $(cat patterns/label_and_rate/system.md)
Here is the output in JSON format:

{
"one-sentence-summary": "Building container images from scratch can significantly reduce vulnerabilities.",
"labels": "Containers, Security, Docker, Vulnerability, CyberSecurity",

It seems that the system prompt is not being passed to ollama.
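
For anyone trying to narrow this down, here is a minimal diagnostic sketch (not fabric's actual code; the file names are placeholders) that sends the same system prompt through the ollama Python library's chat API. If this returns JSON while fabric does not, the system prompt is most likely being dropped before it reaches Ollama.

# Minimal diagnostic sketch, assuming the `ollama` Python package and a local
# `ollama serve` instance. File names are placeholders, not fabric's real paths.
from pathlib import Path

import ollama

system_prompt = Path("patterns/label_and_rate/system.md").read_text()
article = Path("article.txt").read_text()  # hypothetical dump of the clipboard contents

response = ollama.chat(
    model="llama3:8b",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": article},
    ],
)
print(response["message"]["content"])  # should be JSON if the system prompt is honored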

Version check

Relevant log output

No response

Relevant screenshots (optional)

No response

untoldecay commented 1 month ago

Hey there,

I have the same issue with almost every pattern I've tried (extract_wisdom, summarize, ...) using local llama3. I also tried copying the text from the URL you provided and running pbpaste | fabric --model llama3:latest -sp label_and_rate

And I get a different output (that still doesn't follow the system prompt):

What a fascinating journey through the world of Docker images! You've successfully created three different Python images: one based on Debian 10, another on Python 3.9 slim, and a third on Alpine Linux. Each image has its own unique characteristics and vulnerabilities.

Here are some key takeaways from your experience:

1. **Debian-based image**: The original Python 3.9 image is quite large (114 MB) and contains many potential security flaws (69 vulnerabilities). By using the `FROM python:3.9-slim` instruction, you were able to reduce the size of the image by half while still maintaining most of its functionality.
2. **Python 3.9 slim**: The Python 3.9 slim image is even smaller than the Debian-based one (54 MB) and has fewer vulnerabilities (14 high severity and 8 medium severity). This is a great example of how using a slimmer base image can lead to a more secure and efficient container.
3. **Alpine Linux**: The Alpine Linux-based Python 3.9 image is an excellent choice for those looking for a lightweight and secure option. With zero known vulnerabilities, this image is a great choice for production environments.

You also explored the concept of "distroless" images, which are language-focused Docker images that minimize the operating system's presence. This approach can lead to smaller and more secure containers, as seen in your Python 3.9 distroless image example.

Finally, you touched on the idea of building container images from scratch using a language like Go. This approach eliminates the need for a base image altogether, resulting in an even more lightweight and secure container.

Overall, your experience highlights the importance of considering image size and security when designing and deploying containers. By choosing the right base image or creating your own distroless image, you can ensure that your containers are not only efficient but also secure and reliable.

rawpayload commented 1 month ago

Same here, the system prompt doesn't seem to be passed or taken into account by Llama (Mac M3)

barnesjr commented 1 month ago

Same here too. I've been trying to figure this out too when writing custom patterns, and I just tried it with an out-of-the-box pattern, with the same result as all of you.

AtheeSouhait commented 1 month ago

Hi, this is a lead, but I can't test it right now:

Each model needs a particular message structure. For example, Mistral models expect:

{System}[INST]{User}[/INST]{Assistant}

while Llama 3 expects:

<|start_header_id|>system<|end_header_id|> {System} <|eot_id|><|start_header_id|>user<|end_header_id|> {User} <|eot_id|><|start_header_id|>assistant<|end_header_id|> {Assistant}

For local models, fabric just sends the message like this, without formatting it according to the model's template:

messages = [{"role": "system", "content": context}, user_message]

Maybe try creating a system.md file with your system prompt already in the format required by your model, format your user message the same way, and test. If it works, it might be worth asking for a new feature, e.g. a "--message-format" argument to fabric for providing a model-specific format.
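
To make the idea concrete, here is a rough sketch (not fabric's code) that applies the Llama 3 template above by hand and sends it through Ollama's generate endpoint in raw mode, which is meant to skip Ollama's own templating. The prompts are placeholders, and the behavior of the raw option is an assumption worth verifying.

# Rough sketch of manually applying the Llama 3 template described above,
# assuming the `ollama` Python package and that generate(..., raw=True)
# bypasses Ollama's built-in prompt templating. Not fabric's actual code.
import ollama

system_prompt = "..."  # contents of patterns/label_and_rate/system.md
user_message = "..."   # the pasted article

llama3_prompt = (
    "<|start_header_id|>system<|end_header_id|>\n"
    f"{system_prompt}<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n"
    f"{user_message}<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n"
)

response = ollama.generate(model="llama3:8b", prompt=llama3_prompt, raw=True)
print(response["response"])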

rawpayload commented 1 month ago

Thx, I was also thinking about that. But since Ollama is supposed to provide a kind of abstraction with its Python library (the same way of requesting a model no matter which one is chosen), I would naively think that it's the responsibility of the Ollama lib to format the system message from a simple text string into the appropriate format for the chosen model within Ollama... but I haven't checked the Ollama lib code yet to see how it is implemented.
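
One quick way to check what that abstraction is doing: Ollama keeps each model's chat template in its Modelfile, and you can inspect it from the client. This is only a sketch assuming the `ollama` Python package (field names may vary by client version); if a template is present, the chat endpoint should be able to apply it server-side to role-based messages.

# Sketch: inspect the template Ollama itself holds for the model. Assumes the
# `ollama` Python package; not verified against the library's internals.
import ollama

info = ollama.show("llama3:8b")
print(info["template"])  # the template Ollama applies to system/user/assistant messages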

WillJesperson commented 1 month ago

I've run into the same issue while using my local model. I am, however, able to use a remote Ollama server. I think this gives credence to the idea that fabric is just not passing the prompt on to the local model, since even a remote server would have to get the pattern's file path and contents from the client.

dethos commented 4 weeks ago

I'm facing the same issue on Linux. Regardless of the model, the pattern doesn't seem to be used.

WillJesperson commented 4 weeks ago

If I make a --remoteOllamaServer call on the machine which is actually running the model, then it works. But it refuses to run on the machine natively, regardless of which OS I am using (I have tried WSL and Linux).