kevinthedang opened 3 months ago
Notes:
```shell
curl http://localhost:11434/api/pull -d '{
  "name": "llama3"
}'
```

Whether or not `stream` is used, it will eventually send a final

```json
{
  "status": "success"
}
```
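As a side note, here is a minimal sketch of how the bot could call this endpoint and wait for that final status. It assumes Node 18+ (global `fetch`); `pullModel` is a hypothetical helper name, not something in this repo.

```ts
// Minimal sketch (not from the repo): pull a model through the local Ollama API
// and wait for the final {"status": "success"}. "stream": false asks the endpoint
// for a single JSON object instead of a progress stream.
async function pullModel(name: string, baseUrl = "http://localhost:11434"): Promise<void> {
  const res = await fetch(`${baseUrl}/api/pull`, {
    method: "POST",
    body: JSON.stringify({ name, stream: false }),
  });
  if (!res.ok) throw new Error(`pull failed with HTTP ${res.status}`);

  const result = (await res.json()) as { status?: string };
  if (result.status !== "success") {
    throw new Error(`pull did not finish cleanly: ${JSON.stringify(result)}`);
  }
}
```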
```shell
curl http://localhost:11434/api/tags
```

One response is generated:

```json
{
  "models": [
    {
      "name": "codellama:13b",
      "modified_at": "2023-11-04T14:56:49.277302595-07:00",
      "size": 7365960935,
      "digest": "9f438cb9cd581fc025612d27f7c1a6669ff83a8bb0ed86c94fcf4c5440555697",
      "details": {
        "format": "gguf",
        "family": "llama",
        "families": null,
        "parameter_size": "13B",
        "quantization_level": "Q4_0"
      }
    },
    {
      "name": "llama3:latest",
      "modified_at": "2023-12-07T09:32:18.757212583-08:00",
      "size": 3825819519,
      "digest": "fe938a131f40e6f6d40083c9f0f430a515233eb2edaa6d72eb85c50d64f2300e",
      "details": {
        "format": "gguf",
        "family": "llama",
        "families": null,
        "parameter_size": "7B",
        "quantization_level": "Q4_0"
      }
    }
  ]
}
```
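If it helps, a small sketch of how the bot side could consume this for a "list models" style command (again assuming Node 18+ `fetch`; `listLocalModels` is a hypothetical name):

```ts
// Hypothetical sketch: fetch the local tag list and return just the model names,
// e.g. ["codellama:13b", "llama3:latest"].
interface TagsResponse {
  models: { name: string; size: number; digest: string }[];
}

async function listLocalModels(baseUrl = "http://localhost:11434"): Promise<string[]> {
  const res = await fetch(`${baseUrl}/api/tags`);
  if (!res.ok) throw new Error(`tags request failed with HTTP ${res.status}`);
  const data = (await res.json()) as TagsResponse;
  return data.models.map((m) => m.name);
}
```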
```shell
curl http://localhost:11434/api/show -d '{
  "name": "llama3"
}'
```

One response is given:

```json
{
  "modelfile": "# Modelfile generated by \"ollama show\"\n# To build a new Modelfile based on this one, replace the FROM line with:\n# FROM llava:latest\n\nFROM /Users/matt/.ollama/models/blobs/sha256:200765e1283640ffbd013184bf496e261032fa75b99498a9613be4e94d63ad52\nTEMPLATE \"\"\"{{ .System }}\nUSER: {{ .Prompt }}\nASSISTANT: \"\"\"\nPARAMETER num_ctx 4096\nPARAMETER stop \"\u003c/s\u003e\"\nPARAMETER stop \"USER:\"\nPARAMETER stop \"ASSISTANT:\"",
  "parameters": "num_ctx 4096\nstop \u003c/s\u003e\nstop USER:\nstop ASSISTANT:",
  "template": "{{ .System }}\nUSER: {{ .Prompt }}\nASSISTANT: ",
  "details": {
    "format": "gguf",
    "family": "llama",
    "families": ["llama", "clip"],
    "parameter_size": "7B",
    "quantization_level": "Q4_0"
  }
}
```
Now we just need to look into the Discord.js commands for these as well (if any) @JT2M0L3Y
This will impact the potential use of personally contextualized models like what has been idealized in #22 and/or #45.
How does this impact it that much, though? We can create a command to create the context for an LLM that a user wants to produce. These commands are meant to reduce the user's overhead of entering an Ollama container and manually listing and pulling open-source models.
This also lets users view the open-source models they can create their own LLMs from. Get the idea?
@JT2M0L3Y
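Not committing to an interface here, but a rough sketch of what such a command could look like with discord.js v14 slash commands. The command name, option name, and the `pullModel` helper are all hypothetical, not existing code in this repo.

```ts
import { SlashCommandBuilder, ChatInputCommandInteraction } from "discord.js";

// pullModel is the hypothetical /api/pull helper sketched earlier in this thread.
declare function pullModel(name: string): Promise<void>;

export const data = new SlashCommandBuilder()
  .setName("pull-model")
  .setDescription("Pull an open-source model into the Ollama container")
  .addStringOption((opt) =>
    opt.setName("model").setDescription("Model tag, e.g. llama3").setRequired(true),
  );

export async function execute(interaction: ChatInputCommandInteraction) {
  const model = interaction.options.getString("model", true);
  await interaction.deferReply(); // pulls can take minutes, so defer the reply
  try {
    await pullModel(model);
    await interaction.editReply(`Pulled \`${model}\` successfully.`);
  } catch (err) {
    await interaction.editReply(`Failed to pull \`${model}\`: ${String(err)}`);
  }
}
```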
So, should the list be of models already pulled, or any model available through Ollama (that is, if we're auto-pulling models not already accessible in the container)?
My worry about the context: the current environment would require user-contextualized models to be pushed up to Ollama's model bank, right?
To my knowledge, they do not have to be pushed to the Model Library to be used; I believe you can just create them (a sketch of this follows the list below). A few ideas:
- `MODEL` environment variable as mentioned in #45.
- `Modelfiles` for Ollama to utilize when choosing a case-specific LLM for a prompt. (Essentially a folder of `Modelfile` files to run other LLMs, kinda weird and might be too much.)
- Crazy Idea: Generating a `Modelfile` on the fly for a prompt (will likely introduce too much overhead).
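For the "just create them" route, a hedged sketch of what that could look like against the `/api/create` endpoint. The field names follow the Ollama API docs at the time of writing and may have changed since; `createModel` is a hypothetical name.

```ts
// Hypothetical sketch: create a user-contextualized model locally from a Modelfile
// string, without pushing anything to the public Model Library.
// Field names ("name", "modelfile", "stream") are from the Ollama API docs at the
// time of writing and may differ in newer versions.
async function createModel(
  name: string,
  modelfile: string,
  baseUrl = "http://localhost:11434",
): Promise<void> {
  const res = await fetch(`${baseUrl}/api/create`, {
    method: "POST",
    body: JSON.stringify({ name, modelfile, stream: false }),
  });
  if (!res.ok) throw new Error(`create failed with HTTP ${res.status}`);
}

// e.g. a per-user system prompt baked into a derived model:
// await createModel("llama3-custom", 'FROM llama3\nSYSTEM """You are this user\'s assistant."""');
```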
For reference, listing local models is already possible but "listing all models available with Ollama" has a number of open issues in the Ollama repository itself:
Hmm alright. We'll do what we can for now then.
This feature will likely just stay open until some kind of relevant API feature is implemented to deliver a `.json` response, or some simpler way to read off the available models.
Is there any issue with pulling existing Models from the Library? If not, we can implement that first and leave this open as long as necessary.
@JT2M0L3Y
The most promising solution I can find at the moment is a Kaggle dataset updated in the past month that has 87 different models. But an API endpoint would be preferable.
I think for now, it would be possible to query a local set of models to check that what was requested exists within the environment.
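Along those lines, a tiny guard could reuse the `/api/tags` listing sketched above to verify a requested model before trying to run it (hypothetical helper name):

```ts
// listLocalModels is the hypothetical /api/tags helper sketched earlier.
declare function listLocalModels(): Promise<string[]>;

// Hypothetical guard: check whether a requested model is already present locally.
async function modelExistsLocally(requested: string): Promise<boolean> {
  const names = await listLocalModels();
  // Tags come back as "name:tag"; accept an exact match or a bare base-name match.
  return names.some((n) => n === requested || n.split(":")[0] === requested);
}
```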
As of now, it looks like the Kaggle dataset created to resolve issue #1473 in the ollama repo is expected to be updated daily as new models are added.
As far as progress on an API endpoint for this, it looks like there is plenty of community desire for this feature but not too much progress on the implementation of such a feature.
We may have to wait awhile for this to be solved.
**Issue**

- With the `MODEL` environment variable, we can allow for storage of what models are present in the container.
- The `--rm` case, or we could just not! Either way, defaulting to removing the models when the container dies should be good, as ollama will always pull the latest of a model. So it should be fine as long as it can pull from the "Ollama Model Library".

**Solution**

- Do away with the `MODEL` environment variable and instead just have some kind of way to store what models exist, and remove them upon collapse of the container.
- Currently the `discord` container comes up, then the `ollama` container. Change that to the opposite so the `ollama` container can be ready prior to the bot.

**Other Images**

- The `discord` and `ollama` containers with no model set up.
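For the "remove them upon collapse of the container" idea above, one possible shape is to track what the bot pulled and clean up on shutdown via `/api/delete` (the documented Ollama endpoint for removing a local model). This is only a sketch under those assumptions; names are hypothetical.

```ts
// Hypothetical cleanup sketch: remember which models the bot pulled and delete
// them from the Ollama container when the bot shuts down.
const pulledModels = new Set<string>();

async function removePulledModels(baseUrl = "http://localhost:11434"): Promise<void> {
  for (const name of pulledModels) {
    // /api/delete removes a locally stored model; it can always be re-pulled later.
    await fetch(`${baseUrl}/api/delete`, {
      method: "DELETE",
      body: JSON.stringify({ name }),
    });
  }
}

// e.g. wire it to the bot's shutdown path:
// process.on("SIGTERM", () => void removePulledModels());
```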