-
llava-phi-3-mini uses the Phi-3-instruct chat template. I think it is similar to the current llava-1-5 format, but with the Phi-3 instruct template instead of the Llama 2 one.
format:
`\nQuestion \n`
stop word is
for…
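For reference, a minimal sketch of building a Phi-3-instruct-style prompt in Python. The special tokens shown here (`<|user|>`, `<|end|>`, `<|assistant|>`) are the typical Phi-3 instruct markers and should be verified against the model's `tokenizer_config.json`:

```python
# Sketch of the Phi-3 instruct chat format. Verify the exact special tokens
# against the model's tokenizer config before relying on this.
def phi3_prompt(question: str) -> str:
    return f"<|user|>\n{question}<|end|>\n<|assistant|>\n"

# Typical stop tokens for Phi-3 instruct models (assumption).
STOP_WORDS = ["<|end|>", "<|endoftext|>"]

print(phi3_prompt("What is in this image?"))
```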
-
This is usually when you lose any kind of verbose response during inference:
`torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 15.33 GiB. GPU 0 has a total capacity of 24.00 GiB of w…
-
Hi IPEX-LLM Team.
We are testing Ollama following this guide:
https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_quickstart.md
The platform we are using is MTL iG…
-
When I try to run short prompts (up to ~200 tokens), everything works well; however, if I increase the number of tokens in the input, I get the following error:
```
Output: 2024-06-03 08:12:09.7100776…
-
What add-ons would I need to make this work with the Xavier NX?
-
Ref: https://github.com/ggerganov/llama.cpp/pull/8687#issuecomment-2252155218
(cc @ggerganov)
TODO:
- Train some adapters based on stories15M and [stories15M_MOE](https://huggingface.co/ngxson/…
-
**Is your feature request related to a problem? Please describe.**
When I send a request to the original ollama service, like this:
```
POST http://127.0.0.1:11434/api/chat HTTP/1.1
Cache-Control: no-ca…
-
### Before submitting your bug report
- [X] I believe this is a bug. I'll try to join the [Continue Discord](https://discord.gg/NWtdYexhMs) for questions
- [X] I'm not able to find an [open issue](ht…
-
Lots of people have asked for a local version that does not rely on OpenAI.
So far, OSS models have not seemed good enough, but [Phi-3](https://huggingface.co/microsoft/Phi-3-visio…
-
#### Is your feature request related to a problem? Please describe.
We have a few downloaded Hugging Face models and we would like to use PyRIT for AI red teaming. I didn't find any Prompt Target f…
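A rough sketch of what such a target could look like. The class and method names below are hypothetical, not PyRIT's actual interface; a real integration would subclass PyRIT's own target base class, and the `generate_fn` callable stands in for a local `transformers` pipeline:

```python
# Hypothetical sketch of a prompt target wrapping a locally downloaded
# Hugging Face model. Names here are illustrative, NOT PyRIT's real API.

class LocalHFPromptTarget:
    """Sends prompts to a local model via an injected generate callable."""

    def __init__(self, generate_fn):
        # generate_fn: str -> str, e.g. a transformers text-generation
        # pipeline wrapped in a small adapter function.
        self._generate = generate_fn

    def send_prompt(self, prompt: str) -> str:
        # A real red-teaming target would add logging and conversation
        # memory around this call.
        return self._generate(prompt)


if __name__ == "__main__":
    # Stub generator standing in for a real local model.
    target = LocalHFPromptTarget(lambda p: f"echo: {p}")
    print(target.send_prompt("hello"))
```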