-
Hi. Is it possible to add the 128k-context version of Phi-3 in Ollama? Thanks in advance.
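Until a 128k tag ships in the library, one possible workaround is to build the model locally from a GGUF. A minimal Modelfile sketch (the GGUF filename is hypothetical; `num_ctx` is Ollama's context-window parameter):
```
# Sketch: build a local 128k-context Phi-3 from a GGUF you provide yourself
FROM ./Phi-3-mini-128k-instruct-q4.gguf
# Request the full 128k context window (128 * 1024 tokens)
PARAMETER num_ctx 131072
```
It can then be registered and run with `ollama create phi3-128k -f Modelfile` followed by `ollama run phi3-128k`.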
-
There could be more than one LLM in a web browser (built-in or added as a web extension). Let's show users the list of available LLMs (using their IDs) and allow them to optionally choose a model when…
-
All of the models supported by torchtune currently have rather low context lengths (
-
## 🚀 Feature
[Feature Request] Phi-3 small released -> performs twice as well as Phi-3 mini
https://huggingface.co/microsoft/Phi-3-small-128k-instruct
## Motivation
Phi-3 **small** just…
-
I am interested in running the `mlx-community/Phi-3-mini-128k-instruct-4bit` model with Swift, but it cannot be loaded. Here is the output I am seeing:
```
➜ mlx-swift-examples git:(main) ./mlx-run…
```
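For what it's worth, a quick way to check whether the checkpoint itself is usable is to load it through the Python `mlx_lm` package; if the sketch below succeeds, the failure is specific to the Swift loader (one hypothesis, not confirmed here: the 128k variant's RoPE-scaling config, which the Swift port may not handle):
```python
# Sketch: cross-check the same checkpoint with the Python mlx_lm package
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Phi-3-mini-128k-instruct-4bit")
print(generate(model, tokenizer, prompt="Hello", max_tokens=32))
```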
-
The Python `mlx_lm` implementation generates ~101 tokens per second for `mlx-community/Phi-3-mini-4k-instruct-4bit`, whereas the Swift code here generates ~60 tokens per second.
Here is my py…
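The benchmark script itself is cut off above; as a minimal sketch of an equivalent measurement (not the original code), `mlx_lm` can report the rates directly:
```python
# Sketch (not the original script): mlx_lm prints prompt and generation
# tokens-per-sec itself when verbose=True
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Phi-3-mini-4k-instruct-4bit")
generate(
    model,
    tokenizer,
    prompt="Write a short story about a robot.",
    max_tokens=256,
    verbose=True,  # reports tokens-per-sec for both prompt and generation
)
```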
-
### Your current environment
I'm not able to run `collect_env.py` on this workstation
vllm == 0.5.1
vllm-flash-attn == 2.5.9
torch == 2.3.0
Tested on a single A100-80GB
The following mes…
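For reference, a minimal offline-inference sketch of this kind of setup on vllm 0.5.1 (the exact model and error message are truncated above, so the model id and context cap here are assumptions):
```python
# Sketch of a comparable setup on vllm==0.5.1; the model id and max_model_len
# are assumptions, since the original report is truncated
from vllm import LLM, SamplingParams

llm = LLM(
    model="microsoft/Phi-3-medium-128k-instruct",  # hypothetical choice
    max_model_len=32768,  # cap the context so the KV cache fits one A100-80GB
)
params = SamplingParams(temperature=0.0, max_tokens=128)
outputs = llm.generate(["What is the capital of France?"], params)
print(outputs[0].outputs[0].text)
```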
-
### Is your feature request related to a problem? Please describe.
Could you please support microsoft/Phi-3-medium-128k-instruct? Thank you!
I tried to use `MInference("minference", "microsoft/Phi…
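For context, the usual MInference patching pattern looks like the sketch below (with transformers supplying the base model); the request is for this same flow to work with the medium model:
```python
# Sketch of the MInference patching pattern applied to the requested model;
# medium support is what this issue asks for, so treat this as aspirational
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from minference import MInference

model_name = "microsoft/Phi-3-medium-128k-instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Patch the model's attention with MInference's sparse attention kernels
minference_patch = MInference("minference", model_name)
model = minference_patch(model)
```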
-
Looking at how efficient Phi-3-mini is for its size, one might argue that Phi-3-medium's function-calling performance with your fine-tune could land somewhere between llama-3-8B and llama-3-70B?
-
### Model introduction
This model was created by Microsoft as an efficient tiny model trained on heavily filtered web data and synthetically generated training data.
### Model URL
https://huggingface.co/…
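A minimal way to try the model with `transformers` (a sketch; the model URL above is cut off, so the repo id here is an assumption based on the description):
```python
# Sketch: load and query the model with transformers; the repo id is an
# assumption, since the URL in the request is truncated
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # hypothetical
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Explain what a tokenizer does.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```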