-
Not able to load microsoft/Phi-3-mini-128k-instruct.
Code snippet
```
model_name = "microsoft/Phi-3-mini-128k-instruct"

def loadTransformerLensModel(modelPath):
    tokenizer = AutoTokenizer.fr…
```
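For comparison, a minimal sketch of loading the same checkpoint with plain `transformers` (outside TransformerLens); the `trust_remote_code=True` flag is an assumption, since the 128k variant has shipped custom modeling code:
```
# Baseline sketch: load microsoft/Phi-3-mini-128k-instruct with plain transformers
# to check that the checkpoint itself loads before wrapping it in TransformerLens.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "microsoft/Phi-3-mini-128k-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    trust_remote_code=True,  # assumption: the 128k variant may need its custom modeling code
)
```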
-
**The bug**
The issue seems to be with [this line](https://github.com/guidance-ai/guidance/blob/be2be221ee6ee7cda3606ddd6f93544b947507c6/guidanc…
-
### Describe the issue as clearly as possible:
I defined a JSON schema through Pydantic, along with an accompanying prompt, following the chat format for Phi-3 mini.
I then generate an output from t…
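A rough sketch of the setup being described; the schema fields and prompt text below are illustrative assumptions, since the actual ones are truncated above:
```
# Illustrative only: the real Pydantic schema and prompt are not shown in the report.
from pydantic import BaseModel

class Answer(BaseModel):
    city: str
    population: int

# Phi-3 mini chat format: <|user|> ... <|end|> followed by <|assistant|>.
prompt = (
    "<|user|>\n"
    "Return the city and population mentioned in the text as JSON.<|end|>\n"
    "<|assistant|>\n"
)

# JSON schema handed to the structured-generation step (Pydantic v2).
schema_json = Answer.model_json_schema()
```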
-
Phi-3-mini-128k-instruct has the same number of parameters and same architecture as Phi-3-mini-4k-instruct, unless I am mistaken. Would it be possible for unsloth to support inference for this model a…
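For context, a sketch of how the already-supported 4k variant is loaded with Unsloth today (assuming the current FastLanguageModel API); the 128k variant would presumably follow the same pattern:
```
# Sketch assuming Unsloth's FastLanguageModel API and the existing 4k checkpoint;
# a supported 128k checkpoint would presumably load the same way.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Phi-3-mini-4k-instruct",  # already-supported 4k variant
    max_seq_length=4096,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch to inference mode
```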
-
Not the most powerful, but a useful model:
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct
-
Hello!
First of all, I'm impressed by this project and I hope it will pick up some steam in the near future. But I would really like to see additional LLMs/SLMs, as local models get faster and mor…
-
### System Info
GPU: RTX 4090
Running 2.1.0 with Docker like:
`docker run -it --rm --gpus all --ipc=host -p 8080:80 -v /home/jp/.cache/data:/data ghcr.io/huggingface/text-generation-inference:2.1.0 …
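Once the container is up, a minimal sketch for sending a request to it; this assumes the `-p 8080:80` mapping above, so the server is reachable on `localhost:8080` via TGI's `/generate` endpoint:
```
# Minimal repro sketch: query the TGI container started above.
# Assumes the -p 8080:80 mapping, i.e. the server listens on localhost:8080.
import requests

resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "<|user|>\nWhat is the capital of France?<|end|>\n<|assistant|>\n",
        "parameters": {"max_new_tokens": 64},
    },
    timeout=120,
)
print(resp.json())
```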
-
### Your current environment
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.1 LTS (x86_64)
GCC version: (U…
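Assuming this is a vLLM environment report (the template suggests so), a sketch of how the model would typically be loaded; the flags below are assumptions, not taken from the report:
```
# Sketch assuming vLLM is the serving engine here.
from vllm import LLM, SamplingParams

llm = LLM(
    model="microsoft/Phi-3-mini-128k-instruct",
    trust_remote_code=True,  # assumption: the 128k variant ships custom modeling code
    max_model_len=8192,      # illustrative cap; the full 128k context needs far more KV-cache memory
)
outputs = llm.generate(
    ["<|user|>\nSummarize vLLM in one sentence.<|end|>\n<|assistant|>\n"],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```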
-
Let's allow developers to register a new LLM in the browser as a web extension, which could then be selected in #8. The model would be in a TFLite [FlatBuffers](https://flatbuffers.dev/) fo…
-
Seems like a good match for WebLLM, as it was practically designed to run in the browser.
From this reddit thread:
https://www.reddit.com/r/LocalLLaMA/comments/1d2o445/comment/l63cvxk/