-
The [Gemma model implementation](https://github.com/EricLBuehler/mistral.rs/blob/master/mistralrs-core/src/models/gemma.rs) needs to be modified with:
## Changelist over original Gemma and status:
- [x] S…
-
### Do you need to file an issue?
- [ ] I have searched the existing issues and this bug is not already filed.
- [ ] My model is hosted on OpenAI or Azure. If not, please look at the "model providers…
-
## 🐛 Bug
## To Reproduce
Steps to reproduce the behavior:
1. I converted the model weights and packaged the libraries and weights on macOS
1. My mlc-package-config.json in ./android/MLCChat i…
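For reference, a minimal mlc-package-config.json tends to look something like the sketch below (field names recalled from the MLC LLM packaging docs; the model URL, model ID, and VRAM estimate are placeholders, not values taken from this report):

```json
{
  "device": "android",
  "model_list": [
    {
      "model": "HF://mlc-ai/gemma-2b-it-q4f16_1-MLC",
      "model_id": "gemma-2b-it-q4f16_1",
      "estimated_vram_bytes": 3000000000
    }
  ]
}
```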
-
### Your current environment
```text
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: RED OS release MUROM (7.3.4) Stan…
```

vlsav updated 1 month ago
-
### What happened?
### Goal
I want to use a custom-hosted LLM instead of the one hosted by OpenAI.
Therefore I want to change OPENAI_API_BASE_URL.
### What happened
Using `fabric -…
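Not specific to fabric, but as a sanity check for any custom OPENAI_API_BASE_URL: the chat-completions endpoint is just the base URL plus a fixed path. A small Python sketch (the helper name is hypothetical):

```python
from urllib.parse import urljoin

def chat_completions_url(base_url: str) -> str:
    # Ensure a trailing slash so urljoin appends to the path
    # instead of replacing its last segment
    if not base_url.endswith("/"):
        base_url += "/"
    return urljoin(base_url, "chat/completions")

print(chat_completions_url("http://localhost:8000/v1"))
# → http://localhost:8000/v1/chat/completions
```

If the client ends up calling a different URL than this, the base URL override is likely not being picked up.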
-
**Issue Description**:
The endpoint `http://localhost:8000/v1/chat/completions` is failing to return a proper response when integrated with third-party AI chat frontends such as Chatbox and OpenCat…
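When a frontend rejects a response from an OpenAI-compatible server, it is often because the payload is missing fields the OpenAI chat-completions schema requires. A rough Python shape check (the helper name is hypothetical; the field names follow the OpenAI response format):

```python
def looks_like_chat_completion(payload: dict) -> bool:
    # A chat completion should identify itself and carry at least
    # one choice whose message has a "content" field
    if payload.get("object") != "chat.completion":
        return False
    choices = payload.get("choices")
    if not isinstance(choices, list) or not choices:
        return False
    message = choices[0].get("message", {})
    return "content" in message

sample = {
    "object": "chat.completion",
    "choices": [{"message": {"role": "assistant", "content": "hi"}}],
}
print(looks_like_chat_completion(sample))  # → True
```

Running the actual server response through a check like this can quickly show which field the frontend is choking on.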
-
### Before submitting your bug report
- [ ] I believe this is a bug. I'll try to join the [Continue Discord](https://discord.gg/NWtdYexhMs) for questions
- [X] I'm not able to find an [open issue](ht…
-
### What is the issue?
I followed the document below to run the Ollama model on GPU using Intel IPEX:
https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_quickstart.…
-
**Describe the bug**
ValueError: XFormers does not support attention logits soft capping.
**Full Error log**
```text
{
  "name": "ValueError",
  "message": "XFormers does not support attention lo…
```
-
Gemma 2 can't be downloaded in Alpaca, whereas I can download it directly in Ollama.