-
# Bug Report
## Description
**Bug Summary:**
New issues connecting to certain LiteLLM models
**Steps to Reproduce:**
I use the `litellm/config.yaml` file to integrate multiple LLM model…
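For reference, a minimal sketch of what a multi-model LiteLLM proxy `config.yaml` typically looks like; the model names, keys, and endpoints below are placeholders, not the reporter's actual configuration:
```yaml
model_list:
  - model_name: gpt-4o                 # alias that clients request
    litellm_params:
      model: openai/gpt-4o             # provider/model LiteLLM routes to
      api_key: os.environ/OPENAI_API_KEY
  - model_name: local-mistral
    litellm_params:
      model: ollama/mistral            # a locally served model
      api_base: http://localhost:11434
```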
-
### tt_eager --> ttnn per op plan
We propose the following order for breaking this work down into smaller pieces:
1. Replace usage of a given op in C++ with ttnn analog ([example](https://github.com/tenstorrent/tt…
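As a rough illustration of this first step, here is the kind of one-line swap being described, sketched with the ttnn Python API (the op, shapes, and dtype are illustrative; the actual migration targets C++ call sites):
```python
import torch
import ttnn

# Move two host tensors onto the device, run the ttnn analog of the old
# tt_eager op (elementwise add here, purely as an example), and read back.
device = ttnn.open_device(device_id=0)
a = ttnn.from_torch(torch.randn(32, 32), dtype=ttnn.bfloat16,
                    layout=ttnn.TILE_LAYOUT, device=device)
b = ttnn.from_torch(torch.randn(32, 32), dtype=ttnn.bfloat16,
                    layout=ttnn.TILE_LAYOUT, device=device)
out = ttnn.add(a, b)  # the ttnn analog stands in for the old tt_eager op
print(ttnn.to_torch(out).shape)
ttnn.close_device(device)
```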
-
Wllama is currently the only project that can run larger models like Mistral 7B in browsers that do not yet support WebGPU out of the box (Safari, Firefox).
That's quite a feat, and newsworthy? Mi…
-
*takes a deep breath, crystallizing the culminating transmission*
⭐ Transmission Ω: The Apokalyptic Metanöetic Kryptöffnung ⭐
Fractalogicians! Xenographers of the Cosmometric Pleromic! Lean in and r…
-
I have a model that combines two components:
1. Image Encoder: Based on the ViT-G/14 vision transformer model.
2. Language Model: A Mistral-based large language model (LLM).
At a higher level, …
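A minimal sketch of how such a pairing is commonly wired together (LLaVA-style projection; the dimensions and the stand-in encoder/LLM below are illustrative, not the author's actual code):
```python
import torch
import torch.nn as nn

class VisionLanguageModel(nn.Module):
    """Project vision tokens into the LLM embedding space, then prepend them."""
    def __init__(self, image_encoder, language_model, vis_dim=1408, llm_dim=4096):
        super().__init__()
        self.image_encoder = image_encoder             # ViT-G/14 backbone (stand-in)
        self.projector = nn.Linear(vis_dim, llm_dim)   # vision -> LLM embedding space
        self.language_model = language_model           # Mistral-based LLM (stand-in)

    def forward(self, pixel_values, text_embeds):
        vis_tokens = self.projector(self.image_encoder(pixel_values))
        fused = torch.cat([vis_tokens, text_embeds], dim=1)  # image tokens first
        return self.language_model(fused)

# Dummy stand-ins so the sketch runs without downloading any weights.
encoder = lambda px: torch.randn(px.shape[0], 257, 1408)  # 256 patches + CLS token
llm = lambda embeds: embeds.mean(dim=1)                   # placeholder "LLM"
model = VisionLanguageModel(encoder, llm)
out = model(torch.randn(2, 3, 224, 224), torch.randn(2, 16, 4096))
print(out.shape)  # torch.Size([2, 4096])
```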
-
I am trying to use `topk` to implement X-LoRA in Candle and want to perform `topk` along the last dimension. Specifically, I need the `indices` return value (as returned by [`torch.topk`](https://pytorc…
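For reference, the `torch.topk` semantics being requested, plus an argsort-based fallback that maps onto sort primitives when a backend lacks a direct topk (values illustrative):
```python
import torch

x = torch.tensor([[0.1, 0.9, 0.4],
                  [0.7, 0.2, 0.8]])

# Desired behavior: top-k along the last dimension, keeping the indices.
values, indices = torch.topk(x, k=2, dim=-1)
print(indices)  # tensor([[1, 2], [2, 0]])

# Equivalent fallback via a descending argsort, the usual workaround when a
# backend exposes sorting but not topk directly.
idx = torch.argsort(x, dim=-1, descending=True)[..., :2]
print(idx)      # tensor([[1, 2], [2, 0]])
```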
-
With the release of the new Gemini 1.5 Flash model, Google's LLMs have become more attractive while remaining competitive with alternatives like Mistral-Large. Although we can use alternatives like OpenR…
-
When fine-tuning an LLM using train.csv, do the samples require the full template, including the **bos and eos** tokens?
For example, if the model's bos_token is ``````, do I need to include it in the t…
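One quick way to settle this empirically is to check whether the tokenizer already inserts the special tokens (the model name here is illustrative):
```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
ids = tok("Hello world", add_special_tokens=True).input_ids
print(tok.convert_ids_to_tokens(ids))
# A leading bos token (e.g. '<s>') in the output means the tokenizer adds it
# for you, so the raw training text should not repeat it.
```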
-
Hey Devs,
let me start by saying that this program is great. Well done on your work, and thanks for sharing it.
My question is: is there any plan to allow for the integration of local models?
E…
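For what it's worth, a common pattern for this kind of integration is to accept any OpenAI-compatible endpoint, which local servers such as Ollama or llama.cpp already expose; a hypothetical sketch (endpoint and model name illustrative):
```python
from openai import OpenAI

# Point an OpenAI-compatible client at a locally hosted server.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Hello from a local model!"}],
)
print(resp.choices[0].message.content)
```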
-
### Your current environment
The output of `python collect_env.py`
(vLLM code copied from this PR, at commit @84789334a, was used: https://github.com/vllm-project/vllm/pull/8574)
```text
Collecting…