-
### 🐛 Describe the bug
```python
from functools import lru_cache
from torch.nn.attention.flex_attention import flex_attention, create_block_mask
import torch
torch._dynamo.config.cache_s…
```
-
Hello :)
We used the default Unsloth Colab pipeline to fine-tune a Llama 3.1 8B model and replicated it as a notebook in an Azure environment.
https://colab.research.google.com/drive/1Ys44kVvmeZtnICzWz0xgpRn…
-
Code to reproduce:
```python
import trl
from unsloth import FastLanguageModel
import torch
from tqdm import tqdm
from transformers import AutoTokenizer
from datasets import load_dataset
fr…
```
-
I'm encountering a `KeyError` when trying to train Phi-3 using the unsloth library. The error occurs during the generation step with `model.generate`. Below are the details of the code and the error trace…
-
In the Stable Diffusion Deep Dive notebook, in the code cell immediately following the Transformer diagram, there is the definition of `get_output_embeds`, which includes a call to `text_encoder.text_m…
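A common failure in that notebook is that newer `transformers` releases removed the private causal-mask helper the cell calls; that is an assumption about the error here, but if so, an equivalent additive causal mask can be built by hand. The function name below is hypothetical, chosen to mirror the old helper:

```python
import torch

def build_causal_attention_mask(bsz: int, seq_len: int, dtype: torch.dtype):
    # Additive attention mask: large negative values above the diagonal
    # block attention to future tokens; zeros elsewhere leave it untouched.
    mask = torch.full((bsz, seq_len, seq_len), torch.finfo(dtype).min, dtype=dtype)
    mask.triu_(1)                  # keep the strictly-upper triangle, zero the rest
    return mask.unsqueeze(1)       # shape [bsz, 1, seq_len, seq_len]
```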
-
Hi Vik,
Thanks for all the help! It works perfectly with the `cuda` option. I'm wondering if you have seen this before while using `cpu`.
The model is loaded by:
```
DEVICE = "cpu"
DTYPE = torch.f…
```
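One guess, since the exact dtype is cut off: half-precision (`float16`) kernels are missing for many CPU operators in PyTorch, which typically surfaces as errors like "not implemented for 'Half'". A common workaround is to pick the dtype based on the device, as in this sketch (the cause is an assumption, not confirmed by the truncated trace):

```python
import torch

DEVICE = "cpu"
# float16 is well supported on CUDA but not on many CPU ops;
# fall back to float32 (or bfloat16) when running on CPU.
DTYPE = torch.float16 if DEVICE == "cuda" else torch.float32
```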
-
(This is running on an NVIDIA 4090 GPU, with jax '0.4.31'.)
What I ended up with is something like the example below. Here, the depth-wise convolution wants the input to be transposed from [batch, sequence…
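As an aside, `jax.lax.conv_general_dilated` can often avoid the transpose entirely, because the layout is specified via `dimension_numbers`. The sketch below (illustrative shapes, not the reporter's model) runs a depth-wise 1-D convolution directly on a `[batch, sequence, features]` input by declaring an `NHC` layout and setting `feature_group_count` to the number of features:

```python
import jax
import jax.numpy as jnp

batch, seq, feat, k = 2, 16, 8, 3
x = jnp.ones((batch, seq, feat))   # [batch, sequence, features]
w = jnp.ones((k, 1, feat))         # [kernel_width, in_ch_per_group, out_features]

# feature_group_count == feat makes this depth-wise: each feature channel
# is convolved independently with its own length-k kernel.
y = jax.lax.conv_general_dilated(
    x, w,
    window_strides=(1,),
    padding="SAME",
    feature_group_count=feat,
    dimension_numbers=("NHC", "HIO", "NHC"),
)
```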
-
**LocalAI version:**
OK:
- `local-ai-avx2-Linux-x86_64-1.40.0`
- `local-ai-avx2-Linux-x86_64-2.0.0`
- `local-ai-avx2-Linux-x86_64-2.8.0`
- `local-ai-avx2-Linux-x86_64-2.8.2`
- `local…
-
**Summary**
I'm hitting a NaN loss when I use the TransformerLayer in place of a PyTorch transformer layer I wrote.
**Details**
I'm using the nvcr.io/nvidia/pytorch:24.04-py3 docker cont…
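A generic way to localize this kind of problem, independent of TransformerEngine (so a debugging sketch, not a fix for this specific report), is to attach forward hooks that raise on the first non-finite activation. That pins the NaN to the layer that produced it instead of only seeing it in the loss:

```python
import torch
import torch.nn as nn

def add_nan_hooks(model: nn.Module) -> None:
    """Attach forward hooks that raise on the first non-finite output."""
    def hook(module, inputs, output):
        t = output[0] if isinstance(output, (tuple, list)) else output
        if isinstance(t, torch.Tensor) and not torch.isfinite(t).all():
            raise RuntimeError(
                f"non-finite output from {module.__class__.__name__}"
            )
    for m in model.modules():
        m.register_forward_hook(hook)

layer = nn.Linear(4, 4)
add_nan_hooks(layer)
out = layer(torch.ones(2, 4))  # finite input passes through silently
```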
-
### Question
Hello,
I have trained a LlavaMistralForCausalLM model based on OpenChat (**not the MoE version**), but when I use predict.py I get the following error:
```
File ~/scripts/MoE-LLaVA/…
```