-
Hi there, I am loading a fine-tuned Llama 2 13B model, and I get this error.
Here's part of the error:
File /usr/local/lib/python3.10/dist-packages/unsloth/models/loader.py:172, in FastLanguag…
-
Thank you for sharing the code.
I'm confused about something; I would appreciate it if you could confirm whether my understanding is correct.
1. Are you using the all heads output for the analysis?
The paper you mentioned 'Roles …
-
@jerryzh168 I think it would be beneficial to be able to load a quantized and compiled model and proceed straight to inference.
However, I am not sure what functions to use to make this happen. …
-
I reinstalled flash-attn (`pip install flash-attn==2.6.1`) in the NGC PyTorch Docker image 24.06.
When I run the training job, I get the following error:
```
Traceback (most recent call last):
File "/data1/nfs15/nfs/bigdata/zha…
-
### 🚀 The feature, motivation and pitch
I am trying to extract hidden states from the final layer of llama3-8b (i.e., the final `(batch_size, seq_length, n_emb)` tensor _before_ computing the logits). Wo…
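A minimal sketch of one way to pull that tensor out with `transformers`, via `output_hidden_states=True`. To keep the example self-contained it builds a tiny random Llama model rather than loading the real 8B checkpoint (which is gated); the config sizes below are arbitrary assumptions for the demo.

```python
import torch
from transformers import LlamaConfig, LlamaForCausalLM

# In practice you would load the real checkpoint instead, e.g.:
#   from transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
# Here we build a tiny random Llama so the snippet runs anywhere.
config = LlamaConfig(hidden_size=32, intermediate_size=64,
                     num_hidden_layers=2, num_attention_heads=4,
                     num_key_value_heads=4, vocab_size=128)
model = LlamaForCausalLM(config)
model.eval()

input_ids = torch.randint(0, config.vocab_size, (2, 7))  # (batch, seq_len)
with torch.no_grad():
    out = model(input_ids=input_ids, output_hidden_states=True)

# out.hidden_states is a tuple: the embedding output plus one entry per
# layer. The last entry is the final-layer output (after the final norm),
# i.e. the (batch_size, seq_length, hidden_size) tensor fed to the lm_head.
final_hidden = out.hidden_states[-1]
print(final_hidden.shape)  # torch.Size([2, 7, 32])
```

The same `out.hidden_states[-1]` indexing applies unchanged when the model is the real llama3-8b; only the hidden size differs.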
-
Hi!
I have found your work very interesting and inspiring ever since the first VAR release. However, it would be nice for such a project to implement the widely used image-conditional generation in the m…
-
Install `diffusers` first.
And then do:
```python
from diffusers import DiffusionPipeline
from optimum.quanto import quantize, freeze, qint4
import torch
ckpt_id = "ptx0/pixart-900m-1024…
-
Building upon the deliverables outlined in [issue #19](https://github.com/ibis-project/ibisml/issues/19), the objective is to enhance the coverage of ibisml machine learning preprocessing transformati…
-
An error occurred when I tried to install Transformer Engine following the official tutorial (https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/installation.html). I have tried some …
-
**Describe the bug**
I tried to quantize Qwen1.5-MoE-A2.7B-Chat with w4a16 for the vLLM PR: https://github.com/vllm-project/vllm/pull/7766
It raises the error `TypeError: forward() got multiple values for argume…`