-
Greetings Wenjie,
I was very much impressed by your work "SAITS". I am trying to create an attention-based model on my own as part of my Bachelor's project, and I have a few questions to ask:
I wa…
-
Thanks for your team's contributions!
I've recently been training with xtuner, but ran into a few problems.
1. The meaning of some parameters is not very clear. Is there a documentation file explaining each parameter?
2. I'm running SFT, but after it starts, the step count doesn't match what I calculated by hand. Config:
```
# Copyright (c) OpenMMLab. All rights reserved.
from peft import LoraConfig
f…
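# --- Hedged sketch (not part of the original config, which is truncated
# above): how the effective optimizer step count is usually derived, so it
# can be checked against what the trainer reports. All values below are
# hypothetical placeholders; substitute the ones from your own config.
dataset_size = 52000          # number of training samples (hypothetical)
batch_size = 1                # per-device batch size (hypothetical)
accumulative_counts = 16      # gradient accumulation factor (hypothetical)
world_size = 8                # number of GPUs (hypothetical)
max_epochs = 3                # training epochs (hypothetical)

# One optimizer step consumes batch_size * accumulative_counts * world_size samples.
steps_per_epoch = dataset_size // (batch_size * accumulative_counts * world_size)
total_steps = steps_per_epoch * max_epochs
print(steps_per_epoch, total_steps)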
-
### System Info
- `transformers` version: 4.41.2
- Platform: Linux-6.5.0-27-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.23.4
- Safetensors version: 0.4.…
-
### System Info
OS version: WSL 2, Ubuntu 22.04
model: llama3-8B-Instruct
Hardware: no GPU
There is no GPU, but I installed the nvcc library in WSL using this command: `sudo apt install nvidia…
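A minimal check, assuming PyTorch is installed, to confirm what the setup above implies: installing `nvcc` via apt provides only the compiler toolchain, not a driver or a device, so CUDA-dependent code paths will still be unavailable.

```python
import torch

# nvcc alone does not make a GPU visible; on a GPU-less WSL setup PyTorch
# will still report that no CUDA device is available.
print(torch.cuda.is_available())   # expected: False without a GPU
print(torch.version.cuda)          # CUDA version the wheel was built with, or None
```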
-
In the paper, the ablation study on the attention emb and gen variants is interesting.
Are these all separate models, one trained per attention variant?
Can I select causal attention for both cases when using G…
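For context, a minimal sketch of what selecting causal attention amounts to, independent of the paper's implementation: the only difference from bidirectional attention is an upper-triangular mask on the score matrix.

```python
import torch

def attention(q, k, v, causal: bool = False):
    # q, k, v: [batch, seq_len, dim]; shapes are illustrative.
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    if causal:
        # Each position may attend only to itself and earlier positions.
        t = scores.size(-1)
        mask = torch.triu(torch.ones(t, t, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

x = torch.randn(2, 5, 8)
out = attention(x, x, x, causal=True)   # self-attention with a causal mask
```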
-
Following the paged attention [paper](https://arxiv.org/pdf/2309.06180), add CUDA kernels for the Llama model. CUDA kernels for the Llama architecture have been widely implemented in the open source c…
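To make the mechanism concrete, here is a hedged pure-Python sketch of the indirection the kernels implement (not the PR's actual CUDA code): the KV cache lives in fixed-size physical blocks, and each sequence maps its logical positions to blocks through a per-sequence block table instead of one contiguous buffer.

```python
import torch

BLOCK_SIZE = 16                                    # tokens per block (illustrative)
num_blocks, num_heads, head_dim = 64, 8, 128
key_cache = torch.zeros(num_blocks, BLOCK_SIZE, num_heads, head_dim)

def gather_keys(block_table: list[int], seq_len: int) -> torch.Tensor:
    """Reassemble a sequence's logical keys from its physical blocks."""
    blocks = [key_cache[b] for b in block_table]   # each [BLOCK_SIZE, H, D]
    return torch.cat(blocks, dim=0)[:seq_len]      # [seq_len, H, D]

table = [3, 7]                          # logical blocks 0..1 live in physical blocks 3 and 7
keys = gather_keys(table, seq_len=20)   # -> [20, 8, 128]
```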
-
### Describe the bug
Duplicate assignment to `hidden_states` at `models/attention_processor.py:1142`.
### Reproduction
N/A
### Logs
_No response_
### System Info
N/A
### Who can help?
_No respons…
-
I'm using the official fish-speech 1.1 image.
I did not fine-tune a LoRA. Contents of the code block:
```shell
# Change xxxxxx.ckpt to the model you want to run inference with
# This may take a minute or two
%cd ~/autodl-tmp/workdir/fish-speech
!python -m tools.webui \
--llama-config-name dual_ar_2_codebook_la…
-
Hi, I want to use examples/pytorch/language-modeling/run_clm.py to train my model, but I find that the only way to use flash_attention is to modify the code in run_clm.py like this:
```python
…
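# Hedged sketch (the original snippet above is truncated): in recent
# transformers releases, flash attention can be requested when the model is
# loaded, without further edits to run_clm.py. `model_args` is the parsed
# argument object run_clm.py already defines; the dtype choice below is an
# assumption, since flash attention requires fp16 or bf16.
model = AutoModelForCausalLM.from_pretrained(
    model_args.model_name_or_path,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)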
-
Tried (Ubuntu) to `torch.save` (torch 1.1.0) a model using Linear Attention (0.4.0) and got the following serialization error:
`PicklingError: Can't pickle : attribute lookup on fast_transformers.feature_maps…
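A common workaround, sketched under the assumption that the failure comes from pickling the full module (locally defined callables such as feature maps are not picklable): save only the parameters instead of the whole model object.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)  # stand-in for the fast-transformers model

# Saving the state_dict serializes tensors only, sidestepping unpicklable
# attributes on the module itself.
torch.save(model.state_dict(), "model.pt")

# To restore: rebuild the architecture first, then load the weights.
model.load_state_dict(torch.load("model.pt"))
```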