-
I think the grouped-query attention (GQA) in CodeLlama-34B is breaking the Flash Attention monkey patch.
When training with the monkey patch, I get errors like:
> File "/fsx/training/llama/flash_attn_monkey_pat…
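For context, GQA means CodeLlama-34B has fewer key/value heads than query heads, so a patched attention forward usually has to expand the KV heads before handing q/k/v to the flash-attention kernel. Below is a minimal sketch of that step, assuming a Hugging Face-style LLaMA layout; the helper name `repeat_kv` and the head counts are assumptions, not the patch's actual code:
```python
import torch

def repeat_kv(x: torch.Tensor, n_rep: int) -> torch.Tensor:
    # Expand (bsz, num_kv_heads, seq_len, head_dim) to
    # (bsz, num_kv_heads * n_rep, seq_len, head_dim) so the key/value
    # heads line up with the query heads under grouped-query attention.
    bsz, num_kv_heads, seq_len, head_dim = x.shape
    if n_rep == 1:
        return x
    x = x[:, :, None, :, :].expand(bsz, num_kv_heads, n_rep, seq_len, head_dim)
    return x.reshape(bsz, num_kv_heads * n_rep, seq_len, head_dim)

# Inside a patched forward, before the flash-attn call (illustrative;
# CodeLlama-34B has many more query heads than KV heads):
# key_states = repeat_kv(key_states, num_heads // num_kv_heads)
# value_states = repeat_kv(value_states, num_heads // num_kv_heads)
```
If the monkey patch was written for the older multi-head checkpoints, it may skip this expansion and then fail on the mismatched head dimensions.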
-
Hi!
Thank you for this great work.
I'm trying to run [SDTurbo](https://huggingface.co/stabilityai/sd-turbo) with diffusers.js.
I've followed the instructions from [this issue](https://github.…
-
Hello, in llama_attn_replace_sft.py, the forward_noflashattn() method adds a line that checks the relationship between q_len and group_size, as follows:
def forward_noflashattn(
self,
hidden_states: torch.Tensor,
attention_mask: Optional[torch.Tensor…
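The check being asked about typically looks something like the sketch below, assuming LongLoRA-style grouped (shifted sparse) attention; `group_size_ratio` and the error message here are assumptions, not necessarily the exact line in llama_attn_replace_sft.py:
```python
def check_group_size(q_len: int, group_size_ratio: float = 0.25) -> int:
    # The sequence is split into groups of size q_len * group_size_ratio;
    # this guards against sequence lengths that do not divide evenly
    # into those groups.
    group_size = int(q_len * group_size_ratio)
    if group_size == 0 or q_len % group_size > 0:
        raise ValueError(
            f"q_len {q_len} must be divisible by group_size {group_size}."
        )
    return group_size

# Example: an 8192-token sequence with ratio 1/4 gives groups of 2048.
assert check_group_size(8192, 0.25) == 2048
```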
-
### Describe the bug
I fine-tuned internlm-7B with my own SFT data, converted the result to HF format, and loaded the model for inference, but the output is garbled.
![image](https://github.com/InternLM/InternLM/assets/44628671/470ab99a-ae95-48df-b706-54d456237955)
### Environment information
```
response, history …
-
In model.py, `tokens` is passed to `self.tok_embeddings = Linear(params.vocab_size, params.dim)` in the `forward()` function. But in generation.py, `tokens` is defined as `tokens = torch.full((bsz, tot…
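A minimal sketch of the flow being asked about, assuming the usual layout: integer token ids are built with `torch.full` and then mapped to `params.dim`-sized vectors by an embedding lookup. The sizes and the `nn.Embedding` stand-in here are illustrative assumptions, not the repo's exact code:
```python
import torch
import torch.nn as nn

vocab_size, dim = 32000, 4096        # illustrative sizes
bsz, total_len, pad_id = 2, 16, 0    # illustrative values

# generation.py-style buffer of token ids, pre-filled with the pad id.
tokens = torch.full((bsz, total_len), pad_id, dtype=torch.long)

# model.py-style embedding table: each integer id is looked up and
# mapped to a dim-sized vector, so forward() receives ids, not vectors.
tok_embeddings = nn.Embedding(vocab_size, dim)
h = tok_embeddings(tokens)           # shape: (bsz, total_len, dim)
print(h.shape)
```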
-
Since, by default, we write the Verbund-ID back after transferring the data from OPUS into the Verbund database, we would like to store it in a standard identifier field.
The values are currently …
-
Hello! I'm getting an error when running the `example_chat_completion.py` script. Any help is much appreciated. Thank you!
```
torchrun --nproc_per_node 1 example_chat_completion.py \
--ckpt_…
-
I am using 4 RTX 3090 Ti cards and have set the batch size very small, but this error occurs every time as soon as the first epoch starts:
Traceback (most recent call last):
File "train.py", line…
-
**Describe the bug**
SD.Next recently switched to an internal LoRA/LyCORIS handler. When using Regional Prompter in *Latent* mode, the following traceback is raised.
```
12:44:57-273233 ERROR …