-
I downloaded llama-2-7b and ran the command as they mentioned:
```
torchrun --nproc_per_node 1 example_text_completion.py \
--ckpt_dir llama-2-7b/ \
--tokenizer_path tokenizer.model \
…
```
-
https://github.com/huggingface/distil-whisper
-
![image](https://github.com/facebookresearch/llama/assets/82858160/9320de97-54fd-4bb5-b82f-c08e96c64b87)
When I run `torchrun --nproc_per_node 1 example_chat_completion.py --ckpt_dir Llama-2-7b-c…
-
##Is this code used to do the roll in forward_flashattn?
x = rearrange(qkv, "b s three h d -> b s (three h d)")
x_unpad, indices, cu_q_lens, max_s = unpad_input(x, key_padding_mask)
cu_q_len_tmp = torch.…
-
I believe there is a difference between the relative position implemented here and what is described in the paper. The issue I see is in [theta_shift and rotate_every_two](https://github.com/microsoft/torchs…
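
For readers unfamiliar with the two names, here is a minimal plain-Python sketch of the usual "rotate every two" rotary formulation. It is an illustration under assumptions, not the torchscale code: the functions are list-based stand-ins that only borrow the names `theta_shift` and `rotate_every_two`, and `sin_cos`, `dot`, `score`, `freqs`, `q` and `k` are hypothetical helpers and values made up for the example.

```python
import math

def rotate_every_two(x):
    # [x0, x1, x2, x3, ...] -> [-x1, x0, -x3, x2, ...]
    out = []
    for i in range(0, len(x), 2):
        out.extend([-x[i + 1], x[i]])
    return out

def theta_shift(x, sin, cos):
    # Elementwise x * cos + rotate_every_two(x) * sin, which rotates
    # each consecutive pair (x[2i], x[2i+1]) by that pair's angle.
    rotated = rotate_every_two(x)
    return [xi * c + ri * s for xi, ri, s, c in zip(x, rotated, sin, cos)]

def sin_cos(position, freqs):
    # Hypothetical helper: one angle per pair, repeated for both elements.
    angles = [position * f for f in freqs for _ in range(2)]
    return [math.sin(a) for a in angles], [math.cos(a) for a in angles]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

# Made-up frequencies and vectors, for illustration only.
freqs = [1.0, 0.1]
q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

def score(m, n):
    # Attention-style score between q at position m and k at position n.
    q_rot = theta_shift(q, *sin_cos(m, freqs))
    k_rot = theta_shift(k, *sin_cos(n, freqs))
    return dot(q_rot, k_rot)

# The score depends only on the offset m - n, not on absolute positions.
print(abs(score(3, 1) - score(5, 3)) < 1e-9)
```

In this formulation the score is a function of the relative offset only, which may be a useful baseline when comparing an implementation against a paper's description.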
-
As of 2023-08-13, a quoting error in the upstream CSV causes AIPP to fail:
```
ERROR: Illegal quoting in line 488.
```
The offending line:
```csv
BSZ;"4203.080";20230822;0700;2000;"A - F";…
```
-
Hi, I got the NaN issue (as in #136) after ~6000 iterations, even after reducing the learning rate to `1e-5`. I'm not sure if this is caused by the `float16` dtype I used in my config. Any ideas why this is happening?…
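
As general background on why `float16` is a common suspect, and not as a diagnosis of this particular run, here is a small standard-library sketch of how half precision can yield `inf` and then `nan`. The helper `to_float16` is hypothetical; it round-trips a value through the `struct` module's half-precision format and treats an overflow as infinity, which is an assumption chosen to mimic typical tensor arithmetic.

```python
import math
import struct

FLOAT16_MAX = 65504.0  # largest finite IEEE 754 half-precision value

def to_float16(x: float) -> float:
    """Round x to the nearest float16 value; overflow becomes +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        # struct refuses values outside the float16 range, so we map
        # them to infinity here (an assumption of this sketch).
        return math.copysign(math.inf, x)

# A moderately large activation squared already exceeds the range.
a = to_float16(300.0)
squared = to_float16(a * a)   # 90000 > 65504, so this overflows

# Once inf appears, ordinary operations such as inf - inf give nan.
diff = squared - squared

# Very small values flush to zero, which can also break divisions.
tiny = to_float16(1e-8)

print(math.isinf(squared), math.isnan(diff), tiny == 0.0)
```

If overflow of this kind is the cause, common things to try are loss scaling, `bfloat16`, or keeping sensitive reductions in `float32`; which of these applies here cannot be determined from the report alone.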
-
![1698998777982](https://github.com/showlab/UniVTG/assets/36877347/169687c6-8fbb-433d-91e6-636d4231360a)
Thanks for your work. Some training details are not described clearly in the README, so I …
-
### System Info
- `transformers` version: 4.35.2
- Platform: Linux-6.1.58+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.19.4
- Safetensors version: 0.4.1
- Acc…
-
Dear Authors and @yukang2017,
Thanks for the amazing work. I am trying to understand the following:
https://github.com/dvlab-research/LongLoRA/blob/2a33f37543038877c70e9a625a61dc72a71621d0/llama_…