-
### What
Let's support RoPE (Rotary Position Embedding) and fuse it.
The image below shows part of the token generation model.
This graph is fused with the RoPE op.
![rope](https://github.com/user-…
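For reference, the arithmetic that a fused RoPE op would replace can be sketched in a few lines. This is a minimal pure-Python sketch, not the op from this project: it assumes the half-split pairing (dimension `i` is paired with `i + d/2`) used by GPT-NeoX-style models and a frequency base of 10000. The names `rope_rotate` and `dot` are illustrative only.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Apply rotary position embedding to one head vector.

    Assumption: half-split pairing, i.e. element i is rotated together
    with element i + d/2 by the angle position * base**(-2i/d).
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("head dimension must be even")
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        theta = position * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        x1, x2 = vec[i], vec[i + half]
        # 2-D rotation of the pair (x1, x2) by theta
        out[i] = x1 * c - x2 * s
        out[i + half] = x2 * c + x1 * s
    return out


def dot(a, b):
    """Plain dot product, used to check the relative-position property."""
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.25]
# The attention score depends only on the position difference (here 2).
score_a = dot(rope_rotate(q, 5), rope_rotate(k, 3))
score_b = dot(rope_rotate(q, 2), rope_rotate(k, 0))
```

In an unfused graph each multiply, add, and slice above is a separate node, which is why a single fused op is attractive for token generation.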
-
**Describe the bug**
I get an error while converting a NeoX model to the HF format.
The error I get is:
`Traceback (most recent call last):
File "/media/sid/WDInternal/stability_ai/gpt-neox/tools/convert_…
-
### Additional context
[gpt-neox](https://github.com/EleutherAI/gpt-neox)
Wouldn't gpt-neox, being fully open source, be a better choice than OpenAI?
As soon as OpenAI moves to a paid model or changes its API, your project…
-
Shared Data Repository page
```
books c4 cc_en_head cc_en_middle cc_en_tail peS2o stack-code wiki-en-simple
books_val c4_val cc_en_head_val cc_en_middle_val cc_en_tail_val …
```
-
![image](https://user-images.githubusercontent.com/5949853/225432184-384aa8f2-0da0-47b8-957c-ecc6e8acef6a.png)
NeoX 20B is larger than most non-professional environments can manage. The reduce meas…
-
Is it possible to use LoRA to fine-tune GPT-NeoX 20B?
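To give a sense of scale for this question, the sketch below counts the trainable parameters LoRA would add. It does not show library support. It only applies the standard LoRA count of `rank * (d_in + d_out)` per adapted matrix. The hidden size (6144), layer count (44), rank (8), and the choice to adapt only the fused QKV projection are all assumptions made for illustration.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one frozen d_out x d_in weight.

    LoRA learns a low-rank update B @ A with A of shape (rank, d_in)
    and B of shape (d_out, rank), so it adds rank * (d_in + d_out) values.
    """
    return rank * (d_in + d_out)


# Assumed model shape and a hypothetical LoRA rank, for illustration only.
hidden = 6144
layers = 44
rank = 8

# Assumption: only the fused QKV projection (hidden -> 3 * hidden) is adapted.
per_layer_full = hidden * 3 * hidden
per_layer_lora = lora_param_count(hidden, 3 * hidden, rank)

total_full = layers * per_layer_full
total_lora = layers * per_layer_lora
fraction = total_lora / total_full

print(f"LoRA trainable parameters: {total_lora:,}")
print(f"Fraction of the adapted weights: {fraction:.6f}")
```

Under these assumptions the adapters are a small fraction of the weights they modify, but the frozen base model still has to fit in memory for the forward and backward passes.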
-
# GPT-NeoX
Official PyTorch: https://github.com/EleutherAI/gpt-neox
Unofficial JAX: https://github.com/kingoflolz/mesh-transformer-jax
-
With the increasing interest in using this library to train models originally trained by others (https://github.com/EleutherAI/gpt-neox/issues/896 https://github.com/EleutherAI/gpt-neox/issues/994 htt…
-
Hi,
GPT, GPT-J, and GPT-NeoX have similar neural network architectures. In my view, their implementations in `src/fastertransformer/models` (`multi_gpu_gpt`, `gptj`, `gptneox`) are also very similar. I am wonde…
-
Hi, I found that the parameter init method of the pythia-6.9B model is inconsistent with the standard deviations measured in the [step0 checkpoint](https://huggingface.co/EleutherAI/pythia-6.9b/tree/step0). Tab…
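A comparison of this kind can be sketched as follows. This is a hedged illustration, not the check performed above: it assumes the two schemes are the ones gpt-neox calls `small_init` (std = sqrt(2 / (5 d))) and `wang_init` (std = 2 / (L sqrt(d))), and it assumes a hidden size of 4096 and 32 layers. The helper names and the sample weights are hypothetical.

```python
import math
import statistics


def small_init_std(dim):
    """Expected std under the assumed small_init scheme: sqrt(2 / (5 * dim))."""
    return math.sqrt(2.0 / (5.0 * dim))


def wang_init_std(dim, n_layers):
    """Expected std under the assumed wang_init scheme: 2 / (n_layers * sqrt(dim))."""
    return 2.0 / (n_layers * math.sqrt(dim))


def empirical_std(values):
    """Population standard deviation of a flat list of weights."""
    return statistics.pstdev(values)


def relative_gap(observed, expected):
    """Relative difference between a measured std and the expected one."""
    return abs(observed - expected) / expected


# Assumed model shape, for illustration only.
hidden, layers = 4096, 32
expected_small = small_init_std(hidden)
expected_wang = wang_init_std(hidden, layers)

# Hypothetical flattened weights; in practice these would come from the checkpoint.
sample_weights = [0.01, -0.01, 0.01, -0.01]
gap = relative_gap(empirical_std(sample_weights), expected_small)
```

A large `gap` for a given tensor would suggest it was initialized with a different scheme or standard deviation than the one assumed.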