jvendrow opened this issue 2 months ago
The pythia 12b config has:
"attention-config": [[["flash"], 40]],
However, in the gpt-neox repo the 40 is replaced by 36, and in the file
https://huggingface.co/EleutherAI/neox-ckpt-pythia-12b-v1/blob/main/12B.yml
the value is also 36. Is this a mistake? Also, the attention config line at
https://github.com/EleutherAI/pythia/blob/1ff5ade2cefa71bd95d85491251574b787c00a8a/models/12B/pythia-12b.yml#L19
seems to be missing a comma?
Edit: The num-layers value also seems off: 36 vs. 40.
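
For context: in gpt-neox, the second element of each attention-config entry is a repeat count, and the expanded per-layer pattern is expected to cover exactly num-layers layers, so the two values should agree. A minimal sketch of what the corrected lines might look like, assuming the 36 from the gpt-neox repo and the Hugging Face config is the right layer count (my assumption, not a confirmed fix):

```yaml
# Hypothetical corrected snippet for models/12B/pythia-12b.yml,
# assuming 36 layers (matching the gpt-neox and HF configs):
"num-layers": 36,
"attention-config": [[["flash"], 36]],  # comma restored between ["flash"] and the repeat count
```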