EleutherAI / pythia

The hub for EleutherAI's work on interpretability and learning dynamics

Pythia 12b flash config #162

Open jvendrow opened 2 months ago

jvendrow commented 2 months ago

The pythia 12b config has:

"attention-config": [[["flash"], 40]],

However, in the gpt-neox repo the 40 is replaced by 36, and the file:

https://huggingface.co/EleutherAI/neox-ckpt-pythia-12b-v1/blob/main/12B.yml

also uses 36. Is this a mistake? Also, the attention config line at:

https://github.com/EleutherAI/pythia/blob/1ff5ade2cefa71bd95d85491251574b787c00a8a/models/12B/pythia-12b.yml#L19

seems to be missing a comma?

Edit: The num-layers value also seems off: 36 vs. 40.
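
For reference, here is a sketch of what the corrected lines might look like, assuming the 36-layer values from the gpt-neox repo and the Hugging Face checkpoint are the correct ones (this is my guess, not confirmed by the maintainers):

```yaml
# Hypothetical corrected snippet for models/12B/pythia-12b.yml.
# gpt-neox expands [[["flash"], N]] into N layers of flash attention,
# so the repeat count should match num-layers.
"num-layers": 36,
"attention-config": [[["flash"], 36]],
```

(Note the trailing comma on the attention-config line, matching the other entries in the file.)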