ReaLLMASIC / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.
MIT License

Softmax parameter sweep #167

Closed by karthik-sunil 3 months ago

karthik-sunil commented 3 months ago

Adding config files to sweep parameters for SoftMax variations.
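As a rough illustration of what a parameter sweep config expands to, the sketch below turns a grid of options into one run config per combination. The keys, values, and the `expand_sweep` helper are hypothetical and are not taken from the config files in this PR.

```python
import itertools

# Hypothetical sweep grid: keys and values are illustrative only,
# not the parameters used in this PR's config files.
sweep = {
    "softmax_variant": ["softmax", "variant_a"],
    "learning_rate": [1e-3, 6e-4],
}

def expand_sweep(grid):
    """Return one config dict per combination of the grid's values."""
    keys = list(grid)
    combos = itertools.product(*(grid[key] for key in keys))
    return [dict(zip(keys, combo)) for combo in combos]

configs = expand_sweep(sweep)
for config in configs:
    print(config)
```

Each resulting dict would then be passed to a single training run, so the number of runs grows multiplicatively with every added parameter.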

gkielian commented 3 months ago

Looks like I might have removed the `dim` argument in the last softmax pass. I will try to correct it after merging with another PR that fixes the two tests above.
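To show why the normalization dimension matters, here is a minimal pure-Python sketch, assuming `dim` refers to the axis a softmax normalizes over. The helper names are hypothetical, and the code changed in this PR is not shown in the thread, so this only illustrates one way results can differ when the axis is wrong.

```python
import math

def softmax_1d(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_last_dim(rows):
    """Normalize each row independently (analogous to dim=-1)."""
    return [softmax_1d(row) for row in rows]

def softmax_all(rows):
    """Normalize over every element at once, ignoring row boundaries."""
    flat = softmax_1d([v for row in rows for v in row])
    width = len(rows[0])
    return [flat[i:i + width] for i in range(0, len(flat), width)]

scores = [[1.0, 2.0, 3.0], [1.0, 1.0, 1.0]]
per_row = softmax_last_dim(scores)
whole = softmax_all(scores)

# Sums per row: each is 1 when normalizing per row, but not otherwise.
row_sums_per_row = [sum(row) for row in per_row]
row_sums_whole = [sum(row) for row in whole]
print(row_sums_per_row, row_sums_whole)
```

In attention, each row of scores should become its own probability distribution, so a check that every row sums to 1 is a cheap test for this kind of regression.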