bigcode-project / Megatron-LM

Ongoing research training transformer models at scale

Experiment plan #16

Closed by RaymondLi0 1 year ago

RaymondLi0 commented 1 year ago

Proposed list of experiments to run:

https://docs.google.com/spreadsheets/d/1xOIYoExQP_haA80ArY09fAk49fNVTO8eJrYysw71FQs/edit?usp=sharing

This list was created using this notebook: https://github.com/bigcode-project/Megatron-LM/blob/raymond-notebooks/notebooks/transformer_parameter_count.ipynb

Still open questions:

RaymondLi0 commented 1 year ago

Some facts to help decide which dataset to train on:

After additional near-dedup, we have:

Some of the evaluations that may be relevant: