Proposed list of experiments to run:
https://docs.google.com/spreadsheets/d/1xOIYoExQP_haA80ArY09fAk49fNVTO8eJrYysw71FQs/edit?usp=sharing
This list was created using this notebook: https://github.com/bigcode-project/Megatron-LM/blob/raymond-notebooks/notebooks/transformer_parameter_count.ipynb (a parameter-count sketch follows the list below).
Still open questions:
[ ] Which languages to train on? We could afford to run each experiment on both a single-language and a multi-language dataset, doubling the compute.
[ ] Which evaluations? HumanEval, MBPP, repo-level eval? Some downstream tasks with finetuning? https://github.com/bigcode-project/bigcode-evaluation-harness/tree/main/finetuning/ (see the pass@k sketch further down).
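For context on how the experiment sizes were derived, here is a minimal sketch of the standard decoder-only parameter count; the function name, the default sequence length, and the example configuration are illustrative assumptions, not taken from the notebook above.

```python
def transformer_param_count(n_layers: int, d_model: int, vocab_size: int,
                            seq_len: int = 2048) -> int:
    """Approximate parameter count of a decoder-only transformer.

    Standard estimate: token + position embeddings, plus per-layer
    attention (4 * d_model^2) and MLP (8 * d_model^2, assuming a 4x
    hidden expansion). Biases and layer norms are omitted as negligible.
    """
    embeddings = vocab_size * d_model + seq_len * d_model
    per_layer = 12 * d_model ** 2  # 4*d^2 attention + 8*d^2 MLP
    return embeddings + n_layers * per_layer


# Hypothetical example configuration: ~1.3B parameters
print(transformer_param_count(n_layers=24, d_model=2048, vocab_size=49152))
```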
Some facts to decide which dataset to train on:
After additional near-dedup, we have:
Some of the evaluations that may be relevant:
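HumanEval and MBPP scores are typically reported as pass@k. For reference, a self-contained sketch of the unbiased pass@k estimator from the Codex paper (Chen et al., 2021); the sample counts in the usage example are made up for illustration.

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: total samples generated per problem
    c: number of samples that pass the unit tests
    k: sampling budget being evaluated
    """
    if n - c < k:
        return 1.0  # a correct sample is guaranteed in any k-subset
    # 1 - C(n-c, k) / C(n, k), computed stably as a running product
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))


# Illustrative numbers: 200 samples per problem, 12 of them correct
print(pass_at_k(200, 12, 1), pass_at_k(200, 12, 10))
```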