mosaicml / llm-foundry

LLM training code for Databricks foundation models
https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm
Apache License 2.0

Modularize components of megablocks layer builder #1224

Closed: dakinggg closed this 4 months ago

dakinggg commented 4 months ago

Cleans up the megablocks builder functions by splitting them into more modular sections, and modularizes the device mesh creation for the megablocks args.
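To illustrate the kind of split described above, here is a minimal sketch of pulling device mesh setup out of a larger builder into its own helper. It is not the code from this PR: `MeshSpec`, `build_mesh_spec`, `attach_mesh_spec`, and the dimension names are all hypothetical, and the sketch only computes a mesh layout in plain Python rather than creating a real mesh.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class MeshSpec:
    """Hypothetical description of a 2-D device mesh (not an llm-foundry type)."""
    shape: Tuple[int, int]
    dim_names: Tuple[str, str]


def build_mesh_spec(world_size: int, moe_world_size: int) -> MeshSpec:
    """Compute the mesh layout in one place so each builder does not repeat it."""
    if moe_world_size < 1 or world_size % moe_world_size != 0:
        raise ValueError(
            f'world_size={world_size} must be a positive multiple of '
            f'moe_world_size={moe_world_size}'
        )
    # Assumed layout: outer dim replicates weights, inner dim shards experts.
    return MeshSpec(
        shape=(world_size // moe_world_size, moe_world_size),
        dim_names=('weight_parallel', 'expert_parallel'),
    )


def attach_mesh_spec(layer_kwargs: Dict[str, Any], world_size: int) -> Dict[str, Any]:
    """Return a copy of the layer kwargs with the mesh spec attached.

    The input dict is not mutated, so the helper can be tested in isolation.
    """
    kwargs = dict(layer_kwargs)
    moe_world_size = kwargs.pop('moe_world_size', 1)
    kwargs['mesh_spec'] = build_mesh_spec(world_size, moe_world_size)
    return kwargs


if __name__ == '__main__':
    example = attach_mesh_spec({'moe_world_size': 4, 'moe_num_experts': 8}, world_size=8)
    print(example['mesh_spec'])
```

Keeping the layout computation separate from the builder means it can be unit tested without any distributed setup. In a real run, a spec like this could be handed to something such as PyTorch's `init_device_mesh`, but that wiring is an assumption here and not taken from the PR.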

TODO: Manual test that training still works

vchiley commented 4 months ago

> Manual test that training still works

Don't we have tests for MoEs (MoE testing config here)? (I'm not against doing more testing, just asking)

dakinggg commented 4 months ago

@vchiley there are some simple MoE tests; I just wanted to double-check that e2e training still works.

dakinggg commented 4 months ago

manual test done, merging