xrsrke / pipegoose

Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
MIT License

Mod map #63

Open yugen-ok opened 8 months ago

yugen-ok commented 8 months ago

Updated the ParallelMapping class in pipegoose/nn/parallel_mapping.py to check module properties, in accordance with https://github.com/xrsrke/pipegoose/issues/40.

This is designed to work with the bigscience/bloom-560m model. Adapting it to other models should be a minor change: only the condition that node_target must satisfy needs to change. The check is based on node_target because it is a string that identifies a submodule or layer.

We iterate over modules using named_modules(), so I have added a function that finds the node_target for a given module so that we can check its properties. I have also added a function that finds the module for a given node_target, in case the reverse conversion is needed.
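
For reviewers, here is a minimal sketch of how the two conversions and a name-based check could look. It is not the code in this PR: the helper names find_node_target, find_module and satisfies_rule, the toy model, and the suffix rule are all assumptions made for illustration. Only named_modules() and get_submodule() are real torch.nn.Module methods.

```python
from typing import Optional, Tuple

import torch.nn as nn


def find_node_target(model: nn.Module, module: nn.Module) -> Optional[str]:
    """Hypothetical helper: return the dotted name of `module` in `model`."""
    for name, candidate in model.named_modules():
        if candidate is module:  # compare by identity, not by value
            return name
    return None  # the module is not part of this model


def find_module(model: nn.Module, node_target: str) -> nn.Module:
    """Hypothetical helper: return the submodule named by `node_target`."""
    return model.get_submodule(node_target)


def satisfies_rule(node_target: str, suffixes: Tuple[str, ...]) -> bool:
    """Assumed check: the dotted name ends with one of the given suffixes."""
    return node_target.endswith(suffixes)


# Toy stand-in for a transformer block; the layer names are illustrative only.
class ToyMLP(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.dense_h_to_4h = nn.Linear(4, 16)
        self.dense_4h_to_h = nn.Linear(16, 4)


class ToyBlock(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.mlp = ToyMLP()


model = ToyBlock()
target = find_node_target(model, model.mlp.dense_h_to_4h)
round_trip = find_module(model, target)
matches = satisfies_rule(target, ("dense_h_to_4h",))
```

Under these assumptions, supporting another model would mean passing a different set of suffixes (or a different predicate) while the two lookup helpers stay unchanged.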