mead-ml / mead-baseline

Deep-Learning Model Exploration and Development for NLP
Apache License 2.0

refactor transformer layer -> easier understanding #924

Closed dpressel closed 2 years ago

dpressel commented 2 years ago

Right now we try to handle all variants through the same code path. This is error-prone and difficult to read. Compounding this, our pre-LN models to date are non-standard, which makes it even harder to follow what is going on.

This refactoring pulls up an ABC (abstract base class) for both and provides separate implementations for all 3 supported variants: pre-LN, post-LN, and TPT style.
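
To illustrate the shape of such a refactoring, here is a minimal sketch in plain Python. It is not the code from this PR: the class names (`TransformerBlock`, `PreLNBlock`, `PostLNBlock`), the list-based `layer_norm` helper, and the use of plain callables for the attention and feed-forward sub-layers are all assumptions made for illustration. It shows only the standard pre-LN and post-LN orderings; the TPT-style variant is left out because its ordering is not described here.

```python
from abc import ABC, abstractmethod
import math
from typing import Callable, List

Vector = List[float]
SubLayer = Callable[[Vector], Vector]


def layer_norm(x: Vector, eps: float = 1e-6) -> Vector:
    """Normalize a vector to zero mean and unit variance (no learned scale/bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum, used for the residual connection."""
    return [u + v for u, v in zip(a, b)]


class TransformerBlock(ABC):
    """Hypothetical base class: holds the sub-layers, leaves the ordering abstract."""

    def __init__(self, attention: SubLayer, ffn: SubLayer) -> None:
        self.attention = attention
        self.ffn = ffn

    @abstractmethod
    def forward(self, x: Vector) -> Vector:
        """Apply attention and feed-forward sub-layers with residuals and norms."""


class PreLNBlock(TransformerBlock):
    """Standard pre-LN: normalize the sub-layer input, add the residual un-normalized."""

    def forward(self, x: Vector) -> Vector:
        x = add(x, self.attention(layer_norm(x)))
        x = add(x, self.ffn(layer_norm(x)))
        return x


class PostLNBlock(TransformerBlock):
    """Post-LN: add the residual first, then normalize the sum."""

    def forward(self, x: Vector) -> Vector:
        x = layer_norm(add(x, self.attention(x)))
        x = layer_norm(add(x, self.ffn(x)))
        return x
```

With each ordering in its own subclass, every `forward` reads top to bottom without flag checks, which is the readability gain the description is after.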