databricks / megablocks

Can we change self.blocking in dmoe.py from 128 to 64? #114

Open seanM29 opened 1 month ago

seanM29 commented 1 month ago

I use megablocks to implement a fine-grained MoE. The ffn_hidden_size is divisible by 64 but not by 128. Can we change self.blocking to 64? Thanks a lot.
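
To make the constraint concrete, here is a minimal sketch of the divisibility check in plain Python. It does not call megablocks. The helper `fits_block_size` and the value 704 are hypothetical and chosen only for illustration; the only detail taken from this thread is that the block size is 128 and the proposed alternative is 64.

```python
# Block size that this issue says is set as self.blocking in dmoe.py.
BLOCKING = 128


def fits_block_size(ffn_hidden_size: int, blocking: int) -> bool:
    """Return True when ffn_hidden_size splits evenly into blocks of size `blocking`."""
    if blocking <= 0:
        raise ValueError("blocking must be positive")
    return ffn_hidden_size % blocking == 0


# Hypothetical fine-grained expert width: 704 = 11 * 64, so it is a
# multiple of 64 but not of 128.
ffn_hidden_size = 704

print(fits_block_size(ffn_hidden_size, 64))        # True
print(fits_block_size(ffn_hidden_size, BLOCKING))  # False
```

Any width that is an odd multiple of 64 behaves the same way, which is the situation described above.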

mvpatel2000 commented 1 month ago

@tgale96 what are the performance implications for block size selection?

For now, to unblock you, I'd recommend forking or overriding the variable... but I'm not so sure here.

tgale96 commented 1 month ago

I recommend using the grouped code path rather than changing the block size. Changing the block size is untested and likely to result in poor performance.