Open zhujiem opened 1 year ago
Could you please share your evaluation code for monarch? Thank you!
https://gist.github.com/justheuristic/9e4fb81381451a4bc8cbfee0a5100eba
I reused the code from this script. Just change `from pixelfly import PixelflyLinear`
to import `MonarchLinear`, which is the official implementation.
Hi @zhujiem, may I ask if you were able to run the training script? If so, what does your environment look like?
Here are some efficiency testing numbers for a Monarch-based MLP vs. a vanilla nn.Linear-based MLP. I found that Monarch is best suited to MLPs in Transformer architectures, which generally have a large hidden size and batch size. In recommendation-focused models, the MLP is usually small (e.g., 10000x1024x512, where the first number is the feature input dim) and, importantly, a small batch size (say 10) is often used for serving given concurrent online requests. The following testing numbers are provided as a reference for anyone who has similar tasks.
I will post the numbers for pixelfly later.
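For anyone who wants to reproduce this kind of efficiency comparison, here is a minimal sketch of a timing harness. It is not the script used for the numbers above. The helper names `benchmark` and `dense_macs` are hypothetical, and only the layer sizes (10000x1024x512) and batch size (10) come from the comment above.

```python
import statistics
import time


def benchmark(fn, warmup=10, iters=100):
    """Return the median wall-clock latency of fn() in milliseconds."""
    # Warm-up calls are discarded so one-time setup costs do not skew the result.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


def dense_macs(layer_dims, batch_size):
    """Multiply-accumulates for one forward pass of a dense (nn.Linear-style) MLP."""
    return batch_size * sum(a * b for a, b in zip(layer_dims, layer_dims[1:]))


# Sizes from the recommendation example: 10000x1024x512 MLP, batch size 10.
print(dense_macs([10000, 1024, 512], batch_size=10))  # 107642880
```

To time a real model, pass a closure such as `lambda: model(x)` as `fn`, once for the nn.Linear MLP and once for the Monarch one. On a GPU, kernels launch asynchronously, so the closure would also need to call `torch.cuda.synchronize()` before returning for the timings to be meaningful.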