shawntan / scattermoe

Triton-based implementation of Sparse Mixture of Experts.
Apache License 2.0

ParallelLinear with bias #8

Closed: CanyonWind closed this issue 7 months ago

CanyonWind commented 7 months ago

Hi, could you please share an example showing whether ParallelLinear can take a bias term in addition to the weights? Thanks.

shawntan commented 7 months ago

We haven't implemented this.

CanyonWind commented 7 months ago

Got it, thanks.
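
Note for readers who need a bias in the meantime: the thread does not give a workaround, so the following is only a minimal, framework-free sketch of what a per-expert bias would compute, not scattermoe code. The function name `expert_linear_with_bias` and all of its parameters are hypothetical, and it assumes one routed expert per token. With a tensor library the same idea is usually a gather such as `bias[expert_idx]` added to the output of the bias-free expert projection.

```python
# Framework-free sketch: per-expert linear layer with a bias term.
# All names here are hypothetical and are NOT part of scattermoe.

def expert_linear_with_bias(x, expert_idx, weights, biases):
    """Compute y[t] = weights[e] @ x[t] + biases[e], where e = expert_idx[t].

    x:          list of T input vectors, each of length d_in
    expert_idx: list of T expert ids (assumes one routed expert per token)
    weights:    list of E matrices, each with d_out rows and d_in columns
    biases:     list of E vectors, each of length d_out
    """
    out = []
    for token, e in zip(x, expert_idx):
        w, b = weights[e], biases[e]
        # Matrix-vector product with this token's expert, then add its bias.
        out.append([
            sum(w_ij * x_j for w_ij, x_j in zip(row, token)) + b_i
            for row, b_i in zip(w, b)
        ])
    return out


# Three tokens routed to two experts.
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
expert_idx = [0, 1, 0]
weights = [
    [[1.0, 0.0], [0.0, 1.0]],  # expert 0: identity
    [[2.0, 0.0], [0.0, 2.0]],  # expert 1: scale by 2
]
biases = [[0.5, -0.5], [10.0, 20.0]]

y = expert_linear_with_bias(x, expert_idx, weights, biases)
print(y)
```

With top-k routing, each token would have k expert ids, and the bias for each selected expert would presumably be added before the outputs are combined with the gate weights; that ordering is an assumption here, not something the thread confirms.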