Closed vchiley closed 1 year ago
Nice fixes, Thanks so much!
This PR addresses the `to(*args)` fn from `FusedExpertsNetwork`. Applying the `.to(*args)` fn to `self.fc1_weight`, `self.fc2_weight`, `self.fc1_bias`, or `self.fc2_bias` is a bug. Also, `p = list(experts.parameters())[0]` will have dtype fp32 if using generic torch autocast, so this PR adds autocast to `x` if autocast is enabled.
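A minimal sketch of the dtype mismatch described above, not the code from this PR. It assumes a plain `torch.nn.Linear` as a hypothetical stand-in for the experts module, and CPU autocast with `torch.bfloat16` so it runs without a GPU. It shows that the first parameter keeps dtype fp32 inside an autocast region while the op output uses the autocast dtype, so casting `x` to `p.dtype` would not match what autocast computes in.

```python
import torch

# Hypothetical stand-in for the experts module (NOT the real FusedExpertsNetwork).
experts = torch.nn.Linear(4, 4)
x = torch.randn(2, 4)

# Assumption: CPU autocast with bfloat16, chosen only so the sketch runs anywhere.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    # Autocast casts inputs per op; it does not change how parameters are stored.
    p = list(experts.parameters())[0]
    param_dtype = p.dtype

    # The linear op itself runs in the autocast dtype.
    out = experts(x)
    out_dtype = out.dtype

print("parameter dtype:", param_dtype)
print("output dtype under autocast:", out_dtype)
```

Under these assumptions, reading the dtype from the first parameter reports fp32 even though the forward pass produces lower-precision tensors, which is why `x` needs to follow the autocast dtype when autocast is enabled.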