Hi,
Why exclude down_proj when executing a let? like model.mlp.down_proj.temp_weight = model.mlp.down_proj.weight.
I think the down_proj can be smoothed with up_proj, like smooth_fc_fc_temporary(model.mlp.up_proj,model.mlp.down_proj, model.down_smooth_scale, model.down_smooth_shift)
Am I right?
Hi, Why exclude down_proj when executing a let? like
model.mlp.down_proj.temp_weight = model.mlp.down_proj.weight
. I think the down_proj can be smoothed with up_proj, likesmooth_fc_fc_temporary(model.mlp.up_proj,model.mlp.down_proj, model.down_smooth_scale, model.down_smooth_shift)
Am I right?