SimiaoZuo / MoEBERT

This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022).
Apache License 2.0

Parameters are not shared in experts #6

Open tairan-w opened 2 years ago

tairan-w commented 2 years ago

Hi, from the paper I understood that the most important parameters are shared across the different experts. However, in the code I didn't see anything that keeps these parameters identical during training. In utils.py I see `expert_list[i].fc1.weight.data = fc1_weight_data[idx, :].clone()`, but the tensor created by `clone()` is independent of the original, so updating one expert does not update the others. I also ran experiments to check this: after several training steps, the supposedly shared parameters in the experts are no longer equal. Could you clarify how the sharing is enforced? Thanks.
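For reference, here is a minimal sketch (not from the repo; the names `expert_a`, `expert_b`, and `shared_rows` are illustrative) of why `.clone()` only copies the initial values and does not tie the parameters afterwards, and how sharing a single `nn.Parameter` would keep them identical:

```python
import torch
import torch.nn as nn

# Two "experts" initialized from the same weight rows via .clone().
# After initialization they are independent tensors: a gradient step on
# one expert does not affect the other, so the weights drift apart.
shared_rows = torch.randn(4, 8)

expert_a = nn.Linear(8, 4, bias=False)
expert_b = nn.Linear(8, 4, bias=False)
expert_a.weight.data = shared_rows.clone()
expert_b.weight.data = shared_rows.clone()

x = torch.randn(2, 8)
opt = torch.optim.SGD(expert_a.parameters(), lr=0.1)
opt.zero_grad()
expert_a(x).sum().backward()
opt.step()

# False: only expert_a was updated, so the clones no longer match.
print(torch.equal(expert_a.weight, expert_b.weight))

# By contrast, registering the *same* Parameter object in both modules
# would keep them tied, since both then update one underlying tensor.
tied = nn.Parameter(shared_rows.clone())
expert_a.weight = tied
expert_b.weight = tied
print(expert_a.weight is expert_b.weight)  # True: genuinely shared storage
```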