Open Baibaifan opened 4 months ago
This is a minor issue and does not affect correctness. If you want to load weights, you just need to make sure their layout matches the `view` format.
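One possible reading of "the layout is in the view format" can be sketched in plain Python. This is only an illustration under assumptions: the helper names `pack_for_view` and `unpack_view` are hypothetical and not part of any library, nested lists stand in for tensors, and the fused weight is assumed to be a row-major `[hidden_size, num_experts * ffn_hidden_size]` buffer that is later reinterpreted as `[num_experts, hidden_size, ffn_hidden_size]`.

```python
def pack_for_view(experts):
    """Pack per-expert [H][F] matrices into one [H][E*F] row-major buffer
    so that a later row-major view as [E][H][F] returns them unchanged.
    (Hypothetical helper, for illustration only.)"""
    num_experts = len(experts)
    ffn = len(experts[0][0])
    hidden = len(experts[0])
    # Flatten expert by expert, row by row: the order a [E, H, F] view expects.
    flat = [x for matrix in experts for row in matrix for x in row]
    width = num_experts * ffn
    return [flat[h * width:(h + 1) * width] for h in range(hidden)]


def unpack_view(weight, num_experts):
    """Mimic a row-major view of [H][E*F] as [E][H][F]."""
    hidden = len(weight)
    ffn = len(weight[0]) // num_experts
    flat = [x for row in weight for x in row]
    return [
        [flat[(e * hidden + h) * ffn:(e * hidden + h + 1) * ffn]
         for h in range(hidden)]
        for e in range(num_experts)
    ]


# Two toy experts, each a 2 x 3 matrix (sizes chosen only for the example).
experts = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]
packed = pack_for_view(experts)
restored = unpack_view(packed, num_experts=2)
```

Under these assumptions, `restored` equals `experts`, whereas simply concatenating the experts column-wise would not survive the same view.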
@ethanhe42 Thank you for your answer. I have made some modifications to the loading scenario. My understanding is that `view` is used for more efficient use of contiguous memory. If `split` or `chunk` is used instead, it returns a list of tensors, [tensor1, tensor2]. Would that cause any CPU scheduling issues?
Marking as stale. No activity in 60 days.
Describe the bug
As shown in the figure above, when calculating `w1` in this part, using `view` mixes up the elements, so `view` is the wrong choice here; `split` or `chunk` should be used for the conversion instead. Because `w2` does not have the `ffn_hidden_size` dimensionality, using `view` there is not a problem.
Test code
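The layout difference described above can be illustrated with a small pure-Python sketch. This is not the reporter's original test code: the sizes, the variable names, and the helper `view_3d` are assumptions made for illustration, and nested lists stand in for contiguous row-major tensors. It assumes `weight1` has shape `[hidden_size, num_experts * ffn_hidden_size]` and `weight2` has shape `[num_experts * ffn_hidden_size, hidden_size]`.

```python
E, H, F = 2, 2, 3  # num_experts, hidden_size, ffn_hidden_size (toy values)


def view_3d(matrix, d0, d1, d2):
    """Mimic a row-major view of a 2-D matrix as a [d0][d1][d2] nested list."""
    flat = [x for row in matrix for x in row]
    return [
        [flat[(i * d1 + j) * d2:(i * d1 + j + 1) * d2] for j in range(d1)]
        for i in range(d0)
    ]


# weight1: [H, E*F]; entry (h, c) is 10*h + c so positions are easy to trace.
weight1 = [[10 * h + c for c in range(E * F)] for h in range(H)]
# weight2: [E*F, H]; entry (r, h) is 10*r + h.
weight2 = [[10 * r + h for h in range(H)] for r in range(E * F)]

# w1: view as [E, H, F] versus chunking the last dimension per expert.
w1_view = view_3d(weight1, E, H, F)
w1_chunk = [[row[e * F:(e + 1) * F] for row in weight1] for e in range(E)]

# w2: view as [E, F, H] versus chunking the first dimension per expert.
w2_view = view_3d(weight2, E, F, H)
w2_chunk = [weight2[e * F:(e + 1) * F] for e in range(E)]

print("w1 view  :", w1_view)
print("w1 chunk :", w1_chunk)
print("w1 same? ", w1_view == w1_chunk)
print("w2 same? ", w2_view == w2_chunk)
```

In this toy setup the two `w1` results differ while the two `w2` results match, which is consistent with the observation that only `w1` is sensitive to the choice between `view` and `chunk`.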
To Reproduce
Expected behavior
Determine whether there is a problem with `GroupedMLP`.
Stack trace/logs
Environment (please complete the following information):
Proposed fix
Additional context