root116688 closed this issue 1 year ago
Hi, thanks for your interest.
For Q1 and Q2, the two are mathematically equal. For Q3, we didn't use groups, so the parameter should always be 1. We implemented it for additional experiments.
Thanks a lot. For Linear and Conv1d (kernel size = 1), the feature values seem to be equal, but are the weight and bias initialization or the backward pass different? Or are they exactly the same?
They may be different, but that makes almost no difference to the final results.
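For anyone who wants to check the forward-pass equivalence by hand, here is a minimal sketch in plain Python (no framework, so it says nothing about initialization or the backward pass). The helper names `linear`, `conv1d_k1`, and `transpose` and the toy numbers are made up for illustration and are not taken from the repository. It assumes the usual layouts: the linear layer sees `[length][in_channels]`, and the convolution sees `[in_channels][length]` with a weight of shape `[out_channels][in_channels][1]`.

```python
def linear(x, weight, bias):
    """x: [length][in_ch], weight: [out_ch][in_ch] -> [length][out_ch]."""
    return [
        [sum(w * v for w, v in zip(row, pos)) + b for row, b in zip(weight, bias)]
        for pos in x
    ]


def conv1d_k1(x, weight, bias):
    """x: [in_ch][length], weight: [out_ch][in_ch][1] -> [out_ch][length]."""
    length = len(x[0])
    return [
        [sum(k[0] * ch[t] for k, ch in zip(kernels, x)) + b for t in range(length)]
        for kernels, b in zip(weight, bias)
    ]


def transpose(m):
    """Swap the two axes of a nested list."""
    return [list(col) for col in zip(*m)]


# Toy parameters (hypothetical values): 3 input channels, 2 output channels.
weight = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
bias = [0.5, -1.0]

# A sequence of 4 positions, each with 3 features.
x_seq = [[1.0, 0.0, 2.0], [3.0, 1.0, 0.0], [0.0, 2.0, 1.0], [1.0, 1.0, 1.0]]

y_linear = linear(x_seq, weight, bias)

# Same weights, with a trailing kernel axis of size 1 added for the conv.
conv_weight = [[[w] for w in row] for row in weight]
y_conv = conv1d_k1(transpose(x_seq), conv_weight, bias)

# After transposing back to [length][out_ch], the outputs match.
print(y_linear == transpose(y_conv))
```

With shared weights the two layers apply the same per-position affine map, so only the tensor layout differs.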
Thanks for your reply!
Thanks for sharing the code. I have some questions.
Thanks!