Open TIPTOEHIGHERZ opened 1 month ago
The original paper says that the two embedding layers share weights, but I can't find any implementation of weight sharing in the code.