Open JunrQ opened 3 years ago
As shown in the released log file, the non-local model has a parameter size of 35, which is 25% larger than its baseline (about 28).
Have you compared the two models at a similar parameter size?