Open poteminr opened 6 months ago
Thank you for your work!
Have you tried any span representations other than $CAT[h_i, h_j, D(j-i)]$? I'd be very interested to know :)
I tried to implement something like $CAT[h_i, h_j, MEAN(h_{i+1}, h_{i+2}, ..., h_{j-1})]$, and training became very slow (as expected).
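For reference, here is a minimal sketch of one way the $MEAN$ term could be computed without looping over the tokens of every span, using prefix sums. It is written in NumPy for illustration only; the function name `interior_mean` and the `starts`/`ends` arguments are hypothetical and not taken from this repository, and whether this removes the slowdown depends on how the spans are actually enumerated and batched.

```python
import numpy as np

def interior_mean(h, starts, ends):
    """Mean of h[i+1 .. j-1] for each span (i, j), computed via prefix sums.

    h:      (seq_len, dim) token representations
    starts: (num_spans,) start indices i
    ends:   (num_spans,) end indices j, with i <= j < seq_len
    Spans with no interior tokens get a zero vector (an assumption of this sketch).
    """
    h = np.asarray(h, dtype=np.float64)
    starts = np.asarray(starts)
    ends = np.asarray(ends)

    # prefix[k] = h[0] + ... + h[k-1], with prefix[0] = 0
    zeros = np.zeros((1, h.shape[1]))
    prefix = np.concatenate([zeros, np.cumsum(h, axis=0)], axis=0)

    # Number of interior tokens h_{i+1}, ..., h_{j-1}
    counts = ends - starts - 1

    # Sum of h[i+1 .. j-1] equals prefix[j] - prefix[i+1]
    sums = prefix[ends] - prefix[starts + 1]

    # Avoid division by zero, then zero out spans with an empty interior
    means = sums / np.maximum(counts, 1)[:, None]
    means[counts <= 0] = 0.0
    return means

h = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(interior_mean(h, starts=[0, 1, 0], ends=[3, 2, 2]))
```

After a single cumulative sum over the sequence, each span costs one subtraction and one division, regardless of span length. The same indexing pattern should carry over to `torch.cumsum` with a batch dimension, though that is untested here.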