kssteven418 / LTP

[KDD'22] Learned Token Pruning for Transformers
https://arxiv.org/abs/2107.00910
Apache License 2.0
93 stars 17 forks

Will the token count become larger when the threshold is fixed (hard training step)? #3

Open DreamsofGg opened 3 years ago

DreamsofGg commented 3 years ago

It seems that the model will tend to keep more tokens once the threshold is fixed (hard training step), because that step cannot take the L1 loss into account. How can this problem be solved?
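To make the concern concrete, here is a minimal sketch in plain Python. It is not code from the LTP repository: the function names, the score values, the threshold, and the temperature are all made up for illustration. It assumes a token is kept when its importance score exceeds the threshold, and that the L1 penalty is applied to a sigmoid-based soft mask.

```python
import math

def hard_mask(scores, threshold):
    """Binary mask: 1 keeps a token, 0 prunes it (threshold is fixed)."""
    return [1 if s > threshold else 0 for s in scores]

def soft_mask(scores, threshold, temperature=0.01):
    """Differentiable mask assumed for the soft step (sigmoid of the margin)."""
    return [1.0 / (1.0 + math.exp(-(s - threshold) / temperature)) for s in scores]

def l1_penalty(mask):
    """L1 norm of the mask: grows with the number of kept tokens."""
    return sum(abs(m) for m in mask)

# Hypothetical importance scores for five tokens in one layer.
# Both lists sum to 1.0; only the distribution changes.
scores_before = [0.02, 0.05, 0.11, 0.30, 0.52]  # at the start of the hard step
scores_after = [0.09, 0.12, 0.15, 0.28, 0.36]   # after further fine-tuning
threshold = 0.10                                # fixed during the hard step

kept_before = sum(hard_mask(scores_before, threshold))
kept_after = sum(hard_mask(scores_after, threshold))
print("kept tokens before:", kept_before)
print("kept tokens after:", kept_after)

# A soft-mask L1 term would penalize this drift, but it is absent
# from the objective once the mask is binary and the threshold is fixed.
print("L1 before:", l1_penalty(soft_mask(scores_before, threshold)))
print("L1 after:", l1_penalty(soft_mask(scores_after, threshold)))
```

In this toy setup, flattening the score distribution pushes one more token above the fixed threshold, and nothing in the task loss alone discourages that. Logging the kept-token count per layer during the hard step would show whether this actually happens in practice.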