Will the token count become larger when the threshold is fixed (hard training step)?

kssteven418 / LTP

[KDD'22] Learned Token Pruning for Transformers

https://arxiv.org/abs/2107.00910

Apache License 2.0


Will the token count become larger when the threshold is fixed (hard training step)? #3

Open DreamsofGg opened 3 years ago

DreamsofGg commented 3 years ago

It seems that the model tends to increase the number of retained tokens when the threshold is fixed (hard training step), because the L1 loss can no longer be taken into account at that stage. How can this problem be solved?
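To make the concern concrete, here is a minimal sketch in plain Python. It is not the repository's code. It assumes the soft stage uses a sigmoid mask over per-token importance scores and that the L1 loss is the sum of the mask values; the function names, the temperature, and the example scores are hypothetical. It compares how the L1 penalty reacts to a small increase in the scores under a soft mask versus a hard (binarized) mask.

```python
import math

def soft_mask(scores, threshold, temperature=0.05):
    """Differentiable mask (assumed form): sigmoid((score - threshold) / temperature)."""
    return [1.0 / (1.0 + math.exp(-(s - threshold) / temperature)) for s in scores]

def hard_mask(scores, threshold):
    """Binary mask used once the threshold is fixed: keep a token iff score > threshold."""
    return [1.0 if s > threshold else 0.0 for s in scores]

def l1_penalty(mask):
    """L1 regularizer on the mask; for a binary mask this is the kept-token count."""
    return sum(abs(m) for m in mask)

# Hypothetical importance scores for four tokens and a fixed threshold.
scores = [0.10, 0.45, 0.55, 0.90]
threshold = 0.5
eps = 1e-3  # small perturbation used as a finite-difference probe

bumped = [s + eps for s in scores]

# Soft stage: the penalty changes smoothly with the scores, so it yields a
# training signal that discourages keeping tokens.
soft_delta = l1_penalty(soft_mask(bumped, threshold)) - l1_penalty(soft_mask(scores, threshold))

# Hard stage: the penalty is piecewise constant in the scores, so a small
# change that crosses no threshold leaves it unchanged (zero gradient).
hard_delta = l1_penalty(hard_mask(bumped, threshold)) - l1_penalty(hard_mask(scores, threshold))

print("soft-stage change in L1 penalty:", soft_delta)
print("hard-stage change in L1 penalty:", hard_delta)
```

Under these assumptions the hard-stage penalty gives no gradient, so only the task loss shapes the scores. That would be consistent with the kept-token count drifting upward, although the sketch by itself does not confirm that this is what happens in LTP.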