Closed by jkkl 3 years ago
The alignment would be useful for improving the local interactions between the adjacent tokens. I simply concatenated them with the token embeddings, but did not see obvious improvements.
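For readers following along, here is a minimal sketch of the concatenation mentioned above. It assumes the alignment features are a per-token array sharing the batch and sequence dimensions of the token embeddings. NumPy arrays stand in for model tensors, and the names `concat_features`, `token_embeddings`, `alignment_features` and all shapes are hypothetical, not taken from the actual code.

```python
import numpy as np


def concat_features(token_embeddings, alignment_features):
    """Append per-token alignment features to token embeddings along the last axis.

    Hypothetical helper: both inputs are assumed to be shaped
    (batch, seq_len, dim) with matching batch and seq_len.
    """
    if token_embeddings.shape[:2] != alignment_features.shape[:2]:
        raise ValueError("batch and sequence dimensions must match")
    return np.concatenate([token_embeddings, alignment_features], axis=-1)


# Illustrative shapes only (assumptions, not values from the thread).
batch, seq_len, hidden, align_dim = 2, 5, 8, 3
token_embeddings = np.zeros((batch, seq_len, hidden), dtype=np.float32)
alignment_features = np.ones((batch, seq_len, align_dim), dtype=np.float32)

combined = concat_features(token_embeddings, alignment_features)
print(combined.shape)
```

Any layer that consumes `combined` has to accept the wider last dimension (`hidden + align_dim`), so a projection back to `hidden` is usually needed before the encoder.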
@cooelf Thanks for your reply. In my scenario, compared to bert-base-chinese, it improved by about 5 points.
first_token_tensor = sequence_output[:, 0]
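To illustrate what this slice selects, here is a small sketch assuming `sequence_output` is shaped (batch, seq_len, hidden) and that position 0 holds the `[CLS]` token, as in BERT-style inputs. A NumPy array stands in for the model tensor, and the shapes are made up for the example.

```python
import numpy as np

# Illustrative shapes only (assumptions, not values from the thread).
batch, seq_len, hidden = 2, 4, 3
sequence_output = np.arange(batch * seq_len * hidden, dtype=np.float32).reshape(
    batch, seq_len, hidden
)

# Take the hidden state at position 0 for every sequence in the batch.
first_token_tensor = sequence_output[:, 0]
print(first_token_tensor.shape)
```

The sequence axis is dropped, leaving one `hidden`-sized vector per example.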