Open denizerden opened 3 years ago

How to use the BERT Turkish sentiment cased model to calculate sentiment scores for sentences longer than the 512-token sequence limit?

Hi @denizerden,

normally, longer sentences will simply be truncated. You could instead use the BERTurk models with a larger vocab (128k). With these models you should be able to use "longer" sentences, because in theory the tokenizer produces fewer subtokens per token (compared to the "normal" BERTurk models, which have a 32k vocab).
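Beyond switching to a larger-vocab model, a common workaround (not mentioned in the thread, sketched here as an assumption) is to split the token sequence into overlapping windows of at most 512 tokens, score each window, and average the results. `score_window` below is a hypothetical stand-in for a real model call:

```python
def chunk_tokens(token_ids, max_len=512, stride=256):
    """Split token_ids into overlapping windows of at most max_len tokens."""
    if len(token_ids) <= max_len:
        return [token_ids]
    windows = []
    start = 0
    while start < len(token_ids):
        windows.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # last window already reaches the end of the sequence
        start += stride  # overlap windows so no context is lost at a boundary
    return windows

def sentiment_score(token_ids, score_window, max_len=512):
    """Average per-window sentiment; score_window maps a window to a float."""
    windows = chunk_tokens(token_ids, max_len)
    return sum(score_window(w) for w in windows) / len(windows)
```

With the Hugging Face `transformers` tokenizers, the windows can also be produced directly via `tokenizer(text, truncation=True, max_length=512, stride=..., return_overflowing_tokens=True)` instead of hand-rolled slicing.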