Closed MagedSaeed closed 1 year ago
Thanks for the great software.
Just a question so that I can tokenize my text accordingly: how are the sentence markers mentioned in the docs added internally? Are they added wherever the text is split on \n?
`lmplz` and `query` treat `\n` in the data as a sentence split. A sentence split implicitly conditions on `<s>` and appends `</s>`.
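To make that concrete, here is a small Python sketch of what the implicit markers amount to for newline-delimited text. It is only an illustration, not KenLM's actual code: the function name `with_sentence_markers` is hypothetical, whitespace tokenization is an assumption, and the input is assumed to contain no blank lines.

```python
def with_sentence_markers(text):
    """Show each line of `text` as one sentence wrapped in <s> ... </s>.

    Illustration only: the tools add these markers internally, so this
    just makes visible what one line per sentence implies.
    """
    sentences = []
    for line in text.split("\n"):
        tokens = line.split()  # assumption: tokens are whitespace-separated
        sentences.append(["<s>"] + tokens + ["</s>"])
    return sentences


corpus = "the cat sat\nit purred"
for sentence in with_sentence_markers(corpus):
    print(" ".join(sentence))
# prints:
# <s> the cat sat </s>
# <s> it purred </s>
```

In practice this suggests preparing the data with one sentence per line and leaving the markers out of the text itself, since they are supplied at each line break.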
Thanks for your reply and clarification @kpu