Hi, on https://platform.openai.com/tokenizer newlines are not treated as separate tokens, but in this library they are. Which one is correct, and are there any flags or configuration settings I'm overlooking?
For instance, this is 5 tokens on the website but 7 tokens using the lib: