Description
This pull request addresses issue #276, where the encode method of the Encoding class in TikToken did not handle empty input text correctly. Previously, when the input text was empty, the method returned no tokens, which was inconsistent with the behavior of other tokenizers. The encode method now returns the token value corresponding to the special token 'ENDOFTEXT' when the input text is empty.
Changes Made
Modified the encode method in the Encoding class to handle empty input text.
When the input text is empty, the method now returns the token value corresponding to the special token 'ENDOFTEXT'.
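The behavior described above can be sketched as follows. This is a minimal illustration, not the code from this pull request: `encode_or_eot`, `toy_encode`, and the token id `256` are hypothetical stand-ins, and the real Encoding.encode method is not reproduced here.

```python
from typing import Callable, List

# Hypothetical id for the end-of-text special token; the real value
# depends on the encoding in use.
EOT_TOKEN = 256


def toy_encode(text: str) -> List[int]:
    """Stand-in tokenizer: one token per UTF-8 byte (ids 0..255)."""
    return list(text.encode("utf-8"))


def encode_or_eot(
    text: str,
    encode: Callable[[str], List[int]] = toy_encode,
    eot_token: int = EOT_TOKEN,
) -> List[int]:
    """Return [eot_token] for empty input, otherwise defer to `encode`."""
    if text == "":
        return [eot_token]
    return encode(text)


print(encode_or_eot(""))    # [256]
print(encode_or_eot("hi"))  # [104, 105]
```

Only the empty string takes the special-case branch; any non-empty input goes through the underlying encoder unchanged.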
Testing
Added tests to verify the correct behavior of the encode method for empty input text.
Ensured that the tests pass successfully.
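The tests themselves are not reproduced in this description. A sketch of what such tests might look like, assuming a hypothetical `StubEncoding` class in place of the real Encoding class so that the example runs on its own:

```python
from typing import List


class StubEncoding:
    """Hypothetical stand-in for the Encoding class (not the real API)."""

    eot_token = 256  # assumed id for the end-of-text special token

    def encode(self, text: str) -> List[int]:
        # Empty input maps to the end-of-text token, as described in this PR.
        if not text:
            return [self.eot_token]
        # Toy tokenization: one token per UTF-8 byte.
        return list(text.encode("utf-8"))


def test_empty_input_returns_eot() -> None:
    enc = StubEncoding()
    assert enc.encode("") == [enc.eot_token]


def test_non_empty_input_is_unchanged() -> None:
    enc = StubEncoding()
    assert enc.encode("a") == [97]
    assert enc.eot_token not in enc.encode("a")


if __name__ == "__main__":
    test_empty_input_returns_eot()
    test_non_empty_input_is_unchanged()
```

The second test guards against the special case leaking into ordinary input, which is the main regression risk for a change of this kind.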
Screenshots
Related Issues
Closes: #