Closed tarekgh closed 4 months ago
@michaelgsharp I would appreciate it if you could review the changes. I have removed a couple of APIs you introduced earlier and provided a workaround for their usage. Thank you!
CC @luisquintanilla @stephentoub @ericstj @LittleLittleCloud
Attention: 104 lines in your changes are missing coverage. Please review.
Comparison is base (64523e8) 68.81% compared to head (7c61933) 68.80%.
This update encompasses the following:
- `Tokenizer.GetEncodedIdsCount` API, essential for supporting crucial scenarios and implemented in all supported tokenizers.
- `EncodeToIds` and `GetEncodedIdsCount` have been customized for other tokenizer models like `Bpe` and `EnglishRoberta`. This adaptation aims to enhance the performance of these APIs specifically when invoked from those respective tokenizers.
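
For reviewers who want the intuition behind a dedicated count API, here is a minimal, language-neutral sketch of why counting ids can be cheaper than encoding and then taking the length. It is not the code in this PR: `ToyTokenizer`, its whitespace splitting, and the method names `encode_to_ids` and `get_encoded_ids_count` are hypothetical stand-ins, and the real `Bpe` and `EnglishRoberta` models work differently.

```python
from typing import Dict, Iterator, List


class ToyTokenizer:
    """Hypothetical whitespace tokenizer, used only to illustrate the idea."""

    def __init__(self, vocab: Dict[str, int], unknown_id: int = 0) -> None:
        self.vocab = vocab
        self.unknown_id = unknown_id

    def _iter_ids(self, text: str) -> Iterator[int]:
        # Lazily yield one id per whitespace-separated word.
        for word in text.split():
            yield self.vocab.get(word, self.unknown_id)

    def encode_to_ids(self, text: str) -> List[int]:
        # Materializes every id into a list for the caller.
        return list(self._iter_ids(text))

    def get_encoded_ids_count(self, text: str) -> int:
        # Counts ids one at a time without building the id list.
        return sum(1 for _ in self._iter_ids(text))


tokenizer = ToyTokenizer({"hello": 1, "world": 2})
ids = tokenizer.encode_to_ids("hello big world")
count = tokenizer.get_encoded_ids_count("hello big world")
```

In this sketch, a caller that only needs to check text against a length limit can call the count method and skip allocating the id list entirely.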