Closed tonybaloney closed 3 months ago
The only way to instantiate the Tiktoken tokenizer without a vocab stream is to use the Async method https://github.com/dotnet/machinelearning/blob/main/src/Microsoft.ML.Tokenizers/Model/Tiktoken.cs#L778C9-L782C95
Task<Tokenizer> CreateByModelNameAsync(string modelName, IReadOnlyDictionary<string, int>? extraSpecialTokens = null, Normalizer? normalizer = null, CancellationToken cancellationToken = default)
Could a synchronous CreateByModelName() overload be added, so that a Tiktoken tokenizer can be instantiated from just a model name without having to call an async method?
I'm using version 0.22.0-preview.24162.2
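For context, the only workaround I can see today is to block on the async call. This is a minimal sketch, assuming CreateByModelNameAsync is a public static method on the Tiktoken class (as the linked source suggests); "gpt-4" is just an example model name:

```csharp
using Microsoft.ML.Tokenizers;

// Sync-over-async workaround: block the calling thread until the
// async factory completes.
// Assumption: CreateByModelNameAsync is public and static on Tiktoken.
// "gpt-4" is an illustrative model name, not taken from the issue.
Tokenizer tokenizer = Tiktoken
    .CreateByModelNameAsync("gpt-4")
    .GetAwaiter()
    .GetResult();
```

Blocking like this can deadlock in environments with a synchronization context (UI threads, classic ASP.NET), which is part of why a real synchronous overload would be preferable.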