dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

Tiktoken should support being created without synchronous I/O and with user supplied data #7008

Closed stephentoub closed 4 months ago

stephentoub commented 4 months ago

If a developer wants to create a Tiktoken by any means other than CreateByModelNameAsync, they're forced to use the Tiktoken constructor, which does synchronous I/O. There should be factory methods equivalent to its constructors that use async I/O instead. There should likely also be a constructor that accepts data already in memory, so the tokenizer doesn't need to load it separately.
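To make the request concrete, here is a minimal sketch of the two shapes being asked for: an async factory that reads vocabulary data without blocking, and a constructor that takes data already in memory. Everything in it is hypothetical: `SketchTiktoken`, `CreateAsync`, and `TryGetRank` are illustrative names, not the Microsoft.ML.Tokenizers API, and the input is assumed to be lines of the form `<base64 token> <rank>`.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in type for illustration; NOT the real Tiktoken class.
public sealed class SketchTiktoken
{
    // Keys are the base64 text of each token's bytes (a simplification for this sketch).
    private readonly IReadOnlyDictionary<string, int> _ranks;

    // In-memory constructor: accepts already-loaded data and performs no I/O.
    public SketchTiktoken(IReadOnlyDictionary<string, int> ranks)
    {
        _ranks = ranks ?? throw new ArgumentNullException(nameof(ranks));
    }

    public int Count => _ranks.Count;

    public bool TryGetRank(byte[] tokenBytes, out int rank) =>
        _ranks.TryGetValue(Convert.ToBase64String(tokenBytes), out rank);

    // Async factory: the only I/O is awaited, so the calling thread is never blocked.
    // Assumes each non-empty line is "<base64 token> <rank>".
    public static async Task<SketchTiktoken> CreateAsync(
        Stream vocabStream, CancellationToken cancellationToken = default)
    {
        if (vocabStream == null) throw new ArgumentNullException(nameof(vocabStream));

        var ranks = new Dictionary<string, int>(StringComparer.Ordinal);

        // leaveOpen: true, because the caller owns the stream.
        using var reader = new StreamReader(
            vocabStream, Encoding.UTF8,
            detectEncodingFromByteOrderMarks: true, bufferSize: 4096, leaveOpen: true);

        string line;
        while ((line = await reader.ReadLineAsync().ConfigureAwait(false)) != null)
        {
            cancellationToken.ThrowIfCancellationRequested();
            if (line.Length == 0) continue;

            int space = line.IndexOf(' ');
            if (space <= 0) throw new FormatException($"Invalid vocabulary line: '{line}'");

            // Decode and re-encode to validate the base64 and normalize the key.
            byte[] tokenBytes = Convert.FromBase64String(line.Substring(0, space));
            int rank = int.Parse(line.Substring(space + 1), CultureInfo.InvariantCulture);
            ranks[Convert.ToBase64String(tokenBytes)] = rank;
        }

        // The factory funnels into the in-memory constructor.
        return new SketchTiktoken(ranks);
    }
}
```

With this shape, a caller that already holds the data can use `new SketchTiktoken(ranks)` directly, while a caller with a file or network stream can write `await SketchTiktoken.CreateAsync(stream)`. Having the factory delegate to the in-memory constructor keeps a single code path for building the tokenizer, whichever way the data arrives.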

stephentoub commented 4 months ago

cc: @tarekgh