Thanks for creating and maintaining this great library!
What would you like to be added:
Today, ModelToEncoder calls ModelToEncoding, which statically initializes a dictionary of 7 encodings. 6 of the 7 are duplicates.
When each encoding is constructed, the constructor eagerly loads a bunch of data from manifest resources. As far as I can tell, this data gets loaded separately for each instance.
I would like to be able to call ModelToEncoder and only have it lazily load the encoding I care about. Furthermore, I'd like to see it share Encoding instances among models which map to the same encoding.
An example implementation might look like this:
public static Encoding? TryFor(string modelName)
{
    switch (modelName)
    {
        case "gpt-4o":
            return O200KCache.Instance;
        case "gpt-4":
        ...
        case "text-embedding-3-large":
            return Cl100KCache.Instance;
        default:
            return null;
    }
}

// Nested holder classes: each static field is initialized only when the holder is first used.
private static class O200KCache
{
    public static readonly O200KBase Instance = new();
}

private static class Cl100KCache
{
    public static readonly Cl100KBase Instance = new();
}
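An alternative that keeps a dictionary lookup instead of a switch could store `Lazy<Encoding>` values, so model names that map to the same encoding share one lazily created instance. This is only a rough sketch: the `Encoding`, `O200KBase`, and `Cl100KBase` classes below are empty stand-ins I made up so the snippet compiles on its own, and the model list is limited to the names from the example above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the library's encoding types. The real
// constructors are the ones that load data from manifest resources.
public abstract class Encoding { }
public sealed class O200KBase : Encoding { }
public sealed class Cl100KBase : Encoding { }

public static class ModelToEncoder
{
    // One Lazy per distinct encoding: the factory runs at most once, on first access to Value.
    private static readonly Lazy<Encoding> O200K = new(() => new O200KBase());
    private static readonly Lazy<Encoding> Cl100K = new(() => new Cl100KBase());

    // Several model names point at the same Lazy, so they share a single instance.
    private static readonly Dictionary<string, Lazy<Encoding>> ByModel = new()
    {
        ["gpt-4o"] = O200K,
        ["gpt-4"] = Cl100K,
        ["text-embedding-3-large"] = Cl100K,
    };

    // Returns null for unknown model names; constructs the encoding only when it is first requested.
    public static Encoding? TryFor(string modelName) =>
        ByModel.TryGetValue(modelName, out var lazy) ? lazy.Value : null;
}
```

Building the dictionary is cheap here because it only holds `Lazy` wrappers. `Lazy<T>` is thread-safe by default, so concurrent first calls should still produce a single instance per encoding.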
Why is this needed:
Reduce memory footprint and startup time, especially as more models are added.
Anything else we need to know?
I'd be happy to file a PR for this if you're interested!