dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

EnglishRoberta.TokenizeToIds only populates the accumulatedIds if found in cache #7004

Closed stephentoub closed 4 months ago

stephentoub commented 4 months ago

`EnglishRoberta.TokenizeToIds` only adds IDs to `accumulatedIds` when the data is found in the cache. If it isn't found in the cache, nothing is added to the list. https://github.com/dotnet/machinelearning/blob/4635a862ddd21b3e7de0404f73a897fecb2011a1/src/Microsoft.ML.Tokenizers/Model/EnglishRoberta.cs#L262-L305
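To illustrate the pattern being reported, here is a minimal, language-neutral sketch in Python. It is not the ML.NET code: the function names, the `cache` dictionary, and the `encode` callback are all hypothetical stand-ins, and the assumption that the miss path computes and caches the result without appending it is mine, not something confirmed by the linked source.

```python
from typing import Callable, Dict, List


def tokenize_to_ids_buggy(
    words: List[str],
    cache: Dict[str, List[int]],
    encode: Callable[[str], List[int]],
) -> List[int]:
    """Hypothetical reproduction: IDs reach the output only on a cache hit."""
    accumulated_ids: List[int] = []
    for word in words:
        if word in cache:
            accumulated_ids.extend(cache[word])
        else:
            # Assumed miss path: the result is cached but never appended.
            cache[word] = encode(word)
    return accumulated_ids


def tokenize_to_ids_fixed(
    words: List[str],
    cache: Dict[str, List[int]],
    encode: Callable[[str], List[int]],
) -> List[int]:
    """Hypothetical fix: append the IDs whether or not the cache had them."""
    accumulated_ids: List[int] = []
    for word in words:
        ids = cache.get(word)
        if ids is None:
            ids = encode(word)
            cache[word] = ids
        accumulated_ids.extend(ids)
    return accumulated_ids


def toy_encode(word: str) -> List[int]:
    """Toy encoder used only for this sketch: one ID per character."""
    return [ord(ch) for ch in word]
```

With an empty cache, the buggy variant returns an empty list for the first call on a given word, while the fixed variant returns the same IDs on the first and every later call. A regression test for this kind of bug would likely compare the output of a cold-cache call against a warm-cache call on the same input.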

stephentoub commented 4 months ago

cc: @tarekgh