dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

Tokenizers tests fail to download tokenizer data #7095

Closed ericstj closed 3 months ago

ericstj commented 3 months ago

Build Information

Build: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=611349
Build error leg or test failing: Microsoft.ML.Tokenizers.Tests.TiktokenTests.TestCreationUsingModel
Pull request: https://github.com/dotnet/machinelearning/pull/7090

Error Message


(build analysis isn't matching up to the instance failure of the theory, so including console log)

{
  "ErrorMessage": "at Microsoft.ML.Tokenizers.Helpers.GetStream(HttpClient client, String url)",
  "ErrorPattern": "",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}
Unhandled exception. System.Net.Http.HttpRequestException: An error occurred while sending the request.
 ---> System.IO.IOException: Unable to read data from the transport connection: Connection reset by peer.
 ---> System.Net.Sockets.SocketException (54): Connection reset by peer
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
   at System.Net.Security.SslStream.EnsureFullTlsFrameAsync[TIOAdapter](TIOAdapter adapter)
   at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory`1 buffer)
   at System.Net.Http.HttpConnection.InitialFillAsync(Boolean async)
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.GetStreamAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)
   at Microsoft.ML.Tokenizers.Helpers.GetStream(HttpClient client, String url) in /Users/runner/work/1/s/src/Microsoft.ML.Tokenizers/Utils/Helpers.netstandard.cs:line 29
   at Microsoft.ML.Tokenizers.Tiktoken.CreateTokenizerForModel(String modelName, IReadOnlyDictionary`2 extraSpecialTokens, Normalizer normalizer) in /Users/runner/work/1/s/src/Microsoft.ML.Tokenizers/Model/Tiktoken.cs:line 820
   at Microsoft.ML.Tokenizers.Tokenizer.CreateTiktokenForModel(String modelName, IReadOnlyDictionary`2 extraSpecialTokens, Normalizer normalizer) in /Users/runner/work/1/s/src/Microsoft.ML.Tokenizers/Tokenizer.cs:line 330
   at Microsoft.ML.Tokenizers.Tests.TiktokenTests.<>c.<TestCreationUsingModel>b__30_1(String name) in /Users/runner/work/1/s/test/Microsoft.ML.Tokenizers.Tests/TitokenTests.cs:line 331
--- End of stack trace from previous location ---
   at Microsoft.DotNet.RemoteExecutor.Program.Main(String[] args) in /_/src/Microsoft.DotNet.RemoteExecutor/src/Program.cs:line 97
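
For readers unfamiliar with known-issue matching, the sketch below shows one way the JSON above could be checked against the console log. It is a rough illustration, not the build-analysis implementation: it assumes `ErrorMessage` is treated as a plain substring and `ErrorPattern` as a regular expression, and the function name `matches_known_issue` is hypothetical.

```python
import json
import re

# The known-issue JSON as it appears in this report.
KNOWN_ISSUE = json.loads("""
{
  "ErrorMessage": "at Microsoft.ML.Tokenizers.Helpers.GetStream(HttpClient client, String url)",
  "ErrorPattern": "",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}
""")

# Two frames copied from the console log above.
CONSOLE_LOG = [
    "   at System.Net.Http.HttpClient.GetStreamAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)",
    "   at Microsoft.ML.Tokenizers.Helpers.GetStream(HttpClient client, String url) in /Users/runner/work/1/s/src/Microsoft.ML.Tokenizers/Utils/Helpers.netstandard.cs:line 29",
]


def matches_known_issue(issue, log_lines):
    """Hypothetical matcher (assumption): substring test for ErrorMessage,
    regex search for ErrorPattern; empty fields are ignored."""
    message = issue.get("ErrorMessage", "")
    pattern = issue.get("ErrorPattern", "")
    for line in log_lines:
        if message and message in line:
            return True
        if pattern and re.search(pattern, line):
            return True
    return False


print(matches_known_issue(KNOWN_ISSUE, CONSOLE_LOG))  # True
```

Under these assumptions, the `GetStream` frame is what ties this failure to the known issue, which is why it was chosen as the error message rather than the generic "Connection reset by peer" text.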

Report

Build: 611349
Definition: dotnet/machinelearning
Test: Microsoft.ML.Tokenizers.Tests.WorkItemExecution
Pull Request: dotnet/machinelearning#7090

Summary

24-Hour Hit Count: 0
7-Day Hit Count: 1
1-Month Count: 1

Known issue validation

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=611349
Error message validated: [at Microsoft.ML.Tokenizers.Helpers.GetStream(HttpClient client, String url)]
Result validation: Known issue matched with the provided build.
Validation performed at: 3/21/2024 9:21:47 PM UTC