dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License
9.05k stars 1.88k forks source link

Add Timeout to Regex used in the tokenizers #7284

Closed tarekgh closed 2 weeks ago

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 68.87%. Comparing base (a9b4212) to head (e294d63). Report is 2 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #7284 +/- ## ========================================== - Coverage 68.87% 68.87% -0.01% ========================================== Files 1467 1467 Lines 273955 273954 -1 Branches 28380 28380 ========================================== - Hits 188697 188693 -4 - Misses 77946 77953 +7 + Partials 7312 7308 -4 ``` | [Flag](https://app.codecov.io/gh/dotnet/machinelearning/pull/7284/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dotnet) | Coverage Δ | | |---|---|---| | [Debug](https://app.codecov.io/gh/dotnet/machinelearning/pull/7284/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dotnet) | `68.87% <100.00%> (-0.01%)` | :arrow_down: | | [production](https://app.codecov.io/gh/dotnet/machinelearning/pull/7284/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dotnet) | `63.33% <100.00%> (-0.01%)` | :arrow_down: | | [test](https://app.codecov.io/gh/dotnet/machinelearning/pull/7284/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dotnet) | `89.18% <ø> (+<0.01%)` | :arrow_up: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dotnet#carryforward-flags-in-the-pull-request-comment) to find out more. | [Files with missing lines](https://app.codecov.io/gh/dotnet/machinelearning/pull/7284?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dotnet) | Coverage Δ | | |---|---|---| | [...soft.ML.Tokenizers/Model/SentencePieceTokenizer.cs](https://app.codecov.io/gh/dotnet/machinelearning/pull/7284?src=pr&el=tree&filepath=src%2FMicrosoft.ML.Tokenizers%2FModel%2FSentencePieceTokenizer.cs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dotnet#diff-c3JjL01pY3Jvc29mdC5NTC5Ub2tlbml6ZXJzL01vZGVsL1NlbnRlbmNlUGllY2VUb2tlbml6ZXIuY3M=) | `76.86% <ø> (ø)` | | | [...Microsoft.ML.Tokenizers/Model/TiktokenTokenizer.cs](https://app.codecov.io/gh/dotnet/machinelearning/pull/7284?src=pr&el=tree&filepath=src%2FMicrosoft.ML.Tokenizers%2FModel%2FTiktokenTokenizer.cs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dotnet#diff-c3JjL01pY3Jvc29mdC5NTC5Ub2tlbml6ZXJzL01vZGVsL1Rpa3Rva2VuVG9rZW5pemVyLmNz) | `78.28% <100.00%> (ø)` | | | [...crosoft.ML.Tokenizers/PreTokenizer/PreTokenizer.cs](https://app.codecov.io/gh/dotnet/machinelearning/pull/7284?src=pr&el=tree&filepath=src%2FMicrosoft.ML.Tokenizers%2FPreTokenizer%2FPreTokenizer.cs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dotnet#diff-c3JjL01pY3Jvc29mdC5NTC5Ub2tlbml6ZXJzL1ByZVRva2VuaXplci9QcmVUb2tlbml6ZXIuY3M=) | `92.68% <100.00%> (ø)` | | | [...ft.ML.Tokenizers/PreTokenizer/RegexPreTokenizer.cs](https://app.codecov.io/gh/dotnet/machinelearning/pull/7284?src=pr&el=tree&filepath=src%2FMicrosoft.ML.Tokenizers%2FPreTokenizer%2FRegexPreTokenizer.cs&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dotnet#diff-c3JjL01pY3Jvc29mdC5NTC5Ub2tlbml6ZXJzL1ByZVRva2VuaXplci9SZWdleFByZVRva2VuaXplci5jcw==) | `87.23% <ø> (ø)` | | ... and [6 files with indirect coverage changes](https://app.codecov.io/gh/dotnet/machinelearning/pull/7284/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dotnet)
tarekgh commented 2 weeks ago

Thanks @stephentoub for your feedback and review!