Open cschuchardt88 opened 1 year ago
i'd love to have it, but it is hard to measure the computational effort required for executing regex operations. It could be very complex and thus be used to DOS the network. Some predefined patterns maybe.
Dotnet only compiles Regular expression once. That's why i say have some predefined ones. Also they can be cached as well.
Read more about it here https://learn.microsoft.com/en-us/dotnet/standard/base-types/compilation-and-reuse-in-regular-expressions#compiled-regular-expressions
I mean, we need to meaure how long it takes to match
, predefined url
number
things do not sounds like a core-job. What i am thinking is somthing like startwith
endwith
contains
, onlycontains
, notcontains
substring
indexof
and format
would be nice too. But i see what your saying. Guess it would depend on the length of the string for regex? But couldn't we just burn gas for regex? For example regex would burn certain amount of gas every millisecond for X milliseconds for given length of string and would timeout after x milliseconds if doesn't finish. Also we could make it not run in dynamic scripts. If it would exceed the amount of time for regex to complete you would get an RegexTimeoutException.
millisecond
very depends on the machines, we can not to it that way. The reason i am thinking break the regex down to couple of commands is that each of the command is easier to be measured in gas consumption.
i'd love to have it, but it is hard to measure the computational effort required for executing regex operations. It could be very complex and thus be used to DOS the network. Some predefined patterns maybe.
Totally agree, difficult to pay for it
@shargon can we atleast have what @Liaojinghui said for strings? startswith
endswith
contains
compare
format
indexof
and substring
@shargon @Jim8y
Got some good news about this. When you compile regex
in dotnet
it generates liasm
code (which is compiled dotnet). So it can be measured easily and not DDOS
able. For example if you do a search for abc
regex
expression, it will use/generate String.IndexOf
for the expression. Also you can add timeouts
. Find out more about it here https://youtu.be/ptKjWPC7pqw?t=1609
DDOS ATTACK
using System.Diagnostics;
using System.Text.RegularExpressions;
Stopwatch sw = new();
for (int i = 1; i <= 40; i++)
{
var r = new Regex($@"^(\w\d|\d\w){{{i}}}$", RegexOptions.None, TimeSpan.FromSeconds(2));
string input = string.Concat(Enumerable.Repeat("11", i)) + "1";
sw.Restart();
r.IsMatch(input);
sw.Stop();
Console.WriteLine("{0}: {1}", i, sw.Elapsed);
}
OUTPUT
1: 00:00:00.0084212
2: 00:00:00.0000213
3: 00:00:00.0000101
4: 00:00:00.0000060
5: 00:00:00.0000338
6: 00:00:00.0000146
7: 00:00:00.0000217
8: 00:00:00.0000458
9: 00:00:00.0000749
10: 00:00:00.0002227
11: 00:00:00.0006073
12: 00:00:00.0006753
13: 00:00:00.0012552
14: 00:00:00.0023305
15: 00:00:00.0049467
16: 00:00:00.0094942
17: 00:00:00.0185833
18: 00:00:00.0599559
19: 00:00:00.0778930
20: 00:00:00.2526117
21: 00:00:00.3932629
22: 00:00:00.6679340
23: 00:00:01.2998858
Unhandled exception. System.Text.RegularExpressions.RegexMatchTimeoutException: The Regex engine has timed out while trying to match a pattern to an input string. This can occur for many reasons, including very large inputs or excessive backtracking caused by nested quantifiers, back-references and other factors.
at System.Text.RegularExpressions.RegexRunner.<CheckTimeout>g__ThrowRegexTimeout|25_0()
at System.Text.RegularExpressions.RegexInterpreter.Backtrack()
at System.Text.RegularExpressions.RegexInterpreter.TryMatchAtCurrentPosition(ReadOnlySpan`1 inputSpan)
at System.Text.RegularExpressions.RegexInterpreter.Scan(ReadOnlySpan`1 text)
at System.Text.RegularExpressions.Regex.ScanInternal(RegexRunnerMode mode, Boolean reuseMatchObject, String input, Int32 beginning, RegexRunner runner, ReadOnlySpan`1 span, Boolean returnNullIfReuseMatchObject)
at System.Text.RegularExpressions.Regex.RunSingleMatch(RegexRunnerMode mode, Int32 prevlen, String input, Int32 beginning, Int32 length, Int32 startat)
at System.Text.RegularExpressions.Regex.IsMatch(String input)
at Program.<Main>$(String[] args) in C:\Users\Chris\source\repos\ConsoleApp1\ConsoleApp1\Program.cs:line 13
@roman-khimov does golang
do something similar? If it can't maybe we shouldn't add it?
I have already supported string methods like: startwith endwith, contains, onlycontains, notcontains
but regular expression is still can not be supported in general.
These time limits look nice in general, but 2s on one machine is 10s on another and we need some strictly reproducible results. It's also hard to establish any fair GAS cost for this call, the difference in execution time can de huge, but we can't rely on time for the cost, we can't rely on length either.
Summary or problem description We need a way to validate strings in a smart contract. Regular expression is the best way to do this and would be the cheapest.
Do you have any solution you want to propose? Add
RegEx
method toStdLib
library for cheap and easy string validation. also have predefined and compiled ones like URL, Numbers, Letters and numbers., to name a few.Neo Version
Where in the software does this update applies to?