Closed llCorvinSll closed 9 years ago
In some cases (processing large amount of networks) using static Regex.Match gives large memory overhead and OutOfMemoryExceptions.
Oh, really? It is hard to believe at first. What differences are there between static version Match method and instance version one?
Could you let me show any sample programs that cause OutOfMemoryExceptions, or, authority information resource?
ok. I'm play with VS profiling tools a bit. And get some results
oldApproach
newApproach
test run creating new IpAddresRange in cycle (16383 string rerpesents uniq networks) 4 times
you can see what static Match method allocates more memory when regexp object and so on, i found one more problem: Byte operasions with Linq cause huge amount of copying and slow.
I can be wrong with interpreting profiling result, but modyfied object avoid OOM errors in my complex algorithm
Thank you for your reply and detail report.
This reports are very interesting for me.
Ok, I'm going to merge your pull request, but I should work another task in this week.
Please give me one week.
@llCorvinSll I'm sorry, what do you think about pull request#13?
https://github.com/jsakamoto/ipaddressrange/pull/13/files#diff-84e5184cbe9467997f4166539b02496eR114
That implementation use one shot Regex.Match call.
I'm lost merge this pull request#13. It is my misstake.
request#13 use very complex regex - this can slow down parsing process. And dont compile regex at first. Also, check #20 pull, which I'm avoid creating some of iterators in bit methods (some less memory and speed bump)
(I started benchmarking...: https://github.com/jsakamoto/ipaddressrange-benchmark )
you can share you results?
@llCorvinSll
you can share you results?
Yes. But .vsp files are to large (above 100MB), so I could not push to GitHub it.
I'm planing how to publish .vsp files. I'll upload it to OneDrive.
Please let me few hours or days.
Ok, you can make screenshots, instead of sharing memory dumps
memory overhead are simple: using static method returns new Regex object, inside constructor system try to find cached instance, but new object already created in memory
I did benchmarks.
Screen shots are here: http://1drv.ms/1PsWt7z
Summary of benchmarks score is: http://1drv.ms/1PsW9G0
These benchmarks may tell us "static Regex members approach without LINQ bit operations" is lowest memory usage, about 19MB less than "single Regex pattern approach with LINQ bit operations".
Also, running speed best score is "static Regex members approach with LINQ bit operations", about 80msec less than "single Regex pattern approach with LINQ bit operations".
We should remember these benchmarks are stocking 20thousand IPAddressRange objects, and difference of memory usage is only few MB.
I feel it is over spec...
I'm still not sure about current IPAddressRange implementation causes OutOfMemoryException.
But anyway, I'll planing about merging of this pull request.
Buy the way, I feel "NO LINQ bit operations" implementation (pull request #20) are not beautiful, and it save only about 2MB at treating ten thousand order IPAddressRange instances, so I would not merge pull request #20.
What should I do?
@llCorvinSll
using static method returns new Regex object, inside constructor system try to find cached instance, but new object already created in memory
Thank you for the information.
I understand: Regex.Match()
static method implementation cache only RegexCode
internal objects, but does not cache Regex
objects.
@jsakamoto 2Mb is 12% difference! And and i dont big fan of creating iterators without reason. (bit operations) And, for my taste, linq operations on BITs hard to understand, im prefer to use arrays in this case (: If you dont want to use old way style - reject #20
Hovewer, regex gives huge improvement, and it's ok to be merged.
@llCorvinSll
I merged this pull request #19. Thank you for your contribution :+1:
P.S. see also https://github.com/jsakamoto/ipaddressrange/pull/20#issuecomment-148708579
In some cases (processing large amount of networks) using static Regex.Match gives large memory overhead and OutOfMemoryExceptions.
Putting patterns to static precompilet Regexp solve problem and reduce memory usage