Open ChrisPenner opened 4 years ago
This might be a good thing for someone to look into for #hacktoberfest
I'd be surprised if anyone seriously participates in that, seems like it's more of a spam-fest..
That's an interesting article, and it's probably true for much of the github ecosystem, but I have a feeling Haskell repos are likely shielded at least a bit from poor quality submissions (Haskell doesn't even show up in the list on the main site).
A quick glance over closed hacktoberfest issues in Haskell shows a surprising amount of legitimate work happening 😄
Granted, this isn't so much of a "quick win"
Referenced in https://github.com/ChrisPenner/lens-regex-pcre/issues/10 with benchmarks here: https://github.com/jamesdbrock/replace-benchmark#results
We can see that pcre-heavy is having a bit of trouble with substitutions with LOTS of matches, I did a quick profile on it and the VAST majority of the time is spent inside
rawMatch
inscanRanges0
. On large strings; it clocks in at OVER 10 minutes, whereas the next longest time is just over one second.After a cursory look, it seems it is serially calling out to C for each consecutive match, and just iterating on that, there's also a lot of fiddly math going on, but nothing that seems to be the "obvious" culprit. Perhaps there's some unknown copying going on too, but I didn't dig deep enough to see.
I can dig around a bit more, but if anyone has suggestions or tips on a good way to speed this up, or even ideas of where the cost centers might be I'm open to ideas!
I understand that the benchmark in question represents a sort of "perfect storm" in terms of triggering bad performance, but it would be nice if there's some way we can get this into at least the same ballpark :)