Closed andymeneely closed 10 years ago
YES!!! This for some reason was a huge bottleneck overnight. Just to give you a sense of how fast this speeds it up - Developer.sanitize_validate_email
was originally 90 seconds of the 95 seconds spent on CodeReviewParser
on the test data. I'm super excited to see how fast it runs on production tonight.
I do have one simplification for you though - we don't need the added helper class. Let's just do the regex and use local variables. That way we're not allocating millions of EmailAddress objects overnight.
I'll reopen it so you remember to fix it.
Thank you!! If we did need it, you should know that you did it really well - right place, right conventions, good OO
In my performance profiling, I found an interesting result: this line was one of the slowest lines in the entire parser. I'm not sure what the class is doing that's making it so slow, but I'm pretty sure we can handle the data set we currently have with simple regexes.
So let's refactor this method so that it doesn't use Mail::Address at all. Remove the require and just handle it with well-documented regexes. Add any verifies you see fit, although it's pretty well-covered currently.
Oh, and one more thing - add the verify for
commit-bot%chromium.org@gtempaccount.com
- that's the culprit for our null dereference that we've just been ignoring. It's a blacklisted email, and it should be handled as such.