Since switching to the Spamhaus DQS plugin, I've noticed that it takes SpamAssassin far longer to scan some messages containing many URLs in the body. For example, the message at https://gist.github.com/robertmathews/47223b49aab854ad5a7d046f139c77c8 takes several minutes to scan:
time spamassassin -t < spamassassin-pathological
...
real 6m26.634s
user 0m4.635s
sys 0m0.198s
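I traced this problem to direct, synchronous "gethostbyname" calls in the SH.pm code. Each call can block for up to 30 seconds on a domain name with nonworking nameservers, and because the calls run one after another, the delays add up. That matches the timing above: almost all of the 6m26s is wall-clock waiting, with under 5 seconds of CPU time.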
To fix this, I replaced the gethostbyname calls with asynchronous lookups and callbacks, the same way the "URIDNSBL.pm" plugin included with SpamAssassin does. The same message now scans in about 4 seconds, as I'd expect, and the scoring results are identical:
time spamassassin -t < spamassassin-pathological
...
real 0m3.996s
user 0m2.732s
sys 0m0.116s
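For anyone wondering why the change makes such a difference, here is a minimal sketch of the effect. It is not the plugin code (that is Perl) and it does no real DNS: the domain names and delays are made up, and the sleeps only stand in for slow or dead nameservers. It just shows that serial blocking lookups cost the sum of their delays, while lookups issued concurrently cost roughly the slowest one.

```python
import asyncio
import time

# Hypothetical per-lookup delays in seconds, standing in for DNS timeouts.
# These names and values are illustrative only, not taken from the message.
DELAYS = {"a.example": 0.2, "b.example": 0.2, "c.example": 0.2}


def blocking_lookup(name):
    """Simulate a synchronous lookup: the caller waits for the full delay."""
    time.sleep(DELAYS[name])
    return name


async def async_lookup(name):
    """Simulate an asynchronous lookup: other lookups proceed while waiting."""
    await asyncio.sleep(DELAYS[name])
    return name


def scan_serial(names):
    """Look up each name in turn; total time is the sum of the delays."""
    start = time.monotonic()
    results = [blocking_lookup(n) for n in names]
    return results, time.monotonic() - start


async def _gather(names):
    # gather() returns results in the same order as its arguments.
    return await asyncio.gather(*(async_lookup(n) for n in names))


def scan_concurrent(names):
    """Start all lookups at once; total time is about the slowest delay."""
    start = time.monotonic()
    results = asyncio.run(_gather(names))
    return list(results), time.monotonic() - start


if __name__ == "__main__":
    names = list(DELAYS)
    _, serial_time = scan_serial(names)
    _, concurrent_time = scan_concurrent(names)
    print(f"serial:     {serial_time:.2f}s")
    print(f"concurrent: {concurrent_time:.2f}s")
```

Both versions return the same results in the same order, which is the analogue of the scoring being unchanged. Only the waiting overlaps.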