Open nmenardg-keeper opened 1 year ago
This sounds like it would be a suggestion better made for the dnspython library rather than here but I would be curious to hear what you find out.
Yeah agreed.
Here's the issue in dnspython: https://github.com/rthalley/dnspython/issues/913
I'm testing the signal
solution today. Some caveats: it only works on single-thread programs, and could interfere with other code also using signal.SIGALRM
I closed the issue in dnspython. The problem seems to be that in deliverability.validate_email_deliverability
, Resolver.resolve
is called potentially 4 times with the same timeout (due to except dns.resolver.NoAnswer
).
The solutions I see on your side could be:
resolve
call gets 1/4 of the total timeout)resolve
call, and allocate the remaining time (if any) to the following call
a. e.g. first call takes 5s, next call gets a timeout of 15-5 = 10sFor my solution using signal
to work, I'll have to raise a BaseException
because dnspython is catching Exception
Or a hacky fix would be to give email_validator a timeout of 1/4 of what I actually want
Ah sorry for suggesting you head toward dnspython.
Since the except-block for dns.exception.Timeout is around all of the DNS queries, the first query to time out should end the deliverability checks. So to get a total time that's four times the timeout, each query would have to complete in just under the timeout without timing out.
Your option 2 would be OK for me, but it might be a little tricky to implement because the resolver might be provided by the caller and reducing the timeout in future calls would mangle the original value.
It's very hard to reproduce, we found the problem because we have a 60s timeout for our web server that we kept busting, even with email-validator's default 15s timeout.
Disabling
check_availability
solved our problem.The problem
The problem lies within:
( https://github.com/JoshData/python-email-validator/blob/main/email_validator/deliverability.py#L40 )
While debugging, what I can see is that the resolver goes through a loop, and only checks the timeout between calls. It also does some
time.sleep()
That means that if a call or a sleep is longer than the timeout, it doesn't get interrupted and can thus run for longerThe solution
Using
signal
, we could interrupt the process. See https://stackoverflow.com/a/494273This package does it: https://github.com/pnpnpn/timeout-decorator
I'll try implementing the stackoverflow suggestion on my end and write back with news