Open abgoldberg opened 4 months ago
I think the problem is more related to the DNS resolver on the PTR lookup than to the get_base_domain call. get_base_domain is done locally against a PSL table. The slow DNS PTR lookup is fixed in the new version, which introduces a DNS cache.
I have found that performance can become quite slow when processing a large number of reports with DNS queries and reverse DNS base domain computation.
It turns out this is due to this line:
https://github.com/domainaware/parsedmarc/blob/7d2b431e5f20bdcdb330c4fbb23ce7df5fb0642f/parsedmarc/utils.py#L95C5-L95C46
Instantiating the `psl` object in every call of the function leads to parsing the whole PSL and is quite slow. It would be better to pull this out of the function and reuse the same instance for every `get_base_domain` call.
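
To illustrate the suggestion, here is a minimal sketch of building the list once and reusing it. It does not use parsedmarc's actual code: `FakePublicSuffixList` and `_shared_psl` are hypothetical stand-ins, and the three-entry suffix set is only there to keep the example self-contained.

```python
from functools import lru_cache


class FakePublicSuffixList:
    """Hypothetical stand-in for a PSL parser; construction is the costly step."""

    instances_built = 0  # counts how many times the list was "parsed"

    def __init__(self):
        FakePublicSuffixList.instances_built += 1
        # A real PSL has thousands of rules; three are enough for this sketch.
        self.suffixes = {"com", "org", "co.uk"}

    def get_base_domain(self, domain):
        labels = domain.lower().strip(".").split(".")
        # Try the longest candidate suffix first, then keep one label before it.
        for i in range(len(labels)):
            if ".".join(labels[i:]) in self.suffixes:
                return ".".join(labels[max(i - 1, 0):])
        return ".".join(labels)


@lru_cache(maxsize=1)
def _shared_psl():
    # Built lazily on first use, then the same instance is returned every time.
    return FakePublicSuffixList()


def get_base_domain(domain):
    # No per-call construction: the parse cost is paid once per process.
    return _shared_psl().get_base_domain(domain)
```

With this shape, calling `get_base_domain` thousands of times constructs the list only once. A plain module-level instance would work as well; the cached factory just defers the parse until the first lookup.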