Open rodpayne opened 3 months ago
I also noticed that. I haven't had time to check whether it's my DNS that is failing. Your workaround #1 could be a quick fix for now.
@rodpayne:
The issue is definitely there. I am currently using `parsedmarc` in multiple production workloads, both in Lambda and in other containerised environments, and the fields are empty.
I haven't started looking into the issue in more depth yet; I plan to do so this coming Monday. Given that `dig -x` and `nslookup` work fine on the IP addresses, and so does the reverse DNS lookup in `parsedmarc`'s `get_ip_address_info`, the list of potential culprits is rather narrow.
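For anyone who wants to run the same sanity check from Python rather than with `dig -x`, here is a minimal sketch using only the standard library. It goes through the system resolver, not `parsedmarc`'s own lookup code, and the `reverse_dns` helper and its `resolver` parameter are names made up for this example.

```python
import socket


def reverse_dns(ip_address, resolver=socket.gethostbyaddr):
    """Return the PTR hostname for ip_address, or None if the lookup fails.

    `resolver` is a hypothetical injection point so the function can be
    exercised without network access; by default the system resolver is used.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip_address)
    except OSError:  # socket.herror and socket.gaierror are both OSError subclasses
        return None
    # Normalise: drop a trailing dot and lower-case the name.
    return hostname.rstrip(".").lower() or None
```

Calling `reverse_dns("8.8.8.8")` on a host with working DNS should return a hostname; `None` means the PTR lookup failed at the resolver level.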
As for an approach, I think there is no need for updating existing findings in ElasticSearch as:
@rodpayne if you are OK with that, I am happy to work on fixing the core issue, and you can look at writing a separate, standalone snippet that retrofits existing data in an ElasticSearch/OpenSearch data store.
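As a starting point for such a retrofit snippet, here is a rough sketch of the two request bodies it would probably need: one for `_search` to find the most frequent source IPs with no reverse DNS, and one for `_update_by_query` to fill the fields in. The field names (`source_ip_address`, `source_reverse_dns`, `source_base_domain`) are assumptions about the index mapping and should be checked against your own indices; the function names are made up for this example.

```python
# Assumed field names; verify them against your index mapping before use.
IP_FIELD = "source_ip_address"
RDNS_FIELD = "source_reverse_dns"
DOMAIN_FIELD = "source_base_domain"


def missing_rdns_query(size=1000):
    """Body for _search: top `size` source IPs in documents lacking reverse DNS."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        # Note: `exists` also matches fields holding an empty string, so
        # documents with "" instead of a missing field are not selected here.
        "query": {"bool": {"must_not": [{"exists": {"field": RDNS_FIELD}}]}},
        "aggs": {"top_ips": {"terms": {"field": IP_FIELD, "size": size}}},
    }


def update_by_query_body(ip_address, reverse_dns, base_domain):
    """Body for _update_by_query: set both fields on documents for one IP."""
    return {
        "query": {
            "bool": {
                "filter": [{"term": {IP_FIELD: ip_address}}],
                "must_not": [{"exists": {"field": RDNS_FIELD}}],
            }
        },
        "script": {
            "lang": "painless",
            "source": (
                "ctx._source[params.rdns_field] = params.rdns; "
                "ctx._source[params.domain_field] = params.domain"
            ),
            "params": {
                "rdns_field": RDNS_FIELD,
                "domain_field": DOMAIN_FIELD,
                "rdns": reverse_dns,
                "domain": base_domain,
            },
        },
    }
```

The bodies are plain dictionaries, so they can be sent with whichever ElasticSearch or OpenSearch client is already in use. Trying this on a copy of one index first would be prudent.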
I have done some tinkering, and it seems that with the latest version, `8.10.3`, the reverse DNS lookup data population is working fine. Previously, with version `8.9.4`, the fields were missing.
That being said, `8.10.3` introduces a potential security issue as per #500.
Yes, `8.10.3` does seem to have fixed the reverse lookup problem. (I thought that I had fixed it by changing my Docker networking last night, but I had also built with the current code.)
Item 2 above is in pull request #501.
FYI, I am just starting to use `parsedmarc` and have processed over 9 million messages covering the past 30 days, so that is why I was reluctant to start over.
I think the new reverse DNS lookup in 8.10.3 has created new issues. I am seeing many more invalid reports than before, from both Google and Microsoft, which seems wrong. Digging into it, it looks like the reverse lookup code, when it fails to resolve, fails the whole report as invalid. Previously, the base_domain field was simply left empty when the reverse lookup failed, so the report was still parsed rather than failing entirely. I'm going to do some testing in the next couple of days. Has anyone else seen more invalid reports with 8.10.3 and up?
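To illustrate the earlier behaviour described above (a failed lookup leaves the fields empty instead of invalidating the report), here is a small sketch. It is not `parsedmarc`'s actual code: `enrich_records`, the `lookup` callable, and the record keys are hypothetical names used only for this example.

```python
def enrich_records(records, lookup):
    """Attach reverse DNS info to each record without ever rejecting a record.

    `lookup` is any callable taking an IP address and returning a dict with
    optional "reverse_dns" and "base_domain" keys (a stand-in for the real
    lookup function). If it raises, the fields are left as None.
    """
    enriched = []
    for record in records:
        try:
            info = lookup(record["source_ip_address"])
        except Exception:
            # A failed lookup should degrade the data, not fail the report.
            info = {}
        enriched.append(
            {
                **record,
                "reverse_dns": info.get("reverse_dns"),
                "base_domain": info.get("base_domain"),
            }
        )
    return enriched
```

With this shape, one unresolvable address costs only two empty fields on that record, and the rest of the report is still stored.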
I am finding that my DNS reverse lookups are not doing very well. Looking at Grafana, the top items listed in Top 1000 Message Source IP Addresses do not have `Base Domain` or `Reverse DNS`. If I use nslookup to check the IP addresses, they almost always resolve without a problem.
I have tried various `nameservers` and increased `dns_timeout`, without seeing much improvement.
I am wondering if other users of the (really fantastic) `parsedmarc` are having a similar problem.
Has anyone looked at making a script or program to query `elasticsearch` for the top addresses that do not have the info, try looking them up, and then do an update query to replace the records with the updated info? (That is what I may do if I can't figure out what is going wrong with the original lookups.)
I am also looking at only updating `IP_ADDRESS_CACHE` when `get_ip_address_info` succeeds in getting the info. That may keep it from compounding the lookup problem.
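The cache idea could look roughly like the sketch below. It uses a plain dict as a stand-in for `IP_ADDRESS_CACHE` and takes the lookup as a parameter; `cached_ip_info` and the shape of the returned dict are assumptions for illustration, not the library's real internals.

```python
def cached_ip_info(ip_address, lookup, cache):
    """Return info for ip_address, caching it only when reverse DNS was found.

    `lookup` stands in for the real lookup function and is assumed to return
    a dict with a "reverse_dns" key; `cache` is any dict-like object.
    """
    if ip_address in cache:
        return cache[ip_address]
    info = lookup(ip_address)
    # Skip caching when reverse DNS is missing, so a transient DNS failure
    # is retried on the next message instead of being remembered.
    if info.get("reverse_dns"):
        cache[ip_address] = info
    return info
```

The trade-off is that addresses that really have no PTR record are looked up again every time, so a short-lived negative cache might be worth considering if that becomes too slow.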