cetanu / sender_policy_flattener

Compact large SPF chains into flat blocks of IP addresses
MIT License
35 stars 17 forks source link

fix handling of TXT records #19

Closed yahesh closed 6 months ago

yahesh commented 6 months ago

Currently the crawler joins overlong records with a space.

This breaks SPF record entries that have been split within an entry (e.g. 0.spf0.hubspotemail.net):

"v=spf1 ip4:3.93.157.0/24 ip4:3.210.190.0/24 ip4:18.208.124.128/25 ip4:54.174.52.0/24 ip4:54.174.57.0/24 ip4:54.174.59.0/24 ip4:54.174.60.0/23 ip4:54.174.63.0/24 ip4:139.180.17.0/24 ip4:141.193.184.32/27 ip4:141.193.184.64/26 ip4:141.193.184.128/25 ip4:1" "41.193.185.32/27 ip4:141.193.185.64/26 ip4:141.193.185.128/25 ip4:143.244.80.0/20 ip4:158.247.16.0/20 -all"

This leads to broken SPF entries in the flattened result.

tresni commented 6 months ago

With dnspython 2.4.2 and python 3.10.13, this fix makes no difference as far as I can tell. I'm still getting

"v=spf1 ip4:23.253.182.103 ip4:23.253.183.145 ip4:23.253.183.146 ip4:23.253.183.147 ip4:23.253.183.148 ip4:23.253.183.150 ip4:166.78.68.221 ip4:167.89.46.159 ip4:167.89.64.9 ip4:167.89.65.0 ip4:167.89.65.53 ip4:167.89.65.100 ip4:167.89.74" ".233 ip4:167.89.75.33 ip4:167.89.75.126 ip4:167.89.75.136 ip4:167.89.75.164 ip4:192.237.159.42 ip4:192.237.159.43 ip4:159.112.242.162 ip4:159.135.228.10 -all"

The problem is that str(a) already does the concatenation of the underlying records and the string you are getting contains the quotes and all. You would need to do something to decode and join the strings on the RRset itself. The following works for TXT records but will break on anything else:

["".join([s.decode() for s in a.strings]) for a in answers]
yahesh commented 6 months ago

@tresni Thanks for pointing this out. I hoped this to be a quick win. I reworked the response handling and tokenization instead now. By handling each DNS RR individually during tokenization and giving the tokenizer more context, it's possible to solve this without affecting the handling of other RR types.

yahesh commented 6 months ago

Just as a short info: We have this fix running in production now as a hotpatch via Staffbase/r53spflat and Staffbase/sender_policy_flattener without issues so far.

tresni commented 6 months ago

You might take a look at https://github.com/tresni/sender_policy_flattener . I did a few other things before I gave up and went with a service. There are several more issues with this code. One more example: lack of support for IPv6/AAAA records.

yahesh commented 6 months ago

Closing this PR as it is superseeded by #20.