ietf-wg-dmarc / draft-ietf-dmarc-aggregate-reporting

3 stars 4 forks source link

Update dmarc-xml-0.2.xsd #19

Closed vanderlee closed 1 month ago

vanderlee commented 1 month ago

IPv4 regular expression does not match IPv4 addresses due to the dot (.) being matched as part of the '25[0-5]' branch instead of the 0-255 number pattern.

alevesely commented 1 month ago

The file I have on my fork has these lines:

<!-- IPv4 -->
<xs:pattern value=
"((((0{1,4}:){5})|(::(0{1,4}:){0,4}))([Ff]{4}:))?(([01]?\d?\d|2[0-4]\d|25[0-5])\.){3}([01]?\d?\d|2[0-4]\d|25[0-5])"/> 

It allows IPv4 mapped IPv6 addresses, as well as having the extra parenthesis.

vanderlee commented 1 month ago

I'm fine either way.

Your pattern does not parse all valid dual IPv6/IPv4 formats though (which, admittedly, mine doesn't even attempt). For instance the 64:ff9b::192.0.2.128 example listed on Wikipedia or the examples listed on https://www.ibm.com/docs/en/ts4500-tape-library?topic=functionality-ipv4-ipv6-address-formats.

alevesely commented 1 month ago

You're right, albeit I'm not clear whether such addresses can be used globally.

An alternative stance would be to remove all alternatives and replace them with the single, simple pattern:

<xs:pattern value="[0-9a-fA-F.:]{2,45}"/>
vanderlee commented 1 month ago

What level of strictness is required here? I'm noticing the normal IPv6 patterns are commented as "lax" but obviously more strict than [0-9a-fA-F.:]{2,45}.

jrlevine commented 1 month ago

I think the lax pattern is fine. The exact RE for IPv6 addresses is way more complex than it's worth here.

vanderlee commented 1 month ago

Something like (::)?(([A-Fa-f\d]{1,4}::?){1,6})?(([01]?\d?\d|2[0-4]\d|25[0-5])\.){3}([01]?\d?\d|2[0-4]\d|25[0-5]) (based largely on @alevesely's version) is quite lax but covers any dual IPv6 and IPv4 I've thrown at it. Can probably be a bit simplified. It will accept some silliness like ::1::1::1.1.1.1 as well, though.

alevesely commented 1 month ago

If we wanted to be strict, we should also avoid addresses like 10.0.0.0/8, 2001:db8::/32 and a bunch of other reserved and special purpose addresses that would make no sense in an aggregate report. Conveying this by means of regular expression, albeit being an interesting exercise, adds very little value to either possible check tools based on the formal syntax or the readability of the syntax itself.

vanderlee commented 1 month ago

If we don't want to be strict but just have some minimal sanity check, then @alevesely's single RE should do fine.

abrotman commented 1 month ago

Hey folks, a bunch of back and forth. Is the PR going to merge the regex everyone wants? If so, I'll merge it, and get a new rev published.

alevesely commented 1 month ago

Should we ask on list whether to turn to the much simpler one-liner shown above?

d-javu commented 1 month ago

I avoided thinking about the IPv6 regex in my batch of changes, as the amount of syntax was just overwhelming. The IPv4 mapped IPv6 syntax should go in the IPv6 part of the grammar, if the lax pattern is not adopted to replace it all.

abrotman commented 1 month ago

Okay, merged