adelosa / cardutil

Payment cards tools including ISO8583 parser and Mastercard IPM files processing
MIT License

DE43 does not parse international postcodes #10

Closed margaretli27 closed 1 year ago

margaretli27 commented 1 year ago

The DE43 regex assumes that the DE43_POSTCODE field will always be 4-10 contiguous non-whitespace characters. However, not all postal codes, particularly those for non-US locations, adhere to that. For instance, British postal codes can contain a space partway through, which causes the regex match to fail, and Irish postal codes can be just three characters long. I propose updating the default regex to:

r"(?P<DE43_NAME>.+?) *\\(?P<DE43_ADDRESS>.+?) *\\(?P<DE43_SUBURB>.+?) *\\(?P<DE43_POSTCODE>.+?) *(?P<DE43_STATE>.{3})(?P<DE43_COUNTRY>.{3})$"

For reference, the current regex is:

r"(?P<DE43_NAME>.+?) *\\(?P<DE43_ADDRESS>.+?) *\\(?P<DE43_SUBURB>.+?) *\\(?P<DE43_POSTCODE>\S{4,10}) *(?P<DE43_STATE>.{3})(?P<DE43_COUNTRY>.{3})"
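
To illustrate the difference, here is a minimal sketch that runs both patterns against two made-up DE43 values. The `make_de43` and `parse` helpers and the sample merchants are hypothetical, and the padding (postcode space-padded to 10 characters, state to 3) is an assumption about the field layout, not something taken from cardutil itself.

```python
import re

# Current and proposed DE43 patterns, copied from this issue.
CURRENT = re.compile(
    r"(?P<DE43_NAME>.+?) *\\(?P<DE43_ADDRESS>.+?) *\\(?P<DE43_SUBURB>.+?) *\\"
    r"(?P<DE43_POSTCODE>\S{4,10}) *(?P<DE43_STATE>.{3})(?P<DE43_COUNTRY>.{3})"
)
PROPOSED = re.compile(
    r"(?P<DE43_NAME>.+?) *\\(?P<DE43_ADDRESS>.+?) *\\(?P<DE43_SUBURB>.+?) *\\"
    r"(?P<DE43_POSTCODE>.+?) *(?P<DE43_STATE>.{3})(?P<DE43_COUNTRY>.{3})$"
)


def make_de43(name, address, suburb, postcode, state, country):
    """Build a hypothetical DE43 value (assumed layout: postcode padded to 10, state to 3)."""
    tail = postcode.ljust(10) + state.ljust(3) + country
    return "\\".join([name, address, suburb, tail])


def parse(pattern, value):
    """Return the named groups as a dict, or None when the pattern does not match."""
    match = pattern.match(value)
    return match.groupdict() if match else None


# Made-up sample values: a UK postcode with an embedded space, a 3-character Irish code.
uk = make_de43("ACME SHOP", "1 HIGH STREET", "LONDON", "SW1A 1AA", "", "GBR")
ie = make_de43("ACME SHOP", "1 MAIN STREET", "DUBLIN", "D02", "", "IRL")

for label, value in (("UK", uk), ("IE", ie)):
    print(label, "current :", parse(CURRENT, value))
    print(label, "proposed:", parse(PROPOSED, value))
```

With `re.match` and these samples, the three-character Irish value should not match the current pattern at all, and the UK value is not captured as a single postcode. The proposed pattern should return `D02` and `SW1A 1AA` respectively.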

adelosa commented 1 year ago

Thanks for reporting this. I have not looked deeply into international postcode formats, so this seems like a reasonable approach. I can make some updates if you can provide confirmation on the below...

adelosa commented 1 year ago

The final regex I have come up with is:

(?P<DE43_NAME>.+?) *\\(?P<DE43_ADDRESS>.+?) *\\(?P<DE43_SUBURB>.+?) *\\(?P<DE43_POSTCODE>.{10})(?P<DE43_STATE>.{3})(?P<DE43_COUNTRY>\S{3})$
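
As a rough sketch of how this fixed-width version behaves, the snippet below applies it to a made-up DE43 value. The `parse_de43` helper and the sample data are hypothetical, and stripping the padding afterwards is my assumption about what a caller might want, not a description of what cardutil does.

```python
import re

# Final DE43 pattern from this thread, written as one raw string.
FINAL = re.compile(
    r"(?P<DE43_NAME>.+?) *\\(?P<DE43_ADDRESS>.+?) *\\(?P<DE43_SUBURB>.+?) *\\"
    r"(?P<DE43_POSTCODE>.{10})(?P<DE43_STATE>.{3})(?P<DE43_COUNTRY>\S{3})$"
)


def parse_de43(value):
    """Hypothetical helper: match the value and strip padding from each field."""
    match = FINAL.match(value)
    if match is None:
        return None
    return {key: text.strip() for key, text in match.groupdict().items()}


# Made-up sample: postcode padded to 10 characters, blank 3-character state.
sample = "ACME SHOP\\1 HIGH STREET\\LONDON\\" + "SW1A 1AA".ljust(10) + " " * 3 + "GBR"
fields = parse_de43(sample)
print(fields)

# An unpadded postcode does not line up with the fixed-width groups.
unpadded = parse_de43("ACME SHOP\\1 HIGH STREET\\LONDON\\SW1A 1AAGBR")
print(unpadded)
```

Because `DE43_POSTCODE` now captures exactly 10 characters, the raw group keeps any trailing spaces, and a value whose tail is not exactly 16 characters long is not expected to match.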

margaretli27 commented 1 year ago

That looks reasonable to me!

adelosa commented 1 year ago

Great. I'll release an update with this change over the weekend.

adelosa commented 1 year ago

Updated in a1cbf0ee and released as v0.6.1.

margaretli27 commented 1 year ago

Wonderful, thanks!