google / libaddressinput

Google’s postal address library, powering Android and Chromium
Apache License 2.0

Dutch (NL) postal code regex matches invalid values #208

Open roelofr opened 3 years ago

roelofr commented 3 years ago

Hello there,

Following a discussion at work, I looked into a package used for postal code validation (axlon/postal-code-validation), which seeds its data from the Google Address Data Service (as pointed out in axlon/postal-code-validation#27).

After a short discussion, @wotta pointed out that the regex used is too permissive. The regex (\d{4} ?[A-Z]{2}) allows the following invalid postal codes:

A better regex for Dutch postal codes would be [1-9]\d{3} ?(?!SA|SD|SS)[A-Z]{2}, which explicitly excludes postal codes with a leading zero and the invalid suffixes SA, SD and SS.

I made some tests on regex101.com (click on "Unit Tests" in the left sidebar).

bojanz commented 1 year ago

Note to Go implementors: [1-9]\d{3} ?(?!SA|SD|SS)[A-Z]{2} uses syntax that Go's regexp package does not support. regexp.Compile gives: Invalid or unsupported Perl syntax: (?!.
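One possible workaround, sketched here rather than taken from the library, is to express the excluded suffixes with character classes so that no lookahead is needed. The name isValidNLPostalCode and the ^...$ anchors are assumptions added for illustration; the pattern proposed above is unanchored.

```go
package main

import (
	"fmt"
	"regexp"
)

// nlPostalCode is a hypothetical, lookahead-free rewrite of
// [1-9]\d{3} ?(?!SA|SD|SS)[A-Z]{2}.
// The suffix is either a non-S letter followed by any letter,
// or S followed by any letter except A, D and S.
// The ^ and $ anchors are an assumption added for whole-string matching.
var nlPostalCode = regexp.MustCompile(`^[1-9]\d{3} ?(?:[A-RT-Z][A-Z]|S[BCE-RT-Z])$`)

// isValidNLPostalCode reports whether s matches the rewritten pattern.
func isValidNLPostalCode(s string) bool {
	return nlPostalCode.MatchString(s)
}

func main() {
	for _, code := range []string{"1234 AB", "1234AB", "0123 AB", "1234 SS", "1234 SB"} {
		fmt.Printf("%q valid: %t\n", code, isValidNLPostalCode(code))
	}
}
```

The character-class form stays a single pattern, which matters if the value has to be stored as one regex string, but it becomes hard to read if more suffixes are ever excluded.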

roubert commented 1 year ago

(?! is the Perl syntax for a zero-width negative lookahead assertion.
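Since a negative lookahead only rejects a match without consuming input, its effect can be approximated in Go with a plain match followed by a separate check on the captured suffix. This is a minimal sketch under that assumption; matchesWithoutLookahead, basePattern and excludedSuffixes are hypothetical names, and the anchors are added for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// basePattern is the proposed pattern with the (?!SA|SD|SS) lookahead removed.
// The two-letter suffix is captured so it can be checked afterwards.
// The ^ and $ anchors are an assumption added for whole-string matching.
var basePattern = regexp.MustCompile(`^[1-9]\d{3} ?([A-Z]{2})$`)

// excludedSuffixes lists the suffixes the lookahead would have rejected.
var excludedSuffixes = map[string]bool{"SA": true, "SD": true, "SS": true}

// matchesWithoutLookahead emulates the lookahead in two steps:
// first match the base pattern, then reject excluded suffixes.
func matchesWithoutLookahead(s string) bool {
	m := basePattern.FindStringSubmatch(s)
	if m == nil {
		return false
	}
	return !excludedSuffixes[m[1]]
}

func main() {
	for _, code := range []string{"1234 AB", "1234 SD", "0999 ZZ"} {
		fmt.Printf("%q valid: %t\n", code, matchesWithoutLookahead(code))
	}
}
```

This keeps the regex close to the original and puts the exclusions in ordinary code, at the cost of no longer being a single pattern.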