personnummer / java

Validate Swedish personal identity numbers
MIT License

False positives. Not validating the - and + signs #15

Closed ullenius closed 5 years ago

ullenius commented 5 years ago

The program disregards the - and + separators in personal identity numbers (personnummer). The separator is used to determine which century the person was born in: it changes from a dash (-) to a plus sign (+) when the person turns 100 years old.

For example:

19130401-2931 returns true
19130401+2931 returns true

Only the latter should return true.
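A minimal sketch of the check being requested here, not code from this repository. The class and method names are hypothetical. It assumes the separator switches to + in the calendar year the person turns 100, and it only compares the separator with the four-digit year (no date or checksum validation).

```java
import java.time.LocalDate;

// Hypothetical helper, not part of the personnummer library.
final class SeparatorCheck {
    private SeparatorCheck() {}

    /**
     * Returns true if the separator in a 13-character input
     * (YYYYMMDD-XXXX or YYYYMMDD+XXXX) agrees with the four-digit birth year.
     * Assumption: '+' applies from the calendar year the person turns 100.
     */
    static boolean separatorMatchesYear(String input, LocalDate today) {
        if (input == null || !input.matches("\\d{8}[-+]\\d{4}")) {
            return false; // this sketch only handles the long form with a separator
        }
        int birthYear = Integer.parseInt(input.substring(0, 4));
        char separator = input.charAt(8);
        boolean centenarian = today.getYear() - birthYear >= 100;
        return centenarian ? separator == '+' : separator == '-';
    }
}
```

With a reference date in 2019, `separatorMatchesYear("19130401+2931", today)` is true and `separatorMatchesYear("19130401-2931", today)` is false, which is the behaviour asked for above.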

References: Personnummer och samordningsnummer (Skatteverket) - in Swedish

Johannestegner commented 5 years ago

Hi and thank you for the issue report!

Personnummer and samordningsnummer have the format YYMMDD-XXXX or YYMMDD+XXXX, so once you include the millennium and century digits it is no longer a valid number.
Does it give the same result if you use 130401-2931 and 130401+2931?

Not really sure how we should handle the case though... @frozzare any input?

ullenius commented 5 years ago

> Does it give the same result if you use 130401-2931 and 130401+2931?

Yes. Both return true.

Found some code in another repo that seems to address the issue (it uses the same licence, so it can be used here as well). Haven't tested it, though.

Johannestegner commented 5 years ago

About to put the kids to bed, but will take a look at this after talking to @frozzare about possible solutions asap :)

Johannestegner commented 5 years ago

Hi! We have had a long discussion about this now, hehe... What we have concluded is the following:

As of right now, passing 13 characters as input (MCDYMMDD+/-XXXX) is considered so-called undefined behaviour, as it is not really a valid social security number.
We are still discussing how it should behave when this is done, as we would rather have all the packages behave the same way in these kinds of cases (discussion in https://github.com/personnummer/meta/issues/18 and soon another issue).

The good thing, though, is that both numbers will always be equally valid when it comes to the Luhn algorithm, as we always remove the millennium and century digits before running it either way.
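To illustrate why the separator cannot affect the checksum, here is a small sketch under stated assumptions. The class and method names are hypothetical and this is not the library's actual code. It strips the optional century digits and the separator, then runs the Luhn check on the remaining ten digits.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the library's implementation.
final class LuhnSketch {
    // Optional century digits, six date digits, optional separator, four final digits.
    private static final Pattern FORMAT =
            Pattern.compile("^(?:\\d{2})?(\\d{6})[-+]?(\\d{4})$");

    private LuhnSketch() {}

    /** Drops the century digits and the separator, leaving YYMMDDXXXX. */
    static String toTenDigits(String input) {
        Matcher m = FORMAT.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised format: " + input);
        }
        return m.group(1) + m.group(2);
    }

    /** Luhn check over ten digits: weights 2,1,2,1,... with digit sums. */
    static boolean luhnValid(String tenDigits) {
        int sum = 0;
        for (int i = 0; i < tenDigits.length(); i++) {
            int d = tenDigits.charAt(i) - '0';
            if (i % 2 == 0) {
                d *= 2;
                if (d > 9) {
                    d -= 9; // same as adding the two digits of the product
                }
            }
            sum += d;
        }
        return sum % 10 == 0;
    }
}
```

Both `19130401-2931` and `19130401+2931` reduce to `1304012931`, so the Luhn result is identical for the two inputs.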

The undefined behaviour lies in whether the parser trusts the +/- separator or the four-digit year when both are passed. That is, 19130401-2931 could be parsed as either 1913 or 2013 (which does not matter for validity either way).
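The two readings can be sketched side by side. This is an illustration only, with hypothetical names; the separator-based rule (the most recent year with matching last two digits, minus 100 for a plus sign) is an assumption about how a parser might resolve the century, not a description of what this package does.

```java
import java.time.LocalDate;

// Hypothetical illustration of the two competing interpretations.
final class CenturyAmbiguity {
    private CenturyAmbiguity() {}

    private static void requireLongForm(String input) {
        if (input == null || !input.matches("\\d{8}[-+]\\d{4}")) {
            throw new IllegalArgumentException("Expected YYYYMMDD-XXXX or YYYYMMDD+XXXX");
        }
    }

    /** Interpretation 1: trust the four leading digits, ignore the separator. */
    static int yearFromDigits(String input) {
        requireLongForm(input);
        return Integer.parseInt(input.substring(0, 4));
    }

    /** Interpretation 2: ignore the century digits, derive the century from the separator. */
    static int yearFromSeparator(String input, LocalDate today) {
        requireLongForm(input);
        int twoDigitYear = Integer.parseInt(input.substring(2, 4));
        int currentYear = today.getYear();
        // Most recent year with these last two digits that is not in the future.
        int year = currentYear - (currentYear % 100) + twoDigitYear;
        if (year > currentYear) {
            year -= 100;
        }
        // Assumption: '+' shifts the birth year back one more century.
        if (input.charAt(8) == '+') {
            year -= 100;
        }
        return year;
    }
}
```

For `19130401-2931` with a reference date in 2019, `yearFromDigits` gives 1913 while `yearFromSeparator` gives 2013, which is the mismatch described above.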

I will close this issue for now and reference it in the nasal demon discussion. :)