bchavez / Bogus

:card_index: A simple fake data generator for C#, F#, and VB.NET. Based on and ported from the famed faker.js.
Other
8.66k stars 495 forks source link

Bogus generates invalid Finnish Henkilötunnus #392

Closed niklashaetty closed 8 months ago

niklashaetty commented 3 years ago

Version Information

Software Version(s)
Bogus NuGet Package 33.1.1
.NET Core? 3.1
.NET Full Framework?
Windows OS? N/A
Linux OS? N/A
Visual Studio? N/A

What locale are you using with Bogus?

Default, en

What is the expected behavior?

API Extension method Bogus.Person.Henkilötunnus() to return a valid Henkilötunnus.

What is the actual behavior?

I believe the algorithm for calculating the control digit is wrong.

Please provide a stack trace.

N/A

Any possible solutions?

Implement the correct algorithm for the control digit: https://en.wikipedia.org/wiki/National_identification_number#Finland

How do you reproduce the issue?

   var person = new Faker<Person>()
                .RuleFor(p => p.DateOfBirth, f => DateTime.Parse("1989-07-14"))
                .CustomInstantiator(f => new Person())
                .Generate();

            var hetu = person.Henkilötunnus();

will generate a control digit of I (upper case i) which is not an allowed character

Do you have a unit test that can demonstrate the bug?

        [Fact]
        public void ShouldProduceValidHetu()
        {
            var person = new Person();
            var hetu = person.Henkilötunnus();
            var expectedControlDigit = CalculateControlDigit(hetu);
            var actualControlDigit = hetu[^1];
            Assert.Equal(expectedControlDigit, actualControlDigit);
        }

        private char CalculateControlDigit(string hetu)
        {
            const string allowedCharacters = "0123456789ABCDEFHJKLMNPRSTUVWXY";
            var removedDelimiters = hetu
                .Replace("-", string.Empty)
                .Replace("+", string.Empty)
                .Replace("A", string.Empty);

            var ddMMyyzzz =  
                removedDelimiters
                .Remove(removedDelimiters.Length - 1);
            var n = int.Parse(ddMMyyzzz) % 31;
            return allowedCharacters[n];
        }

Can you identify the location in Bogus' source code where the problem exists?

Bogus.Extensions.Finland.Henkilötunnus()

         var ddMMyy = $"{p.DateOfBirth:ddMMyy}";

         var n = int.Parse(ddMMyy) % 31;

         var q = n.ToString();
         if( n >= 10 )
            q = ((char)('A' + (n - 10))).ToString();

         //no idea if its female or male.
         var zzz = r.Replace("###");

zzz is created after the control digit has been set, which is wrong because the control digit depends on the zzz value.

If the bug is confirmed, would you be willing to submit a PR?

Yes, I wrote an implementation i think works above.

bchavez commented 3 years ago

Hi @niklashaetty , yes, this looks like a bug. At least according to:

https://fi.wikipedia.org/wiki/Henkil%C3%B6tunnus

it looks like the check character needs to use this table for a lookup value. Can you submit a PR with only the necessary changes to fix the issue?

If not, I can try to get to it this weekend.

Thanks, Brian

niklashaetty commented 3 years ago

@bchavez Yup, i fixed it on a local branch. Not allowed to push it to remote though. Do you have any restrictions in place?

bchavez commented 3 years ago

@niklashaetty nice. glad you have a fix. correct, this repo is protected (by default, on GitHub).

to contribute on GitHub:

and that should do it.