arthurdejong / python-stdnum

A Python library to provide functions to handle, parse and validate standard numbers.
https://arthurdejong.org/python-stdnum/
GNU Lesser General Public License v2.1

Add Kenya TIN number #300

Closed unho closed 2 years ago

unho commented 2 years ago

It is called PIN (Personal Identification Number) and is issued by the KRA (Kenya Revenue Authority).

Online checker: https://itax.kra.go.ke/KRA-Portal/pinChecker.htm

Validation code:

Examples:

arthurdejong commented 2 years ago

There appears to be some kind of check digit algorithm used for calculating the last letter.

An increase in the last digit by one increases the check digit by one (with a wrap-around):

P000609340W P000609341X P000609342Y P000609344A P000609345B P000609346C P000609348E

I haven't found any examples to the contrary in the set of numbers provided.
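The wrap-around observation can be checked mechanically against the numbers listed above. A minimal sketch, assuming letters are mapped to 0-25 with A=0; the helper name `letter_minus_digit` is made up for illustration and is not part of python-stdnum:

```python
# Samples quoted above: only the last digit and the check letter differ.
SAMPLES = [
    "P000609340W", "P000609341X", "P000609342Y", "P000609344A",
    "P000609345B", "P000609346C", "P000609348E",
]


def letter_minus_digit(number, position=-2):
    """Return (check letter - digit at position) modulo 26, with A=0."""
    letter = ord(number[-1]) - ord("A")
    return (letter - int(number[position])) % 26


# If each step of the last digit moves the letter by one (with wrap-around),
# every sample yields the same offset.
offsets = {letter_minus_digit(n) for n in SAMPLES}
print(offsets)
```

A single value in `offsets` means the samples are consistent with the "plus one per step" observation; it says nothing about numbers outside this list.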

However, the influence of the second to last digit is a bit more mysterious:

P000609300O P000609320S P000609330U P000609340W P000609350A P000609360C P000609370E P000609380G P000609390I

The check letter advances by 2 for each increment of this digit, except that going from 4 to 5 it jumps from W to A instead of to Y. Another example:

P000609601S P000609621W P000609631Y P000609641A P000609651E P000609661G P000609671I P000609681K P000609691M

Again the change from 4 to 5 goes from A to E instead of the expected C.
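To make the irregularity easier to see, the sketch below subtracts twice the digit from the check letter (A=0, modulo 26) for the two runs quoted above. This is only a way of tabulating those samples, not a known KRA algorithm, and the names `residual` and `tabulate` are hypothetical:

```python
# The two runs quoted above; only the second-to-last digit and the letter vary.
FIRST = [
    "P000609300O", "P000609320S", "P000609330U", "P000609340W", "P000609350A",
    "P000609360C", "P000609370E", "P000609380G", "P000609390I",
]
SECOND = [
    "P000609601S", "P000609621W", "P000609631Y", "P000609641A", "P000609651E",
    "P000609661G", "P000609671I", "P000609681K", "P000609691M",
]


def residual(number, position=-3):
    """Return (check letter - 2 * digit at position) modulo 26, with A=0."""
    letter = ord(number[-1]) - ord("A")
    return (letter - 2 * int(number[position])) % 26


def tabulate(samples, position=-3):
    """Map each value of the varying digit to its residual."""
    return {int(n[position]): residual(n, position) for n in samples}


for run in (FIRST, SECOND):
    print(tabulate(run))
```

In both runs the residual is the same for digits 0-4 and exactly 2 higher for digits 5-9, which is the jump between 4 and 5 described above.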

The third digit from the right seems to affect the check digit in the same way as the last digit:

P051126154W P051126254X P051126354Y P051126454Z P051126554A P051126654B

4th digit has the strange *2 again:

P051092067Y P051093067A P051094067C P051095067G P051096067I P051097067K P051099067O

5th digit seems regular again:

P051902053D P051912053E P051922053F P051932053G

6th digit strange again:

P051399934S P051599934Y P051699934A P051799934C P051899934E

7th digit seems regular again:

A001395263W A002395263X A003395263Y A004395263Z A005395263A

The remaining two (leftmost) digits are pretty difficult to check because I couldn't easily find consecutive numbers (but it is probably safe to assume the pattern repeats). I'm pretty sure it must be possible to construct an algorithm from the above, but I haven't found the right numbers yet.

Btw, there appear to be a couple of numbers that do not validate against the online checker (e.g. A005109050I, A051320913J, P000610622O, P051099232O, P051550157P and P057170514E).

unho commented 2 years ago

I haven't been able to find any data on this. All sources point to using the online checker: https://itax.kra.go.ke/KRA-Portal/pinChecker.htm

arthurdejong commented 2 years ago

I've also been using that online checker (I've been training my math skills ;) ).

I've found a counter-example that more or less makes the existence of an algorithm that is valid for all numbers unlikely:

P051124235M
P051124236N
P051124237Z <- would expect 7O here
P051124238P
P051124239Q
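The same kind of offset check can flag such a counter-example automatically. A rough sketch, assuming A=0 and that the last digit adds one per step; `offset` is a made-up helper name, not something from python-stdnum:

```python
from collections import Counter

# The run quoted above: only the last digit and the check letter change.
RUN = ["P051124235M", "P051124236N", "P051124237Z", "P051124238P", "P051124239Q"]


def offset(number):
    """Return (check letter - last digit) modulo 26, with A=0."""
    return (ord(number[-1]) - ord("A") - int(number[-2])) % 26


# Take the most frequent offset as the "expected" one and list the deviations.
common, _ = Counter(offset(n) for n in RUN).most_common(1)[0]
outliers = [n for n in RUN if offset(n) != common]
print(outliers)
```

Only the number ending in 7Z deviates from the offset shared by the other four.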

So for now, I'm giving up on finding an actual check digit algorithm. While there probably is one, I expect that a sufficiently large number of numbers were not generated according to it.

unho commented 2 years ago

That's sad. But after all I was not expecting there to be a check character, since there was no reference to one, and some countries just assign a sequential number without a check character to catch typos.

I am happy that at least we can validate the format.
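For reference, a format-only check could look roughly like the sketch below. The pattern (a leading A or P, nine digits, one trailing letter) is inferred purely from the samples in this thread and is an assumption, and `is_valid_format` is a hypothetical helper rather than the python-stdnum API:

```python
import re

# Assumed shape, based only on the samples above: A or P, 9 digits, 1 letter.
_PIN_RE = re.compile(r"[AP][0-9]{9}[A-Z]")


def is_valid_format(number):
    """Return True if the number has the assumed PIN shape (no check letter test)."""
    candidate = number.strip().upper()
    return _PIN_RE.fullmatch(candidate) is not None


print(is_valid_format("P051124237Z"))
print(is_valid_format("P05112423Z"))
```

This deliberately ignores the last letter's value, since no reliable check digit algorithm was found.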