arthurdejong / python-stdnum

A Python library to provide functions to handle, parse and validate standard numbers.
https://arthurdejong.org/python-stdnum/
GNU Lesser General Public License v2.1

test failures after 2038 #431

Open bmwiedemann opened 3 months ago

bmwiedemann commented 3 months ago

While working on reproducible builds for openSUSE, I found that our python-stdnum-1.20 package sometimes fails its tests and it might be related to the date the tests are run.

The bad build-start dates (in UTC) observed so far were 2038-04-12T03:26:46, 2039-10-18T00:10:26 and 2040-04-24T18:41:17.

The test output was:

=================================== FAILURES ===================================
___________________ [doctest] test_no_fodselsnummer.doctest ____________________
085 InvalidComponent: The number does not contain valid birth date information.
086 >>> fodselsnummer.get_birth_date('45014054018')
087 Traceback (most recent call last):
088   ...
089 InvalidComponent: The birthdate century cannot be determined
090 >>> fodselsnummer.get_birth_date('82314251342')
091 Traceback (most recent call last):
092   ...
093 InvalidComponent: This number is an FH-number, and does not contain birth date information by design.
094 >>> fodselsnummer.validate('18103970861')
Expected:
    Traceback (most recent call last):
      ...
    InvalidComponent: The birth date information is valid, but this person has not been born yet.
Got:
    '18103970861'

/home/abuild/rpmbuild/BUILD/python-stdnum-1.20/tests/test_no_fodselsnummer.doctest:94: DocTestFailure
____________________ [doctest] test_se_personnummer.doctest ____________________
059 
060 >>> personnummer.get_birth_date('8803200420')
061 datetime.date(1988, 3, 20)
062 >>> personnummer.get_birth_date('191705120424')
063 datetime.date(1917, 5, 12)
064 >>> personnummer.get_birth_date('121212-1212')
065 datetime.date(2012, 12, 12)
066 >>> personnummer.get_birth_date('121212+1212')     
067 datetime.date(1912, 12, 12)
068 >>> personnummer.get_birth_date('400606+5827')   
Expected:
    datetime.date(1840, 6, 6)
Got:
    datetime.date(1940, 6, 6)

/home/abuild/rpmbuild/BUILD/python-stdnum-1.20/tests/test_se_personnummer.doctest:68: DocTestFailure

---------- coverage: platform linux, python 3.9.18-final-0 -----------
Name                         Stmts   Miss Branch BrPart  Cover   Missing
------------------------------------------------------------------------
stdnum/no/fodselsnummer.py      63      1     30      1    98%   126
stdnum/se/personnummer.py       51      2     14      0    97%   89-90
------------------------------------------------------------------------
TOTAL                         7326      3   2527      1    99%
317 files skipped due to complete coverage.
Coverage HTML written to dir coverage

FAIL Required test coverage of 100.0% not reached. Total coverage: 99.96%
=========================== short test summary info ============================
FAILED tests/test_no_fodselsnummer.doctest::test_no_fodselsnummer.doctest
FAILED tests/test_se_personnummer.doctest::test_se_personnummer.doctest
=================== 2 failed, 372 passed, 9 skipped in 7.85s ===================
arthurdejong commented 2 months ago

Yes, this is a known issue. These numbers are strongly time-bound (what is considered a valid number today might not be 10 years from now because the laws may have changed), and there is some ambiguity in their definition (we try to reconstruct a birth date, but because often only 2 digits are stored for the year, we could be 100 years off). Together, this means that some tests are time-bound.

bmwiedemann commented 2 months ago

Could the tests be updated to state some truths that remain true? E.g. "in the year 2024, a birth year of 98 expands to 1998"
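To illustrate what such a time-independent statement could look like, here is a minimal sketch. The function `expand_two_digit_year` and its rule (pick the latest year that is not after the reference year) are hypothetical and are not python-stdnum's actual logic, which differs per country and number format. The point is only that the reference year is passed in explicitly, so the result does not depend on when the test runs.

```python
def expand_two_digit_year(yy, reference_year):
    """Return the latest year <= reference_year that ends in the digits yy.

    Hypothetical helper for illustration; not part of python-stdnum.
    """
    if not 0 <= yy <= 99:
        raise ValueError('yy must be between 0 and 99')
    # Start in the century of the reference year ...
    year = reference_year - reference_year % 100 + yy
    # ... and step back a century if that would lie in the future.
    if year > reference_year:
        year -= 100
    return year
```

With this shape, "in the year 2024, a birth year of 98 expands to 1998" becomes `expand_two_digit_year(98, 2024) == 1998`, which stays true regardless of the build date.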

arthurdejong commented 1 week ago

Your best bet is probably to run the tests with a fixed date/time by mocking the system time. I want to avoid that in the development branch because that means we will likely miss these kinds of changes.

If there is a way this can be made easier in python-stdnum (e.g. some environment variable that ensures the time is set to a particular date or something), please let me know. Even better would be to provide a fix 😉 because I sadly only have very limited time at the moment.
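For packagers who want to try the mocking route, a rough sketch using only the standard library follows. The function `has_been_born` is a hypothetical stand-in for any check that calls `datetime.date.today()`; it is not python-stdnum code, and the pinned date 2024-01-01 is an arbitrary assumption. The approach replaces `datetime.date` with a subclass whose `today()` is fixed, which only affects code that looks up `datetime.date` through the module at call time.

```python
import datetime
from unittest import mock


def has_been_born(birth_date):
    """Hypothetical stand-in for a check that depends on the current date."""
    return birth_date <= datetime.date.today()


class FixedDate(datetime.date):
    """A date class whose today() always returns 2024-01-01 (assumed value)."""

    @classmethod
    def today(cls):
        return cls(2024, 1, 1)


def check_with_fixed_date(birth_date):
    """Run has_been_born() as if the current date were 2024-01-01."""
    # Swap datetime.date for the duration of the call only; the original
    # class is restored when the with block exits.
    with mock.patch('datetime.date', FixedDate):
        return has_been_born(birth_date)
```

Code that does `from datetime import date` holds its own reference to the class, so it would have to be patched in the importing module instead. Whether that is needed for a given python-stdnum module would have to be checked against its source.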