openfisca / openfisca-france-data

France-Data module of OpenFisca
http://www.openfisca.fr/
GNU Affero General Public License v3.0

Missing birth date for one individual in ERFS_FPR 2009 and 2010 causes build to crash #184

Open elie-gerschel opened 4 years ago

elie-gerschel commented 4 years ago

The issue appeared when building either ERFS_FPR 2010 or ERFS_FPR 2009.

Here is what I did:

from openfisca_france_data import france_data_tax_benefit_system
from openfisca_france_data.erfs_fpr.get_survey_scenario import get_survey_scenario

tax_benefit_system = france_data_tax_benefit_system

survey_scenario = get_survey_scenario(
    tax_benefit_system = tax_benefit_system,
    year = 2009,
    rebuild_input_data = True,
    )

Here is what actually happened:

  File "C:\Users\elieg\openfisca-france-data\openfisca_france_data\erfs_fpr\input_data_builder\step_03_variables_individuelles.py", line 141, in create_variables_individuelles
    year = year)
  File "C:\Users\elieg\openfisca-france-data\openfisca_france_data\erfs_fpr\input_data_builder\step_03_variables_individuelles.py", line 942, in create_date_naissance
    'day': day_birth,
  File "C:\Users\elieg\Anaconda3\envs\ipp\lib\site-packages\pandas\util\_decorators.py", line 208, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\elieg\Anaconda3\envs\ipp\lib\site-packages\pandas\core\tools\datetimes.py", line 781, in to_datetime
    result = _assemble_from_unit_mappings(arg, errors, box, tz)
  File "C:\Users\elieg\Anaconda3\envs\ipp\lib\site-packages\pandas\core\tools\datetimes.py", line 906, in _assemble_from_unit_mappings
    raise ValueError("cannot assemble the " "datetimes: {error}".format(error=e))
ValueError: cannot assemble the datetimes: time data '608' does not match format '%Y%m%d' (match)

Data values

Attached are the data values for the two individuals (one per year) that I think cause the issue: their year of birth (variable "naia") is 0.

problem_guy_2009.xlsx problem_guy_2010.xlsx
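
For anyone who wants to reproduce the error without the ERFS_FPR files, here is a minimal sketch. It assumes the date is assembled by passing year, month and day columns to pd.to_datetime, as the traceback suggests. Only the year 0 comes from the attached data; month 6 and day 8 are hypothetical values, picked as one combination consistent with the '608' in the message (0 * 10000 + 6 * 100 + 8).

```python
import pandas as pd

# Year 0 is the reported "naia" value. Month 6 and day 8 are assumptions:
# they are one combination that yields the integer 608 seen in the error.
components = pd.DataFrame({"year": [0], "month": [6], "day": [8]})

try:
    # pandas combines the three columns into a single YYYYMMDD value and
    # parses it with the format '%Y%m%d'; 608 cannot be parsed that way.
    pd.to_datetime(components)
    error_message = None
except ValueError as error:
    error_message = str(error)

print(error_message)
```

With a plausible birth year in place of 0 (1975, say), the same call returns a normal timestamp, which would point to the birth year as the trigger.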

Context

I identify more as an Economist (I make microsimulations with real populations).

benjello commented 4 years ago

You should add a check on naia values using an assert and/or explicitly fix the erroneous values upstream.
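
A possible shape for such a check, as a sketch only: the helper names find_invalid_naia and assert_valid_naia, the max_age bound of 120 years and the toy data are all assumptions, not code from openfisca-france-data. Only the column name "naia" comes from the issue.

```python
import pandas as pd


def find_invalid_naia(individus, year, max_age=120):
    """Return the rows whose birth year 'naia' is missing or implausible.

    Hypothetical helper: a birth year is treated as plausible when it lies
    between (year - max_age) and the survey year, bounds included.
    """
    naia = pd.to_numeric(individus["naia"], errors="coerce")
    valid = naia.between(year - max_age, year)  # NaN counts as not valid
    return individus.loc[~valid]


def assert_valid_naia(individus, year, max_age=120):
    """Fail early, with the offending values, before any date is assembled."""
    invalid = find_invalid_naia(individus, year, max_age)
    assert invalid.empty, "{} individual(s) with invalid naia: {}".format(
        len(invalid), invalid["naia"].tolist()
    )


# Toy data (made up): the second individual has naia == 0, as in the issue.
individus = pd.DataFrame({"naia": [1975, 0, 2003], "naim": [3, 6, 11]})
invalid = find_invalid_naia(individus, year=2009)
print(invalid)
```

Calling assert_valid_naia(individus, year=2009) on the toy data raises an AssertionError that names the bad value, instead of the less explicit pandas error. How the erroneous values should then be fixed upstream is a separate decision that this sketch does not make.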

benjello commented 4 years ago

@elie-gerschel: I am assigning this issue to you. Get back to me if you can't solve it yourself.