anthill / open-moulinette

Scripts to clean Open-Data.
MIT License

Insee 2016 #58

Closed: garaud closed this issue 6 years ago

garaud commented 6 years ago

I'm having some trouble with mk_data.py.

Traceback (most recent call last):
  File "mk_data.py", line 226, in <module>
    compare_geo(data, revenu_uc, debug=True)
  File "open-moulinette/insee/comparison.py", line 56, in compare_geo
    compare_var(tab1, tab2, var)
  File "open-moulinette/insee/comparison.py", line 32, in compare_var
    assert max(tab1[var].value_counts()) == 1
AssertionError

This happens when reading "les diplomes 2011" (the 2011 diplomas data).
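
For context, the failing assertion requires every value of var in tab1 to appear at most once, so it fails as soon as one key is repeated. A minimal sketch of how the repeated rows could be listed, assuming tab1 is a pandas DataFrame; the helper find_duplicated_keys, the column names and the sample values are all hypothetical and not taken from the repository:

```python
import pandas as pd


def find_duplicated_keys(table, var):
    """Return the rows whose value in column `var` appears more than once."""
    counts = table[var].value_counts()
    repeated = counts[counts > 1].index
    return table[table[var].isin(repeated)].sort_values(var)


# Hypothetical data: column names and codes are invented for illustration.
diplomes = pd.DataFrame({
    "code_geo": ["010010000", "010020000", "010020000", "010040101"],
    "nb_diplomes": [120, 45, 45, 300],
})

# Same check as the assertion in comparison.py, but reporting the offenders.
duplicates = find_duplicated_keys(diplomes, "code_geo")
print(duplicates)
```

Printing the offending rows instead of asserting makes it easier to see whether the repeats are exact duplicates or conflicting records.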

armgilles commented 6 years ago

Still a work in progress on your PR.

Some important points:

With these kinds of rules, we can aggregate features by Code Insee, Departement and Region without losing data.
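
A rough sketch of what such an aggregation could look like, assuming the features are additive counts that can simply be summed. The aggregate helper, the column names and the values are hypothetical, not the project's actual code or data:

```python
import pandas as pd

# Hypothetical fine-grained table; column names and values are invented.
iris = pd.DataFrame({
    "code_insee": ["01001", "01001", "01002", "69001"],
    "departement": ["01", "01", "01", "69"],
    "population": [100, 150, 80, 200],
})


def aggregate(table, level, features):
    """Sum count-type features at the given geographic level."""
    return table.groupby(level, as_index=False)[features].sum()


by_commune = aggregate(iris, "code_insee", ["population"])
by_departement = aggregate(iris, "departement", ["population"])

# The grand total is the same at every level, so summing loses nothing.
assert by_commune["population"].sum() == iris["population"].sum()
assert by_departement["population"].sum() == iris["population"].sum()
```

This only holds for additive features; ratios or medians would need a different rule.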

armgilles commented 6 years ago

You can pull my branch.

Looks good to me to merge.

vallettea commented 6 years ago

thanks guys !