InseeFrLab / pynsee

The pynsee package contains tools to easily search and download French data from the INSEE and IGN APIs
https://pynsee.readthedocs.io/en/latest/
MIT License

Population Map example #175

Open elGringo11 opened 10 months ago

elGringo11 commented 10 months ago

Hello, I tried to run the Population Map example and got a bunch of errors:

PS C:\Users\XXX\Desktop\INSEE_API> & "C:/Program Files/Python311/python.exe" c:/Users/XXX/Desktop/INSEE_API/carte.py
API query number limit reached - function might be slowed down

Thanks for your help.

hadrilec commented 10 months ago

Hello, thanks for your feedback. Could you please provide a reproducible example in this issue?

elGringo11 commented 10 months ago

Hello, is this what you expect? Thanks.


from pynsee.utils.init_conn import init_conn
init_conn(insee_key="XXX", insee_secret="XXX")

from pynsee.geodata import get_geodata_list, get_geodata, GeoFrDataFrame

import math
import geopandas as gpd
import pandas as pd
from pandas.api.types import CategoricalDtype
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import descartes

import warnings
from shapely.errors import ShapelyDeprecationWarning
warnings.filterwarnings("ignore", category=ShapelyDeprecationWarning)

import logging
import sys

logger = logging.getLogger()
logger.setLevel(logging.INFO)
formatter = logging.Formatter('[%(filename)s:%(lineno)s - %(funcName)20s() ] %(message)s')

file_handler = logging.FileHandler('mylogs.log')
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(formatter)

logger.addHandler(file_handler)

# get geographical data list
geodata_list = get_geodata_list()
# get communes geographical limits
com = get_geodata('ADMINEXPRESS-COG-CARTO.LATEST:commune')

mapcom = gpd.GeoDataFrame(com).set_crs("EPSG:3857")

mapcom = mapcom.to_crs(epsg=3035)
mapcom["area"] = mapcom['geometry'].area / 10**6
mapcom = mapcom.to_crs(epsg=3857)

mapcom['REF_AREA'] = 'D' + mapcom['insee_dep']
mapcom['density'] = mapcom['population'] / mapcom['area']

mapcom = GeoFrDataFrame(mapcom)
mapcom = mapcom.translate(departement = ['971', '972', '974', '973', '976'],
                          factor = [1.5, 1.5, 1.5, 0.35, 1.5])

mapcom = mapcom.zoom(departement = ["75","92", "93", "91", "77", "78", "95", "94"],
                 factor=1.5, startAngle = math.pi * (1 - 3 * 1/9))
mapcom

mapplot = gpd.GeoDataFrame(mapcom)
mapplot.loc[mapplot.density < 40, 'range'] = "< 40"
mapplot.loc[mapplot.density >= 20000, 'range'] = "> 20 000"

density_ranges = [40, 80, 100, 120, 150, 200, 250, 400, 600, 1000, 2000, 5000, 10000, 20000]
list_ranges = []
list_ranges.append( "< 40")

for i in range(len(density_ranges)-1):
    min_range = density_ranges[i]
    max_range = density_ranges[i+1]
    range_string = "[{}, {}[".format(min_range, max_range)
    mapplot.loc[(mapplot.density >= min_range) & (mapplot.density < max_range), 'range'] = range_string
    list_ranges.append(range_string)

list_ranges.append("> 20 000")

mapplot['range'] = mapplot['range'].astype(CategoricalDtype(categories=list_ranges, ordered=True))

fig, ax = plt.subplots(1,1,figsize=[15,15])
mapplot.plot(column='range', cmap=cm.viridis,
legend=True, ax=ax,
legend_kwds={'bbox_to_anchor': (1.1, 0.8),
             'title':'density per km2'})
ax.set_axis_off()
ax.set(title='Distribution of population in France')
plt.show()

fig.savefig('pop_france.svg',
            format='svg', dpi=1200,
            bbox_inches = 'tight',
            pad_inches = 0)
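
The labelling loop in the script above can be sanity-checked on plain numbers. Below is a minimal sketch, using only the standard library instead of pandas, of how a single density value maps to a legend label. `density_label` is a hypothetical helper that is not part of the script or of pynsee; the thresholds and label strings are copied from the script.

```python
from bisect import bisect_right

# Thresholds copied from density_ranges in the script (inhabitants per km2).
DENSITY_RANGES = [40, 80, 100, 120, 150, 200, 250, 400, 600,
                  1000, 2000, 5000, 10000, 20000]


def density_label(density):
    """Return the legend label for one density value (hypothetical helper)."""
    if density < DENSITY_RANGES[0]:
        return "< 40"
    if density >= DENSITY_RANGES[-1]:
        return "> 20 000"
    # Index of the largest threshold that is <= density, i.e. the lower bound
    # of the half-open interval [min_range, max_range[ used in the script.
    i = bisect_right(DENSITY_RANGES, density) - 1
    return "[{}, {}[".format(DENSITY_RANGES[i], DENSITY_RANGES[i + 1])


print(density_label(39.9))   # < 40
print(density_label(40))     # [40, 80[
print(density_label(19999))  # [10000, 20000[
print(density_label(20000))  # > 20 000
```

The lower bound is inclusive and the upper bound exclusive, which matches the `>= min_range` and `< max_range` conditions in the loop.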

hadrilec commented 10 months ago

OK thanks, I will have a look once I am back from holidays at the end of November. Did you manage to get the map of France? If the only warning you get is the "function might be slowed down" one, that is fine; otherwise it might be a bug.

elGringo11 commented 10 months ago

OK. I got a long list of errors. I picked this one, which may help you. Have a nice holiday.

RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
  File "<string>", line 1, in <module>
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 120, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 129, in _main
    prepare(preparation_data)
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 240, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 291, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "c:\Users\XXX\Desktop\INSEE_API\carte.py", line 38, in <module>
    com = get_geodata('ADMINEXPRESS-COG-CARTO.LATEST:commune')
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\AppData\Roaming\Python\Python311\site-packages\pynsee\geodata\get_geodata.py", line 31, in get_geodata
    df = _get_geodata(id=id, update=update, crs=crs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\AppData\Roaming\Python\Python311\site-packages\pynsee\geodata\_get_geodata.py", line 173, in _get_geodata
    with multiprocessing.Pool(
         ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\pool.py", line 215, in __init__
    self._repopulate_pool()
  File "C:\Program Files\Python311\Lib\multiprocessing\pool.py", line 306, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\pool.py", line 329, in _repopulate_pool_static
    w.start()
  File "C:\Program Files\Python311\Lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\context.py", line 336, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 158, in get_preparation_data
    _check_not_importing_main()
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 138, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
API query number limit reached - function might be slowed down
API query number limit reached - function might be slowed down
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 120, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 129, in _main
    prepare(preparation_data)
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 240, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 291, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "c:\Users\XXX\Desktop\INSEE_API\carte.py", line 38, in <module>
    com = get_geodata('ADMINEXPRESS-COG-CARTO.LATEST:commune')
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\AppData\Roaming\Python\Python311\site-packages\pynsee\geodata\get_geodata.py", line 31, in get_geodata
    df = _get_geodata(id=id, update=update, crs=crs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\AppData\Roaming\Python\Python311\site-packages\pynsee\geodata\_get_geodata.py", line 173, in _get_geodata
    with multiprocessing.Pool(
         ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\pool.py", line 215, in __init__
    self._repopulate_pool()
  File "C:\Program Files\Python311\Lib\multiprocessing\pool.py", line 306, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\pool.py", line 329, in _repopulate_pool_static
    w.start()
  File "C:\Program Files\Python311\Lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\context.py", line 336, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)

hadrilec commented 9 months ago

Hi, in PR #162 I made commit 68d4c0eed79f89256581bf1d0dda3f724fb56a7e. It should act as a backup in case the multiprocessing used to retrieve geodata fails. I hope we can merge the PR in the coming days, and that it will be the final fix for the issue you raised.
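
Until that fix is available, the idiom named in the traceback can be applied on the user side. What follows is a minimal sketch, not the pynsee implementation: `download_all` is a hypothetical stand-in for a library call that opens a `multiprocessing.Pool` internally (as `get_geodata` does according to the traceback), and `math.sqrt` is only placeholder work. In a script like `carte.py`, the call to `get_geodata` and everything after it would go inside `main()`.

```python
import math
import multiprocessing


def download_all(tile_ids):
    """Hypothetical stand-in for a library call that opens a process pool."""
    with multiprocessing.Pool(processes=2) as pool:
        # math.sqrt is placeholder work, not what pynsee actually runs.
        return pool.map(math.sqrt, tile_ids)


def main():
    # Anything that may start worker processes belongs here,
    # not at module level.
    results = download_all([1, 4, 9])
    print(results)
    return results


if __name__ == "__main__":
    # With the "spawn" start method (the default on Windows), each worker
    # re-imports this file. The guard stops workers from running main() again.
    multiprocessing.freeze_support()  # only needed for frozen executables
    main()
```

With the guard in place, the worker processes should import the file without trying to start a pool of their own, which is the situation the `RuntimeError` above describes.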

elGringo11 commented 5 months ago

Hi Hadrien, I am trying to get data from INSEE to reproduce this chart: Recettes du budget général | Insee (general budget revenue). Where can I find the IDBANK for these data? I am struggling to retrieve "recettes fiscales brutes" (gross tax revenue), ...

Thanks,
Laurent

hadrilec commented 5 months ago

Hello, you can browse this app to find INSEE series easily: https://graffiti.lab.sspcloud.fr/ Click on the tab "graphique à la demande / plot yourself". Otherwise, I guess it should be in the dataset COMPTES-ETAT.

elGringo11 commented 5 months ago

Hello, thanks. Unfortunately, these are the only series I got from COMPTES-ETAT. I cannot find "CHARGE D'INTERETS" (interest charges) or anything like that.

| DATASET | IDBANK | KEY | FREQ | INDICATEUR | NATURE | REF_AREA | UNIT_MEASURE | CORRECTION | FREQ_label_fr | FREQ_label_en | INDICATEUR_label_fr | INDICATEUR_label_en |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| COMPTES-ETAT | 001717258 | M.SCS.CUMUL_DEBUT_ANNEE.FE.EUROS.BRUT | M | SCS | CUMUL_DEBUT_ANNEE | FE | EUROS | BRUT | Mensuelle | Monthly | Solde des comptes spéciaux | Special accounts balance |
| COMPTES-ETAT | 001717257 | M.REC.CUMUL_DEBUT_ANNEE.FE.EUROS.BRUT | M | REC | CUMUL_DEBUT_ANNEE | FE | EUROS | BRUT | Mensuelle | Monthly | Recettes | Revenue |
| COMPTES-ETAT | 001717256 | M.DEP.CUMUL_DEBUT_ANNEE.FE.EUROS.BRUT | M | DEP | CUMUL_DEBUT_ANNEE | FE | EUROS | BRUT | Mensuelle | Monthly | Dépenses | Expenditure |
| COMPTES-ETAT | 001717255 | M.SGE.CUMUL_DEBUT_ANNEE.FE.EUROS.BRUT | M | SGE | CUMUL_DEBUT_ANNEE | FE | EUROS | BRUT | Mensuelle | Monthly | Solde général d'exécution | General budget balance |

Thanks,
Laurent
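
A keyword search over the label columns of such a listing can be sketched as follows. This is an illustration under stated assumptions: `fold` and `find_series` are hypothetical helpers, not pynsee functions, and the rows are the four series quoted in this comment, reduced to three columns. The search ignores case and accents, so an unaccented query such as "INTERETS" would still match a label containing "intérêts".

```python
import unicodedata

# The four COMPTES-ETAT series quoted in this thread (label columns only).
SERIES = [
    {"IDBANK": "001717258",
     "INDICATEUR_label_fr": "Solde des comptes spéciaux",
     "INDICATEUR_label_en": "Special accounts balance"},
    {"IDBANK": "001717257",
     "INDICATEUR_label_fr": "Recettes",
     "INDICATEUR_label_en": "Revenue"},
    {"IDBANK": "001717256",
     "INDICATEUR_label_fr": "Dépenses",
     "INDICATEUR_label_en": "Expenditure"},
    {"IDBANK": "001717255",
     "INDICATEUR_label_fr": "Solde général d'exécution",
     "INDICATEUR_label_en": "General budget balance"},
]


def fold(text):
    """Lower-case text and strip accents (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed
                   if not unicodedata.combining(c)).lower()


def find_series(rows, keyword,
                columns=("INDICATEUR_label_fr", "INDICATEUR_label_en")):
    """Return the IDBANK of every row whose label contains keyword."""
    needle = fold(keyword)
    return [row["IDBANK"] for row in rows
            if any(needle in fold(row.get(col, "")) for col in columns)]


print(find_series(SERIES, "depenses"))  # ['001717256']
print(find_series(SERIES, "INTERETS"))  # []
```

The empty result for "INTERETS" agrees with the observation that none of the four listed series concerns interest charges. A pandas DataFrame can be passed to the same helper after converting it with `df.to_dict("records")`.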

hadrilec commented 5 months ago

OK, maybe you can have a look at this: https://github.com/hadrilec/financial_market_report/blob/master/code/EU_gov_debt_interest.R Otherwise, you should send an email to INSEE asking for the correct IDBANK series.