e2nIEE / pandapower

Convenient Power System Modelling and Analysis based on PYPOWER and pandas
https://www.pandapower.org

[feature] clean up import #1917

Open SteffenMeinecke opened 1 year ago

SteffenMeinecke commented 1 year ago

Feature Checklist

1. How to import

Recommended ways of importing basic pandapower functionality, e.g. runpp(), should look like this (as is already the case):

# option 1
import pandapower as pp
pp.runpp()  # usage

# option 2
from pandapower import runpp
runpp()  # usage

Recommended ways of importing further pandapower functionality, e.g. from_ppc(), should look like this:

# option 1 (default)
import pandapower as pp
pp.converter.from_ppc()  # usage

# option 2
from pandapower import converter as cv
cv.from_ppc()  # usage

# option 3
from pandapower.converter import from_ppc
from_ppc()  # usage

Not recommended for usage outside the pandapower package itself (for functions not starting with _):

from pandapower.converter.pypower.from_ppc import from_ppc
from_ppc()  # usage

Reason: pandapower might move or rename the file containing from_ppc

Remark: pp.converter should first become available with https://github.com/e2nIEE/pandapower/pull/1916
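
The reason given above (files may move or be renamed) can be illustrated with a small sketch. It builds a throwaway package named demo_api_pkg, a hypothetical stand-in and not pandapower itself, whose converter/__init__.py re-exports from_ppc. The layout only mimics the paths mentioned in this issue; it is an assumption about how such a re-export could look, not a copy of pandapower's actual files.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (hypothetical stand-in for pandapower).
root = Path(tempfile.mkdtemp())
impl_dir = root / "demo_api_pkg" / "converter" / "pypower"
impl_dir.mkdir(parents=True)

# Top-level package exposes the converter subpackage.
(root / "demo_api_pkg" / "__init__.py").write_text("from . import converter\n")
# The subpackage re-exports the function: this is the stable, public path.
(root / "demo_api_pkg" / "converter" / "__init__.py").write_text(
    "from .pypower.from_ppc import from_ppc\n"
)
(impl_dir / "__init__.py").write_text("")
# The implementation file; its location is an internal detail.
(impl_dir / "from_ppc.py").write_text("def from_ppc():\n    return 'converted'\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

dp = importlib.import_module("demo_api_pkg")

# Public path: keeps working if from_ppc.py is moved, as long as the
# re-export in converter/__init__.py is updated accordingly.
public = dp.converter.from_ppc

# Deep path: breaks as soon as the file is moved or renamed.
deep = importlib.import_module("demo_api_pkg.converter.pypower.from_ppc").from_ppc

print(public(), public is deep)
```

Both names refer to the same function object; only the import path differs, which is why the shorter path can be kept stable across refactorings.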

TODOs:

2. Remove direct availability of toolbox functions and some others with import pandapower (pp.toolbox.fct() # is recommended)

This will break users' code that accesses the functions in question directly via import pandapower -> we should be sure that we really want that -> if yes, deprecation warnings are needed, see

TODOs in case we want that: 1st:

2nd:
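
One possible way to emit such deprecation warnings is a module-level `__getattr__` (PEP 562), sketched below. The modules pkg and toolbox are built in memory as hypothetical stand-ins for pandapower and pandapower.toolbox, and fct and the _MOVED mapping are illustrative names only. This is a sketch of the general technique, not the approach the maintainers have decided on.

```python
import types
import warnings

# Hypothetical stand-ins for pandapower and pandapower.toolbox.
toolbox = types.ModuleType("pkg.toolbox")
toolbox.fct = lambda: "result"

pkg = types.ModuleType("pkg")
pkg.toolbox = toolbox

# Names that used to be available top-level -> submodule that now owns them.
_MOVED = {"fct": "toolbox"}


def _deprecated_getattr(name):
    """Called only when normal attribute lookup on the module fails."""
    if name in _MOVED:
        submodule_name = _MOVED[name]
        warnings.warn(
            f"pkg.{name} is deprecated; use pkg.{submodule_name}.{name} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(getattr(pkg, submodule_name), name)
    raise AttributeError(f"module 'pkg' has no attribute {name!r}")


# In a real package this function would simply be named __getattr__
# inside the package's __init__.py.
pkg.__getattr__ = _deprecated_getattr

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = pkg.fct()          # still works, but warns
    new_style = pkg.toolbox.fct()  # recommended, no warning
```

With this pattern, existing user code keeps running for a transition period, and the fallback can be removed in a later release.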

vogt31337 commented 5 days ago

We should also discuss introducing a more modular approach. At the moment, if you import pandapower, nearly everything gets loaded directly (because of the many __init__.py files). I would like to see an approach where you need to import pandapower.timeseries, and only then are the timeseries imports loaded.
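
A minimal sketch of that behaviour, using a throwaway package called demo_pkg (a hypothetical stand-in, not pandapower): because its __init__.py imports nothing, the timeseries submodule is only loaded once it is imported explicitly.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (hypothetical stand-in for pandapower).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")  # deliberately imports no submodules
(pkg_dir / "timeseries.py").write_text(
    "def run():\n    return 'timeseries loaded'\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Importing the package alone does not load the submodule ...
importlib.import_module("demo_pkg")
loaded_after_pkg_import = "demo_pkg.timeseries" in sys.modules

# ... only the explicit submodule import does.
ts = importlib.import_module("demo_pkg.timeseries")
loaded_after_sub_import = "demo_pkg.timeseries" in sys.modules

print(loaded_after_pkg_import, loaded_after_sub_import, ts.run())
```

The trade-off is that attribute access such as demo_pkg.timeseries no longer works until the submodule has been imported somewhere.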

SteffenMeinecke commented 1 day ago

I would like to see an approach, where you need to import pandapower.timeseries and only then the timeseries imports are loaded.

I guess to decrease the import time? If one imports pandapower once at the beginning of a calculation run, it should be no big deal right now, or am I wrong?

KS-HTK commented 1 day ago

I guess to decrease the import time? If one imports pandapower once at the beginning of a calculation run, it should be no big deal right now, or am I wrong?

Yes, to decrease import time. Importing only the functions used should be a lot faster than importing all functions. The issue is that this will currently not improve load time, because a lot of functions within pandapower actually import the whole package. So even when importing only one function, it will most likely load all modules. This can be inspected with python -X importtime -c "<command>".

Example:

python -X importtime -c "from pandapower.plotting.geo import dump_to_geojson"

This will show the load time of each loaded package and module. It also shows that importing a single function as mentioned above will not actually prevent all modules from being loaded. (Maybe it would if they were removed from __init__.py; I have not tested that.) Currently it shows that the loading process for pandapower as a whole took 4137356μs (4.13s).
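
To rank modules by cumulative time, the output of -X importtime (written to stderr) can be parsed with a few lines of Python. The following is a minimal sketch; parse_importtime and profile_import are hypothetical helper names, and the regular expression assumes the usual "import time: self [us] | cumulative | imported package" line layout.

```python
import re
import subprocess
import sys

# Matches data lines such as "import time:       123 |        456 |   json.decoder".
# The header line contains no leading number and is therefore skipped.
_LINE = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\|\s*(.*)$")


def parse_importtime(text):
    """Return (self_us, cumulative_us, name) tuples, largest cumulative first."""
    rows = []
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            self_us, cumulative_us, name = match.groups()
            rows.append((int(self_us), int(cumulative_us), name.strip()))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def profile_import(statement, top=10):
    """Run `statement` in a fresh interpreter and return the `top` slowest imports."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", statement],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_importtime(result.stderr)[:top]
```

For example, profile_import("from pandapower.plotting.geo import dump_to_geojson", top=100) would return the 100 entries with the highest cumulative time, assuming pandapower is installed in the same environment.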

I would consider importing for 4.13s before even starting a calculation a big deal, but that is up to interpretation.

The slowest non-pandapower modules imported were pandas, scipy.stats and scipy.interpolate. pandas took 0.56s to load, which is significantly less than 4.13s. scipy should not be required for the dump_to_geojson function, yet scipy.stats loaded in 0.37s and scipy.interpolate in 0.35s. Avoiding the loading of modules that are not required is a good way to get the overall load time down, as all loading times are cumulative: the time for pandapower includes all of its dependencies, as they need to be loaded before pandapower can be loaded.
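
One common way to avoid paying for a dependency that only some functions need is to import it inside the function that uses it. The sketch below uses the standard-library statistics module as a stand-in for a heavy dependency such as scipy; dump_summary is a hypothetical function and not part of pandapower.

```python
import sys


def dump_summary(values):
    """Hypothetical helper that needs `statistics` only when it is called."""
    # Deferred import: the module is loaded on the first call,
    # not when the surrounding module is imported.
    import statistics

    return {"mean": statistics.mean(values), "n": len(values)}


summary = dump_summary([1, 2, 3])
# After the first call the dependency is cached in sys.modules,
# so later calls do not pay the import cost again.
loaded_after_call = "statistics" in sys.modules
```

The cost is moved to the first call rather than removed, so this mainly helps users who never call the function in question.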

Detailed list of times

These times are sorted by the cumulative loading time, decreasing. The list is capped at the top 100 (from 2967) due to the comment length limit of GitHub. All times are given in μs.

| Self time | Cumulative time | Name |
|------:|-----------:|:--------------------------------------------------------------------------------------------------|
| 19 | 3503805 | pandapower.plotting.geo |
| 14 | 3503786 | pandapower.plotting |
| 924 | 3503773 | pandapower |
| 9826 | 1654868 | pandapower.auxiliary |
| 392 | 1061137 | pandapower.convert_format |
| 350 | 634267 | pandapower.plotting |
| 692 | 555049 | pandas |
| 370 | 467930 | pandas.core.api |
| 376 | 426299 | pandapower.control |
| 399 | 412995 | pandapower.control.basic_controller |
| 409 | 412407 | pandapower.control.util.auxiliary |
| 316 | 411616 | pandapower.control.util.characteristic |
| 247 | 386908 | pandapower.estimation |
| 459 | 386661 | pandapower.estimation.state_estimation |
| 873 | 369090 | scipy.stats |
| 590 | 356921 | scipy.interpolate |
| 984 | 336701 | scipy.interpolate._interpolate |
| 80931 | 328327 | numba |
| 643 | 299046 | pandapower.plotting.collections |
| 2296 | 296762 | matplotlib.pyplot |
| 12441 | 284672 | scipy.stats._stats_py |
| 226 | 283735 | lightsim2grid.newtonpf |
| 512 | 276701 | scipy.interpolate._fitpack_py |
| 678 | 262713 | scipy.interpolate._bsplines |
| 1025 | 259211 | lightsim2grid.newtonpf.newtonpf |
| 713 | 253154 | scipy.optimize |
| 418 | 226602 | scipy.sparse.csgraph |
| 1444 | 219509 | numpy |
| 295 | 214376 | pandas._libs |
| 657 | 213796 | geopandas |
| 399 | 211303 | scipy.stats.distributions |
| 345 | 210740 | pandapower.plotting.generic_geodata |
| 8584 | 199362 | pandas._libs.interval |
| 349 | 189273 | pandapower.file_io |
| 11261 | 185359 | pandas._libs.hashtable |
| 5748 | 174098 | pandas._libs.missing |
| 375 | 172964 | openpyxl |
| 436 | 169317 | scipy.sparse.csgraph._laplacian |
| 388 | 168881 | scipy.sparse.linalg |
| 41 | 162632 | pandas._libs.tslibs.nattype |
| 502 | 162591 | pandas._libs.tslibs |
| 674 | 153241 | networkx |
| 1101 | 152155 | geopandas.geoseries |
| 255 | 145581 | pandas.core.groupby |
| 1698 | 145326 | pandas.core.groupby.generic |
| 6381 | 142864 | pyproj |
| 329 | 140912 | openpyxl.workbook |
| 675 | 140584 | openpyxl.workbook.workbook |
| 365 | 134561 | numba.core.decorators |
| 764 | 134197 | numba.stencils.stencil |
| 38638 | 128464 | scipy.stats._continuous_distns |
| 7740 | 123694 | pandas._libs.tslibs.conversion |
| 342 | 122727 | pandapower.plotting.plotly |
| 5961 | 120770 | pandas.core.frame |
| 449 | 109191 | scipy.sparse.linalg._isolve |
| 609 | 106939 | scipy.sparse.linalg._isolve.iterative |
| 346 | 105391 | numba.core.registry |
| 733 | 105279 | scipy.linalg |
| 943 | 104817 | numba.core.dispatcher |
| 329 | 104160 | pandapower.plotting.plotly.pf_res_plotly |
| 364 | 103579 | pandapower.run |
| 1322 | 103547 | pandapower._version |
| 1170 | 101377 | importlib.metadata |
| 7157 | 100777 | matplotlib |
| 4351 | 96239 | pandas.core.generic |
| 758 | 95998 | openpyxl.writer.excel |
| 8597 | 94874 | pandas._libs.tslibs.offsets |
| 333 | 93029 | numpy.random |
| 462 | 92696 | numpy.random._pickle |
| 1409 | 88035 | matplotlib.colorbar |
| 365 | 86882 | pandas.core.arrays |
| 4296 | 83751 | matplotlib.figure |
| 480 | 79194 | matplotlib.projections |
| 9619 | 78843 | pandas._libs.tslibs.timestamps |
| 1338 | 75329 | networkx.algorithms |
| 71267 | 71613 | openpyxl.packaging.manifest |
| 508 | 70487 | scipy.stats._boost |
| 397 | 67343 | scipy.optimize._linprog |
| 326 | 64362 | pandapower.runpm |
| 855 | 64336 | matplotlib.rcsetup |
| 1275 | 63570 | scipy.optimize._shgo |
| 302 | 63360 | pandapower.converter.pandamodels.from_pm |
| 18 | 63059 | pandapower.converter.pandamodels |
| 262 | 63041 | pandapower.converter |
| 358 | 61721 | numpy.__config__ |
| 31 | 61363 | numpy.core._multiarray_umath |
| 758 | 61333 | numpy.core |
| 8422 | 60582 | numpy.random.mtrand |
| 397 | 56478 | power_grid_model_io.converters |
| 8585 | 56348 | pandas._libs.tslibs.timedeltas |
| 540 | 54785 | geopandas._config |
| 452 | 54246 | geopandas._compat |
| 953 | 54164 | pandapower.io_utils |
| 597 | 53219 | shapely |
| 351 | 52870 | pandas.core.arrays.arrow |
| 574 | 52673 | pyproj.network |
| 1991 | 52129 | scipy.stats._distn_infrastructure |
| 1376 | 52106 | pandas.core.arrays.arrow.array |
| 32780 | 51342 | pyproj._network |
| 1540 | 51184 | igraph |