Closed chris-b1 closed 7 years ago
do these in independent processes
Oh, right - matplotlib
one already was, updated the top for numpy.
or better yet import numpy and matplotlib first then run it
We do attempt to import matplotlib at import time. We could delay that with something like
diff --git a/pandas/plotting/__init__.py b/pandas/plotting/__init__.py
index c3cbedb0f..8f98e297e 100644
--- a/pandas/plotting/__init__.py
+++ b/pandas/plotting/__init__.py
@@ -4,12 +4,6 @@ Plotting api
 # flake8: noqa
-try:  # mpl optional
-    from pandas.plotting import _converter
-    _converter.register()  # needs to override so set_xlim works with str/number
-except ImportError:
-    pass
-
 from pandas.plotting._misc import (scatter_matrix, radviz,
                                    andrews_curves, bootstrap_plot,
                                    parallel_coordinates, lag_plot,
diff --git a/pandas/plotting/_core.py b/pandas/plotting/_core.py
index 391fa377f..9821c89c4 100644
--- a/pandas/plotting/_core.py
+++ b/pandas/plotting/_core.py
@@ -37,12 +37,7 @@ from pandas.plotting._tools import (_subplots, _flatten, table,
                                     _get_xlim, _set_ticks_props,
                                     format_date_labels)
-
-if _mpl_ge_1_5_0():
-    # Compat with mp 1.5, which uses cycler.
-    import cycler
-    colors = mpl_stylesheet.pop('axes.color_cycle')
-    mpl_stylesheet['axes.prop_cycle'] = cycler.cycler('color', colors)
+_registered = False
 def _get_standard_kind(kind):
@@ -92,6 +87,7 @@ class MPLPlot(object):
                  secondary_y=False, colormap=None,
                  table=False, layout=None, **kwds):
+        self._setup()
         self.data = data
         self.by = by
@@ -175,6 +171,20 @@ class MPLPlot(object):
         self._validate_color_args()
+    def _setup(self):
+        global _registered
+        if not _registered:
+            from pandas.plotting import _converter
+            _converter.register()
+
+            if _mpl_ge_1_5_0():
+                # Compat with mp 1.5, which uses cycler.
+                import cycler
+                colors = mpl_stylesheet.pop('axes.color_cycle')
+                mpl_stylesheet['axes.prop_cycle'] = cycler.cycler('color', colors)
+
+            _registered = True
+
     def _validate_color_args(self):
         if 'color' not in self.kwds and 'colors' in self.kwds:
             warnings.warn(("'colors' is being deprecated. Please use 'color'"
That covers all the .plot methods. Would need a decorator or something to cover the plotting methods not attached to NDFrame.
Looks like get_versions takes up about 25% of the import time for pandas.__init__.py; that could easily be delayed.
Oh, sorry, I was thinking of show_versions, not get_versions. get_versions would be a bit harder to fix... I did try out https://github.com/pypa/setuptools_scm instead of versioneer, and it worked well. May be worth looking into.
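One possible way to defer a version lookup, sketched under assumptions: on Python 3.7+ a module-level `__getattr__` (PEP 562) can compute `__version__` on first access. This is not what pandas or versioneer do; `fakepkg` and `_expensive_get_versions` are hypothetical stand-ins, and the version string is made up.

```python
import types

calls = []  # hypothetical: counts how often the expensive lookup runs


def _expensive_get_versions():
    # Stand-in for a versioneer-style get_versions() call.
    calls.append(1)
    return {"version": "0.0.0+example"}


def _module_getattr(name):
    # PEP 562 hook: only called when normal attribute lookup fails.
    if name == "__version__":
        version = _expensive_get_versions()["version"]
        # Cache on the module so the hook is not hit again.
        setattr(fakepkg, "__version__", version)
        return version
    raise AttributeError("module 'fakepkg' has no attribute %r" % name)


# Throwaway module standing in for a package's __init__.py.
fakepkg = types.ModuleType("fakepkg")
fakepkg.__getattr__ = _module_getattr
```

In a real package the hook would simply be a function named `__getattr__` defined at the top level of `__init__.py`.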
I did a profile with https://github.com/cournape/import-profiler
From a very quick skim:
- s3fs (boto3) also takes a lot of time (140.4 of 786.7 ms). This can maybe be delayed?
- pytest? (43 ms) (this is in pandas.util._tester; I think we can easily move the pytest import inside the test function?)
- xlsxwriter (22 ms) probably doesn't need to be imported on pandas import (but I didn't look into it; it is imported in the config files)
Not huge, but those two would already remove ca. 20% of the import time.
However, I don't see get_versions anywhere in there, so I'm not sure how reliable the results are.
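Moving an import inside the function that needs it could look roughly like this. It is a sketch only: `unittest` stands in for pytest so the example stays stdlib-only, and this `test` function is hypothetical rather than the one in pandas.util._tester.

```python
def test():
    """Hypothetical package-level test() helper with a deferred import."""
    # The runner is imported here, not at module top level, so a plain
    # `import package` never pays for it. pandas would import pytest at
    # this point; unittest is used only to keep the sketch stdlib-only.
    import unittest

    suite = unittest.TestSuite()   # empty suite: nothing to discover here
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()
```

The trade-off is that a missing dependency is only reported when `test()` is called, not at import time.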
Full output:
In [1]: from import_profiler import profile_import
   ...:
   ...: with profile_import() as context:
   ...:     # Anything expensive in here
   ...:     import pandas
   ...:
In [2]: context.print_info()
cumtime (ms) intime (ms) name
786.7 48.4 pandas
196.9 3.3 +numpy
1.8 1.8 ++_globals
1.5 1.5 ++numpy.__config__
1.9 1.9 ++version
1.4 1.3 ++_import_tools
156.9 0 ++
156.9 3.4 +++numpy.add_newdocs
150.9 0.8 ++++numpy.lib
101.6 0.7 +++++type_check
96.6 2.9 ++++++numpy.core.numeric
93.7 1.2 +++++++numpy.core
19 0.1 ++++++++
18.9 18.8 +++++++++numpy.core.multiarray
2.4 0.1 ++++++++
2.3 2.2 +++++++++numpy.core.umath
17.5 0 ++++++++
17.5 2.7 +++++++++numpy.core._internal
2.5 0.6 ++++++++++numpy.compat
1 0 +++++++++++
1 0.9 ++++++++++++numpy.compat._inspect
6.2 3.4 ++++++++++ctypes
1.5 1.5 +++++++++++_ctypes
6 3.4 ++++++++++numerictypes
2.3 2.2 +++++++++++numbers
9 0 ++++++++
8.9 2.2 +++++++++numpy.core.numeric
6.3 1.5 ++++++++++arrayprint
4.7 1 +++++++++++fromnumeric
3.5 0 ++++++++++++
3.4 3.2 +++++++++++++numpy.core._methods
2.1 0 ++++++++
2.1 1.7 +++++++++numpy.core.defchararray
1.3 0 ++++++++
1.2 1.1 +++++++++numpy.core.records
36.4 0.1 ++++++++numpy.testing.nosetester
36.3 0.5 +++++++++numpy.testing
22.6 0.7 ++++++++++unittest
3.2 0.8 +++++++++++result
2.3 0 ++++++++++++
2.2 2.1 +++++++++++++unittest.util
3.7 3.4 +++++++++++case
4.9 4.8 +++++++++++suite
6.4 6.2 +++++++++++loader
3.7 1.1 +++++++++++main
2.5 0 ++++++++++++
2.4 1.4 +++++++++++++unittest.runner
13.1 0 ++++++++++
13.1 2.1 +++++++++++numpy.testing.decorators
10.9 2.2 ++++++++++++utils
5 4.9 +++++++++++++nosetester
3.5 3.3 +++++++++++++numpy.lib.utils
4.2 4.1 ++++++ufunclike
27.3 2.5 +++++index_tricks
15.7 0 ++++++
15.6 11.2 +++++++numpy.lib.function_base
3.9 3.8 ++++++++numpy.lib.twodim_base
6.8 2.7 ++++++numpy.matrixlib
3.9 3.8 +++++++defmatrix
2.2 2.2 ++++++numpy.lib.stride_tricks
1.4 1.3 +++++nanfunctions
7.1 1.4 +++++polynomial
1 1 ++++++numpy.lib.twodim_base
4.5 0.6 ++++++numpy.linalg
3.4 1.1 +++++++linalg
2.1 0 ++++++++numpy.linalg
1.2 1.1 +++++++++numpy.linalg._umath_linalg
4.6 1.5 +++++npyio
1.1 1 ++++++_iotools
2.6 2.5 +++++financial
3.5 0 ++
3.4 0.5 +++numpy.fft
1.6 0.6 ++++fftpack
10 0 ++
9.9 0.6 +++numpy.polynomial
3.4 1.4 ++++polynomial
1.1 1 ++++chebyshev
1 0.9 ++++legendre
1.1 1 ++++hermite
1.3 1.1 ++++hermite_e
1.3 1.1 ++++laguerre
7 0 ++
7 1.2 +++numpy.random
4.7 4.6 ++++mtrand
2.2 0 ++
2.2 2.1 +++numpy.ctypeslib
6.9 0 ++
6.8 0.6 +++numpy.ma
4.7 0 ++++
4.6 4.4 +++++numpy.ma.core
1.5 0 ++++
1.5 1.3 +++++numpy.ma.extras
5.6 1.9 +pytz
1.4 0.7 ++pytz.lazy
25.5 0.8 +pandas.compat.numpy
24.7 1.1 ++pandas.compat
1.9 1.4 +++distutils.version
14.6 2 +++http.client
2 2 ++++http
10.5 4.3 ++++ssl
4 4 +++++ipaddress
1.7 1.7 +++++_ssl
5.3 0 +++dateutil
5.2 1.6 ++++dateutil.parser
2.9 0 +++++
2.9 0.4 ++++++dateutil.tz
2.5 1.5 +++++++tz
26.2 0.4 +pandas._libs
11.8 10.6 ++tslib
14 1.9 ++pandas._libs.hashtable
11.3 6.8 +++pandas._libs.lib
2.7 2.6 ++++_decimal
30.6 1.8 +pandas.core.config_init
4.6 2.3 ++pandas.core.config
2.1 0.5 +++pandas.io.formats.printing
22.6 0.4 ++xlsxwriter
22.2 1 +++workbook
2.5 0.3 ++++compatibility
1.7 1.7 +++++fractions
9.3 5 ++++xlsxwriter.worksheet
3.1 0.6 +++++drawing
1.8 1.8 ++++++shape
4 0.4 ++++xlsxwriter.packager
1.6 0.3 ++++xlsxwriter.chart_area
1.3 0 +++++
1.3 1.2 ++++++xlsxwriter.chart
369.9 0.4 +pandas.core.api
9.9 1 ++pandas.core.algorithms
6 0.6 +++pandas.core.dtypes.cast
5 0.6 ++++common
2.2 0 +++++pandas._libs
2.2 2 ++++++pandas._libs.algos
1.3 1.3 +++++dtypes
2.7 0 +++pandas.core
2.7 0.8 ++++pandas.core.common
1.4 0.2 +++++pandas.api
1.2 0.4 ++++++pandas.api.types
15.4 1.1 ++pandas.core.categorical
13.6 1.4 +++pandas.core.base
2.9 0.3 ++++pandas.util._validators
2.6 0.3 +++++pandas.util
1.7 0.5 ++++++pandas.core.util.hashing
7.6 0.8 ++++pandas.core.nanops
6.7 0.4 +++++bottleneck
1.7 0 ++++++
1.7 0.4 +++++++bottleneck.slow
1 1 ++++++reduce
1 0.5 ++++++bottleneck.benchmark.bench
1.7 0 ++++pandas.compat.numpy
1.7 1.6 +++++pandas.compat.numpy.function
330.5 6.6 ++pandas.core.groupby
81.5 0.3 +++pandas.core.index
81.2 0.6 ++++pandas.core.indexes.api
27.1 2.8 +++++pandas.core.indexes.base
4.9 0 ++++++pandas._libs
2.3 1.8 +++++++pandas._libs.index
2.6 1.9 +++++++pandas._libs.join
16.2 0.9 ++++++pandas.core.ops
15.1 0.6 +++++++pandas.core.computation.expressions
14.5 0.3 ++++++++pandas.core.computation
14.1 0.6 +++++++++numexpr
4.6 4.6 ++++++++++cpuinfo
4.7 1.2 ++++++++++numexpr.expressions
3.5 0 +++++++++++numexpr
3.4 3.3 ++++++++++++numexpr.interpreter
1.4 1 ++++++++++numexpr.necompiler
2.1 0.2 ++++++++++numexpr.tests
1.8 1.7 +++++++++++numexpr.tests.test_numexpr
2.4 2.2 ++++++pandas.core.strings
2.2 2 +++++pandas.core.indexes.category
32.4 32.2 +++++pandas.core.indexes.multi
1.4 1.3 +++++pandas.core.indexes.interval
1.6 1.4 +++++pandas.core.indexes.numeric
1 0.9 +++++pandas.core.indexes.range
11.5 1 +++++pandas.core.indexes.timedeltas
6.3 1.8 ++++++pandas.tseries.frequencies
4.1 2.6 +++++++pandas.tseries.offsets
1.1 0.7 ++++++++pandas.core.tools.datetimes
3.5 1 ++++++pandas.core.indexes.datetimelike
2.2 1.6 +++++++pandas._libs.period
3 1.2 +++++pandas.core.indexes.period
1.6 1.4 ++++++pandas.core.indexes.datetimes
235.6 7.2 +++pandas.core.frame
161.6 3.7 ++++pandas.core.generic
1.3 1.1 +++++pandas.core.indexing
7.1 3.4 +++++pandas.core.internals
3.4 1.1 ++++++pandas.core.sparse.array
1.9 1.7 +++++++pandas._libs.sparse
149.3 1.5 +++++pandas.io.formats.format
147.2 0.7 ++++++pandas.io.common
1.3 0.6 +++++++csv
140.4 0.4 +++++++s3fs
139.5 0.9 ++++++++core
128 0.5 +++++++++boto3
127.5 0.4 ++++++++++boto3.session
116.9 0.7 +++++++++++botocore.session
57.3 0.3 ++++++++++++botocore.configloader
3.1 0 +++++++++++++six.moves
3.1 3 ++++++++++++++configparser
53.8 1.7 +++++++++++++botocore.exceptions
52.1 0 ++++++++++++++botocore.vendored.requests.exceptions
52 0.6 +++++++++++++++botocore.vendored.requests
25.6 0.7 ++++++++++++++++packages.urllib3.contrib
22.7 0.1 +++++++++++++++++botocore.vendored.requests.packages.urllib3
22.6 0.4 ++++++++++++++++++botocore.vendored.requests.packages
22.2 0 +++++++++++++++++++
22.2 0.7 ++++++++++++++++++++botocore.vendored.requests.packages.urllib3
20.2 0.7 +++++++++++++++++++++connectionpool
1.1 1.1 ++++++++++++++++++++++exceptions
3.8 0.4 ++++++++++++++++++++++connection
3.3 0 +++++++++++++++++++++++util.ssl_
3.3 0.2 ++++++++++++++++++++++++botocore.vendored.requests.packages.urllib3.util
1.1 1 +++++++++++++++++++++++++url
11.1 0.3 ++++++++++++++++++++++request
10.8 0.3 +++++++++++++++++++++++filepost
9.8 9.4 ++++++++++++++++++++++++uuid
1.6 0.8 ++++++++++++++++++++++response
1.1 0.9 +++++++++++++++++++++poolmanager
2.2 1.5 +++++++++++++++++botocore.vendored.requests.packages.urllib3.contrib.pyopenssl
20.8 0 ++++++++++++++++
20.8 0.7 +++++++++++++++++botocore.vendored.requests.utils
3.3 0.8 ++++++++++++++++++cgi
2.5 0.7 +++++++++++++++++++html
1.7 1.7 ++++++++++++++++++++html.entities
13.6 0.4 ++++++++++++++++++compat
4 3.2 +++++++++++++++++++urllib.request
5.2 0 +++++++++++++++++++http
5.2 5.1 ++++++++++++++++++++http.cookiejar
3.3 3.3 +++++++++++++++++++http.cookies
1.2 1.1 ++++++++++++++++++cookies
2.3 0.8 ++++++++++++++++models
1.1 0.5 +++++++++++++++++auth
2.4 0.4 ++++++++++++++++api
2.1 0 +++++++++++++++++
2 1.1 ++++++++++++++++++botocore.vendored.requests.sessions
12.4 2.2 ++++++++++++botocore.credentials
8.8 0.8 +++++++++++++botocore.compat
3.3 0 ++++++++++++++botocore.vendored
3.2 3.1 +++++++++++++++botocore.vendored.six
4.2 0.6 ++++++++++++++xml.etree.cElementTree
3 1.1 +++++++++++++++xml.etree.ElementTree
1.2 1.1 +++++++++++++botocore.utils
41.8 1 ++++++++++++botocore.client
29.6 0 +++++++++++++botocore
29.6 1 ++++++++++++++botocore.waiter
9.4 0.8 +++++++++++++++jmespath
8.6 0.1 ++++++++++++++++jmespath
8.6 2.1 +++++++++++++++++jmespath.parser
3.4 0.1 ++++++++++++++++++jmespath
3.3 1.1 +++++++++++++++++++jmespath.lexer
2.2 1.3 ++++++++++++++++++++jmespath.exceptions
2.3 0 ++++++++++++++++++jmespath
2.3 0.8 +++++++++++++++++++jmespath.visitor
1.4 0 ++++++++++++++++++++jmespath
1.4 1.3 +++++++++++++++++++++jmespath.functions
19.2 0.5 +++++++++++++++botocore.docs.docstring
18.6 0.4 ++++++++++++++++botocore.docs
18.1 0.7 +++++++++++++++++botocore.docs.service
1.9 1.8 ++++++++++++++++++botocore.docs.utils
3.3 0.5 ++++++++++++++++++botocore.docs.client
2.1 0.5 +++++++++++++++++++botocore.docs.method
10.9 0.8 ++++++++++++++++++botocore.docs.bcdoc.restdoc
7.3 0.8 +++++++++++++++++++botocore.docs.bcdoc.docstringparser
6.5 4.9 ++++++++++++++++++++html.parser
1.6 1.5 +++++++++++++++++++++_markupbase
2.3 2.2 +++++++++++++++++++botocore.docs.bcdoc.style
1.4 0.9 +++++++++++++botocore.auth
1.2 1 +++++++++++++botocore.awsrequest
1.4 1.3 +++++++++++++botocore.hooks
5.6 0.3 +++++++++++++botocore.args
1.3 0.6 ++++++++++++++botocore.serialize
3.3 0.3 ++++++++++++++botocore.config
3 0.8 +++++++++++++++botocore.endpoint
1.3 0.3 ++++++++++++++++botocore.response
2.6 0 ++++++++++++botocore
2.5 1.4 +++++++++++++botocore.handlers
1 1 +++++++++++boto3.utils
8.5 0.7 +++++++++++resources.factory
6.6 0.4 ++++++++++++action
4.7 0.4 +++++++++++++boto3.docs.docstring
4.2 0.2 ++++++++++++++boto3.docs
4 0.5 +++++++++++++++boto3.docs.service
3.1 0.5 ++++++++++++++++boto3.docs.resource
1.1 0.3 +++++++++++++++++boto3.docs.action
9.9 1.1 +++++++++boto3.s3.transfer
8.4 0.3 ++++++++++concurrent
8.1 0.3 +++++++++++concurrent.futures
1.4 1.3 ++++++++++++concurrent.futures._base
6 0.7 ++++++++++++concurrent.futures.process
2.3 0.4 +++++++++++++multiprocessing
1.9 0 ++++++++++++++
1.8 0.8 +++++++++++++++multiprocessing.context
2.9 1 +++++++++++++multiprocessing.connection
1.3 0 +++++++py.path
1.3 0.8 ++++++++py
3 1.9 +++++++py._path.local
58.5 5.2 ++++pandas.core.series
5.8 0 +++++pandas.core
5.8 3.6 ++++++pandas.core.window
2 1.9 +++++++pandas._libs.window
46.3 0 +++++pandas.plotting._core
46.2 0.6 ++++++pandas.plotting
41 0 +++++++pandas.plotting
40.9 1.3 ++++++++pandas.plotting._converter
23.7 0.5 +++++++++matplotlib.units
23.2 9 ++++++++++matplotlib
2.5 1.5 +++++++++++distutils.sysconfig
1 1 ++++++++++++errors
2.9 2 +++++++++++matplotlib.cbook
7.1 1.2 +++++++++++matplotlib.rcsetup
2.3 2.3 ++++++++++++matplotlib.fontconfig_pattern
2.9 1.5 ++++++++++++matplotlib.colors
1.3 1.3 +++++++++++++_color_data
15.3 1.3 +++++++++matplotlib.dates
1.1 0.9 ++++++++++dateutil.rrule
12.7 1.8 ++++++++++matplotlib.ticker
10.9 0 +++++++++++matplotlib
10.8 8.6 ++++++++++++matplotlib.transforms
1.1 1 +++++++++++++path
1.6 0.5 +++++++pandas.plotting._misc
3 2.8 +++++++pandas.plotting._core
7.8 0.4 ++++pandas.core.computation.eval
6.4 3.9 +++++pandas.core.computation.expr
1.6 0.7 ++++++pandas.core.computation.ops
4.7 4.4 +++pandas.core.panel
1.3 0 +++pandas._libs
1.3 1.2 ++++pandas._libs.groupby
2.1 1.8 ++pandas.core.panel4d
8.9 0.9 ++pandas.core.reshape.reshape
7 0.3 +++pandas.core.sparse.api
3.2 2.2 ++++pandas.core.sparse.series
3 2.9 ++++pandas.core.sparse.frame
1.8 1.6 ++pandas.core.resample
1.6 0.4 +pandas.stats.api
1 1 ++pandas.stats.moments
2.8 0.2 +pandas.core.reshape.api
1.1 0.8 ++pandas.core.reshape.merge
29 0.3 +pandas.io.api
5.9 2.7 ++pandas.io.parsers
2.6 2 +++pandas._libs.parsers
3 1.5 ++pandas.io.excel
1.1 1 +++pandas._libs.json
5.8 3.5 ++pandas.io.pytables
2 1.9 +++pandas.core.computation.pytables
2.4 0.5 ++pandas.io.json
1.8 1 +++json
2 1.8 ++pandas.io.stata
5.3 0.7 ++pandas.io.packers
3.8 1.1 +++pandas.io.msgpack
43.9 0.2 +pandas.util._tester
43.7 6.1 ++pytest
8.8 1.3 +++_pytest.config
2.9 0.3 ++++_pytest._code
2 1.2 +++++code
2 0.4 ++++_pytest.hookspec
1.5 0.2 +++++_pytest._pluggy
1.3 1.1 ++++++_pytest.vendored_packages.pluggy
2.3 0.5 ++++_pytest.assertion
1.1 0 +++++_pytest.assertion
1.1 1 ++++++_pytest.assertion.rewrite
1.9 0.9 +++_pytest.main
5.7 1.3 +++_pytest.python
4.3 0 ++++_pytest
4.3 1.8 +++++_pytest.fixtures
1.8 1.1 ++++++py._code.code
1.2 0.5 +++_pytest.unittest
2.5 0.9 +++_pytest.capture
1.4 0.7 ++++py._io.capture
1 0.4 +++_pytest.tmpdir
10.1 9.1 +++_pytest.junitxml
3.3 0.3 +pandas.testing
3 1.7 ++pandas.util.testing
1.1 0 +++pandas._libs
1.1 0.9 ++++pandas._libs.testing
Also see #7282, but seems like already more attention here.
It's a slightly different issue: this one is about reducing import time in general, while the other is about a specific case where the import takes many seconds (but there numpy also takes seconds to import, so IMO it's not a pandas-specific issue).
I wouldn't normally be concerned about this, as of course it only happens once, but our import time has gotten quite long, to the point I notice it hanging my ipython startup.
I don't have a good sense of what would be required to improve this, probably deferring more imports to be just in time?
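For reference, the standard library (Python 3.5+) has a generic mechanism for "just in time" imports, `importlib.util.LazyLoader`. The sketch below follows the recipe in the importlib documentation; the helper name `lazy_import` is illustrative, and nothing here is claimed to be what pandas adopted.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose body executes on first attribute access."""
    if name in sys.modules:
        # Already imported: nothing left to defer.
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("No module named %r" % name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up laziness; module body has not run yet
    return module
```

With this, something like `s3fs = lazy_import("s3fs")` would cost almost nothing until an attribute of `s3fs` is first touched, assuming the package is installed.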
On 0.20.2 - each import in a separate process
Below is a single process, importing deps first
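The two measurement setups mentioned here (a fresh process per import, and a process where the dependencies are imported first) can be reproduced with a small helper. This is a sketch under assumptions: `time_import` and its `preload` parameter are hypothetical names, and it measures wall-clock time only.

```python
import subprocess
import sys


def time_import(module, preload=()):
    """Return seconds spent importing `module` in a fresh interpreter.

    Modules named in `preload` are imported before the timer starts, so
    their cost is excluded from the result.
    """
    lines = ["import time"]
    lines += ["import %s" % dep for dep in preload]
    lines += [
        "start = time.perf_counter()",
        "import %s" % module,
        "print(time.perf_counter() - start)",
    ]
    out = subprocess.check_output([sys.executable, "-c", "\n".join(lines)])
    # Take the last token in case the imported module prints on import.
    return float(out.decode().split()[-1])
```

For example, `time_import("pandas")` gives the full cost in a clean process, while `time_import("pandas", preload=("numpy", "matplotlib"))` isolates what pandas adds on top of its dependencies.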