simplistix / configurator

A Python library for handling configuration files.
https://configurator.readthedocs.io
MIT License
39 stars 6 forks source link

Improve start-up time by deferring slow imports. #6

Closed kalekundert closed 4 years ago

kalekundert commented 4 years ago

The configurator package takes 200-300 ms to import, which is enough to cause a noticeable pause when a program is starting up. Almost all of this time is spent importing pkg_resources:

$ python -X importtime -c 'import configurator'
$ python -X importtime -c 'import pkg_resources'

By deferring this import until it is needed, configurator can start up in about 7 ms—much faster. This is really nice if the command being executed doesn't actually need to read a config file, e.g. if it's just displaying usage help or something.

kalekundert commented 4 years ago

I added another commit to defer a few more imports, specifically copy and pprint. This is only a small improvement: it brings this import time down from ~7 ms to ~2 ms. But both modules are only used in one place, so it's not disruptive to defer them. pprint in particular is fairly unlikely to actually be needed, since it's only used in ConfigNode.__repr__(), which is more of a debugging tool.

I don't think it's worth doing anything more thorough than this, since there's really not much time left to save. Go ahead and merge if everything looks good to you.

Thanks for writing this library, by the way. I just found it a few weeks ago. In the past I'd write my own half-baked code every time I need to merge config files, but this is much nicer.

cjw296 commented 4 years ago

I have to be honest, I'm having second thoughts about this PR. I'm really not a fan local imports, particularly those in a9f8e04. I'm also curious about the actual win here: if a Configurator is instantiated, then you're going to have to pay the cost of these imports, unless I'm missing something? I would hope that use of whatever you're building is more more than the number of times --help is run?

How about having the import of configurator in your library be moved to where it's instantiated? Would that give you the same performance win?

Apologies for all the questions, trying to get to the right solution :-)

kalekundert commented 4 years ago

No worries. My application is a command line tool with a number of different subcommands, many of which don't read any configuration files. I was trying to make the commands a little more snappy, so I profiled the imports and noticed that almost all of the import time was devoted to configurator, entirely because it imports pkg_resources. Obviously this is a waste of time for the commands that don't read the config files. But even for the commands that do, there's a subtle problem: if I started the command, then immediately decided to Ctrl-C it for some reason (e.g. I noticed an argument was wrong), I would often get a configurator/pkg_resources stack trace despite the fact that my main function has a nice handler for keyboard interrupts.

None of this is a big problem for me; as you point out, I can import configurator locally in my own project so that only the subcommands that need it pay the price. This is a bit of a gotcha, though, which is why I thought it'd be better to address in configurator itself.

Feel free to revert a9f8e04 (or I can do it, if that would be easier). I was just trying to address your comment about loading each library when it's used, but the performance improvement is negligible. I do think that pprint probably should be a local import, but it doesn't really matter.

cjw296 commented 4 years ago

I'm starting to think that allowing parsers to be provided by setuptools entrypoints might just be a bad idea :-/

cjw296 commented 4 years ago

Can you test 2.0.0? My hope is this will have much better performance, even when you use it!

kalekundert commented 4 years ago

2.0.0 seems much faster. Thanks for taking the time to work on this!