black performance scales non-linearly with the number of files to process

asottile commented 3 years ago

Describe the bug A clear and concise description of what the bug is.

To Reproduce Steps to reproduce the behavior:

I suspect this is related to #1950 -- or at least exacerbated by it

setup

rm -rf src
mkdir src
touch src/test_{1..10000}.py

1000 files

$ rm -rf t; time XDG_CACHE_HOME=$PWD/t black src/test_{1..1000}.py
All done! ✨ 🍰 ✨
1000 files left unchanged.

real    0m1.504s
user    0m1.467s
sys 0m0.195s

10000 files

$ rm -rf t; time XDG_CACHE_HOME=$PWD/t black src/test_{1..10000}.py
All done! ✨ 🍰 ✨
10000 files left unchanged.

real    0m55.354s
user    0m54.547s
sys 0m2.550s

I expect the second command to be approximately 10x slower than the first, however it's almost 30-40x worse

A guess is black is writing the cache for every single file and should instead batch it per run

Expected behavior A clear and concise description of what you expected to happen.

Environment (please complete the following information):

Version: [e.g. master] 71117e730c4f62458b30af820f51890487b458e4
OS and Python version: [e.g. Linux/Python 3.7.4rc1] 3.8.5 on ubuntu 20.04

Does this bug also happen on master? To answer this, you have two options:

yes

Additional context Add any other context about the problem here.

JelleZijlstra commented 3 years ago

Good find! From looking at the code it appears we shouldn't write to the cache multiple times per run, so there must be some other superlinear behavior going on. Does it reproduce with your patch for #1950?

asottile commented 3 years ago

oh yeah I should have checked, yeah it does so my guess at the cache being at fault is probably incorrect

zsol commented 3 years ago

I had some spare time and after some profiling it looks like ensure_future is being called 10x more for 3x the amount of input. The vast majority of the calls are coming from the implementation of asyncio.wait because we're spamming it in a tight loop from https://github.com/psf/black/blob/main/src/black/__init__.py#L642

zsol commented 3 years ago

In fact, adding an await asyncio.sleep(0.1) at the end of that loop fixes this issue and makes performance scale linearly with number of input files

zsol commented 3 years ago

I wonder if we should just use asyncio.as_completed instead of the hand-rolled batching/waiting logic in https://github.com/psf/black/blob/main/src/black/__init__.py#L626-L659. Or maybe that would result in too much awaiting?

If so, we could bring out the big guns and use https://aioitertools.omnilib.dev/en/latest/api.html#aioitertools.asyncio.as_completed and https://aioitertools.omnilib.dev/en/latest/api.html#aioitertools.more_itertools.chunked

psf / black

black performance scales non-linearly with the number of files to process #1951