Closed nkaretnikov closed 1 month ago
Assigning to @peytondmurray since he is looking into perf
I spent some time profiling environment creation - that's sending a POST request to /specification/. For my profiling, I used the following environment:
channels:
  - conda-forge
  - bokeh
dependencies:
  - python=3.10
  - panel
  - ipykernel
  - ipywidgets
  - ipywidgets_bokeh
  - holoviews
  - openjdk=17.0.9
  - pyspark
  - findspark
  - jhsingle-native-proxy>=0.8.2
  - bokeh-root-cmd>=0.1.2
  - nbconvert
  - pip:
      - nrtk==0.3.0
      - xaitk-saliency==0.7.0
      - maite==0.5.0
      - daml==0.44.5
      - hypothesis >=6.61.0,<7.0.0
      - pytest >=7.2.0,<8.0
      - pytest-cov >=4.0.0,<5.0
      - pytest-mock >= 3.10.0,<4.0
      - pytest-snapshot >= 0.9.0
      - pytest-xdist >=3.3.1,<4.0.0
      - types-python-dateutil >=2.8.19,<3.0.0
      - tox >=4.6.4,<5.0.0
      - virtualenv-pyenv >=0.3.0,<1.0.0
      - jupytext >= 1.14.0
      - numpydoc >= 1.5.0
      - pyright >= 1.1.280
      - loguru
      - torch>=2.1
      - torchmetrics
      - torchvision
      - multiprocess
      - keras
      - yolov5
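For anyone who wants to reproduce the request being profiled, here is a minimal sketch of building that POST. The base URL, the JSON payload shape (`namespace` plus `specification`), and the `profiling-env` name are assumptions for illustration, not taken from this thread; authentication is left out entirely. The request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical base URL: adjust to wherever your conda-store server is running.
BASE_URL = "http://localhost:8080/conda-store/api/v1"


def build_specification_request(spec_yaml, namespace="default"):
    """Build (but do not send) a POST request to the /specification/ endpoint."""
    # Assumed payload shape: the environment YAML is passed as a string.
    payload = {"namespace": namespace, "specification": spec_yaml}
    return urllib.request.Request(
        url=f"{BASE_URL}/specification/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A shortened stand-in for the environment above; the name is made up.
SPEC = """\
name: profiling-env
channels:
  - conda-forge
dependencies:
  - python=3.10
"""

request = build_specification_request(SPEC)

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode())
```

Wrapping the `urlopen` call in a timer would give the client-side view of how long the server takes to accept the build.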
Using conda-store via docker compose up --build, the build took 27 minutes to finish on my local machine. For comparison, installing via conda directly (not through conda-store) took ~5 minutes, and running conda-lock took ~10 minutes.
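To put those numbers side by side, here is a rough sketch of how such wall-clock comparisons can be taken and what the reported timings imply. The `time_command` helper is hypothetical, and the command it runs below is a harmless placeholder; substitute your own conda or conda-lock invocation.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run a command to completion and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start


def slowdown(slow_minutes, fast_minutes):
    """How many times longer the slow run took than the fast one."""
    return slow_minutes / fast_minutes


# Placeholder command so the sketch runs anywhere; swap in the real build command.
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"placeholder command took {elapsed:.2f} s")

# Approximate timings reported above, in minutes.
CONDA_STORE, CONDA, CONDA_LOCK = 27, 5, 10
print(f"conda-store vs conda:      {slowdown(CONDA_STORE, CONDA):.1f}x")
print(f"conda-store vs conda-lock: {slowdown(CONDA_STORE, CONDA_LOCK):.1f}x")
```

With the figures above, the conda-store build is roughly 5.4 times slower than a plain conda install and 2.7 times slower than conda-lock alone.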
Using pyinstrument, I was able to get both server and worker profiles. The server spent almost no time at all dispatching the worker to build the environment. There's no bottleneck here.
The worker, on the other hand, has two major problematic parts:
conda-lock. This is probably because there are two solves happening here: once for the conda packages, and once again using the vendored version of poetry for the pip packages. If we want to go faster, we'll need to eliminate this call to conda-lock.
These findings have informed future efforts that will be spent on conda-store, in addition to conda itself.
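To make the "two solves" point concrete, the sketch below splits a parsed `dependencies` list into the specs the conda solver would see and the specs handed to the pip/poetry solve. This is only an illustration of the two inputs, not how conda-lock is implemented; `split_dependencies` is a hypothetical helper, and the input is a plain Python list of the kind a YAML parser would return for the environment above.

```python
def split_dependencies(dependencies):
    """Separate conda specs from the nested pip specs in an environment's dependencies."""
    conda_specs, pip_specs = [], []
    for entry in dependencies:
        # The pip section appears as a one-key mapping: {"pip": [...]}.
        if isinstance(entry, dict) and "pip" in entry:
            pip_specs.extend(entry["pip"])
        else:
            conda_specs.append(entry)
    return conda_specs, pip_specs


# A shortened version of the environment used for profiling.
dependencies = [
    "python=3.10",
    "panel",
    "pyspark",
    {"pip": ["nrtk==0.3.0", "xaitk-saliency==0.7.0"]},
]

conda_specs, pip_specs = split_dependencies(dependencies)
print("first solve (conda):", conda_specs)
print("second solve (pip):", pip_specs)
```

Each list is resolved separately, so an environment with a large pip section pays for both solves before any package is installed.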
Closing as completed now that we have planned action items that will address performance issues.
Feature description
See this talk from PackagingCon 2023, "Gotta Go Fast" by Kat Marchán: https://cfp.packaging-con.org/2023/talk/VZUZ9Y/
Slides: https://github.com/zkat/presentations/blob/881f8d085d30b1d6c89b666606570fa1a8d1a99f/presentation.md
Value and/or benefit
The talk is packed with tips on improving package manager perf. We need to look at the slides and see what we can improve in conda-store.
Anything else?
Just an idea to explore, not an immediate call to action.