pypa / pip

The Python package installer
https://pip.pypa.io/
MIT License

Pip memory usage for large cached install dominated by list of candidate pages #12834

Open notatallshaw opened 3 months ago

notatallshaw commented 3 months ago

Description

I don't know if anything can be done about this, but using memray to do a memory analysis of pip for a large dry-run install (apache-airflow[all]==2.9.2) shows a peak memory usage of 354 MB, and 250 MB of that is a list of pages.

Expected behavior

This seems like far too much memory. Is the whole content of each page being kept in memory where a smaller representation of the page would be enough?
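
To make "a smaller representation" concrete, here is a minimal sketch that compares the size of keeping whole index pages against keeping only the link URLs parsed out of them. Everything in it is an assumption for illustration: the page generator, the `files.example.invalid` host, and the helper names are hypothetical, and this is not how pip stores pages internally. How much is saved depends on how much of each page is markup and attributes that are not needed later.

```python
import sys
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags and discard everything else."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def make_fake_index_page(project: str, n_files: int) -> str:
    """Build a hypothetical simple-index page (made-up data, not real PyPI content)."""
    digest = "0" * 64
    rows = [
        f'<a href="https://files.example.invalid/{project}-{i}.0-py3-none-any.whl'
        f'#sha256={digest}" data-requires-python="&gt;=3.8" '
        f'data-dist-info-metadata="sha256={digest}" '
        f'data-core-metadata="sha256={digest}">'
        f"{project}-{i}.0-py3-none-any.whl</a><br/>"
        for i in range(n_files)
    ]
    return "<!DOCTYPE html><html><body>\n" + "\n".join(rows) + "\n</body></html>"


def extract_links(page: str) -> list[str]:
    """Parse one page and return only its href strings."""
    parser = LinkCollector()
    parser.feed(page)
    return parser.hrefs


def size_of_pages(pages: list[str]) -> int:
    """Bytes held by the page strings themselves."""
    return sum(sys.getsizeof(p) for p in pages)


def size_of_links(links_per_page: list[list[str]]) -> int:
    """Bytes held by the per-page lists plus the href strings inside them."""
    return sum(
        sys.getsizeof(hrefs) + sum(sys.getsizeof(h) for h in hrefs)
        for hrefs in links_per_page
    )
```

With made-up pages like these, `size_of_links([extract_links(p) for p in pages])` comes out smaller than `size_of_pages(pages)`, but the attributes dropped here (such as the metadata hashes) may be ones a real installer still needs, so the ratio is only indicative.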

pip version

24.1.1

Python version

3.12

OS

Linux

How to Reproduce

  1. Create and activate virtual environment
  2. python -m pip install memray
  3. python -m pip install --dry-run "apache-airflow[all]==2.9.2" (to fill cache)
  4. python -m memray run -m pip install --dry-run "apache-airflow[all]==2.9.2"
  5. python -m memray flamegraph memray-pip.*.bin
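
As a rough cross-check of the peak figure that does not depend on memray, the following sketch runs a command as a child process and reads its peak resident set size from the standard library. The helper name `peak_child_rss_mib` is hypothetical, and the number is not directly comparable to memray's: RSS is measured by the OS for the whole process, while memray attributes allocations to call stacks. It relies on the `resource` module, so it is Unix-only.

```python
import resource
import subprocess
import sys


def peak_child_rss_mib(cmd: list[str]) -> float:
    """Run cmd to completion and return the peak RSS of waited-for children in MiB.

    RUSAGE_CHILDREN reports the maximum over all children waited for so far,
    so call this once per interpreter for a clean reading.
    """
    subprocess.run(
        cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    max_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return max_rss / divisor
```

For the case above, the command list would be `[sys.executable, "-m", "pip", "install", "--dry-run", "apache-airflow[all]==2.9.2"]`, run inside the same virtual environment.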

Output

[image: memray flamegraph]


notatallshaw commented 3 months ago

Running with no cache, the peak memory usage jumps to 529 MB, and a second stack appears whose memory comes from an mmap call in CacheControl. I think that one is less significant, since the OS decides how much of an mmap is actually resident in memory, right?
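
For context on the mmap question, here is a small sketch of a read-only, file-backed mapping using only the standard library. It assumes typical OS behaviour: creating the mapping reserves address space for the whole file, pages are loaded as they are touched, and clean file-backed pages can be evicted under memory pressure. A profiler that counts the mmap call by its mapped length would report the full size either way, which may be why the stack looks large. The helper name `map_and_peek` is hypothetical and this is not the code path in pip or CacheControl.

```python
import mmap
import tempfile


def map_and_peek(size: int, peek: int = 16) -> bytes:
    """Map a temporary file of `size` bytes read-only and read its first `peek` bytes."""
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * size)
        f.flush()
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            mapped_length = len(m)  # the mapping spans the full file
            head = m[:peek]         # only the touched pages need to be loaded
    assert mapped_length == size
    return head
```

Calling `map_and_peek(64 * 1024 * 1024)` maps 64 MiB but reads only the first few bytes, so the mapped length alone says little about how much physical memory is in use.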

[image: memray flamegraph]