inducer / pytential

Evaluate layer and volume potentials accurately. Solve integral equations.
https://pypi.python.org/pypi/pytential

Try out pytest-memray #218

Closed alexfikl closed 1 year ago

alexfikl commented 1 year ago

Run memray on the CI to get a better idea of where the memory is going there. Not meant to be merged!

alexfikl commented 1 year ago

Running test_linalg_skeletonization locally showed these three tests to be the largest allocators:

Allocations results for test/test_linalg_skeletonization.py::test_skeletonize_by_proxy[<PyOpenCLArrayContext for <pyopencl.Device 'cpu-12th Gen Intel(R) Core(TM) i7-1255U' on 'Portable Computing Language'>>-case0]

📦 Total memory allocated: 1.0GiB
📏 Total allocations: 2136804
📊 Histogram of allocation sizes: |█ ▅ ▁    |
🥇 Biggest allocating functions:
- map_int_g:/mnt/data/code/projects/inducer/pytential/pytential/symbolic/matrix.py:438 -> 275.2MiB
- _get:/mnt/data/code/projects/inducer/pyopencl/pyopencl/array.py:855 -> 274.7MiB
- eye:/usr/lib/python3.11/site-packages/numpy/lib/twodim_base.py:211 -> 274.7MiB
- _create_built_program_from_source_cached:/mnt/data/code/projects/inducer/pyopencl/pyopencl/cache.py:440 -> 6.2MiB
- _create_built_program_from_source_cached:/mnt/data/code/projects/inducer/pyopencl/pyopencl/cache.py:440 -> 2.2MiB

Allocations results for test/test_linalg_skeletonization.py::test_skeletonize_symbolic[<PyOpenCLArrayContext for <pyopencl.Device 'cpu-12th Gen Intel(R) Core(TM) i7-1255U' on 'Portable Computing Language'>>-case2]

📦 Total memory allocated: 614.5MiB
📏 Total allocations: 6807634
📊 Histogram of allocation sizes: |█ ▃      |
🥇 Biggest allocating functions:
- _create_built_program_from_source_cached:/mnt/data/code/projects/inducer/pyopencl/pyopencl/cache.py:422 -> 258.4MiB
- program_build:/mnt/data/code/projects/inducer/pyopencl/pyopencl/__init__.py:737 -> 109.2MiB
- _create_built_program_from_source_cached:/mnt/data/code/projects/inducer/pyopencl/pyopencl/cache.py:440 -> 4.1MiB
- _create_built_program_from_source_cached:/mnt/data/code/projects/inducer/pyopencl/pyopencl/cache.py:422 -> 2.5MiB
- _create_built_program_from_source_cached:/mnt/data/code/projects/inducer/pyopencl/pyopencl/cache.py:440 -> 2.2MiB

Allocations results for test/test_linalg_skeletonization.py::test_skeletonize_symbolic[<PyOpenCLArrayContext for <pyopencl.Device 'cpu-12th Gen Intel(R) Core(TM) i7-1255U' on 'Portable Computing Language'>>-case3]

📦 Total memory allocated: 613.9MiB
📏 Total allocations: 6785701
📊 Histogram of allocation sizes: |█ ▃      |
🥇 Biggest allocating functions:
- _create_built_program_from_source_cached:/mnt/data/code/projects/inducer/pyopencl/pyopencl/cache.py:422 -> 259.7MiB
- program_build:/mnt/data/code/projects/inducer/pyopencl/pyopencl/__init__.py:737 -> 111.1MiB
- _create_built_program_from_source_cached:/mnt/data/code/projects/inducer/pyopencl/pyopencl/cache.py:440 -> 4.1MiB
- _create_built_program_from_source_cached:/mnt/data/code/projects/inducer/pyopencl/pyopencl/cache.py:440 -> 2.2MiB
- _create_built_program_from_source_cached:/mnt/data/code/projects/inducer/pyopencl/pyopencl/cache.py:440 -> 2.2MiB
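
As a sanity check on the first report, the three entries of roughly 275MiB (map_int_g, _get and eye) are each about the size of one dense 6000 x 6000 double-precision matrix. The sketch below only does that arithmetic. It assumes each entry is a single square float64 array, which the report does not confirm, and the helper names are made up for illustration.

```python
import math

MIB = 1024 ** 2  # memray reports sizes in binary units


def dense_matrix_mib(n: int, itemsize: int = 8) -> float:
    """Size in MiB of a dense n-by-n array with `itemsize` bytes per entry."""
    return n * n * itemsize / MIB


def square_side_for(size_mib: float, itemsize: int = 8) -> float:
    """Side length of a square array that would occupy `size_mib` MiB."""
    return math.sqrt(size_mib * MIB / itemsize)


# 274.7MiB is the size reported above for eye() and _get().
# Assumption: one square float64 array per entry.
print(round(square_side_for(274.7)))     # -> 6000
print(round(dense_matrix_mib(6000), 1))  # -> 274.7
```

If that assumption holds, the identity matrix, the device-to-host copy and the assembled matrix would together account for most of the 1.0GiB total of that test.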

alexfikl commented 1 year ago

https://github.com/inducer/pytential/actions/runs/5707169876/job/15463614259?pr=218

The CI reports different tests and different allocation sites as the biggest offenders:

Allocations results for test/test_scalar_int_eq.py::test_integral_equation[<PyOpenCLArrayContext for <pyopencl.Device 'cpu-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz' on 'Portable Computing Language'>>-case9]

     📦 Total memory allocated: 6.2GiB
     📏 Total allocations: 99038444
     📊 Histogram of allocation sizes: |▅ █ ▁    |
     🥇 Biggest allocating functions:
        - __init__:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyvkfft/opencl.py:117 -> 1.4GiB
        - _create_built_program_from_source_cached:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyopencl/cache.py:440 -> 646.5MiB
        - __init__:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyvkfft/opencl.py:117 -> 617.8MiB
        - __init__:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyvkfft/opencl.py:117 -> 439.5MiB
        - _create_built_program_from_source_cached:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyopencl/cache.py:440 -> 390.4MiB

Allocations results for test/test_layer_pot_eigenvalues.py::test_ellipse_eigenvalues[<PyOpenCLArrayContext for <pyopencl.Device 'cpu-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz' on 'Portable Computing Language'>>-2-7-5-False]

     📦 Total memory allocated: 3.3GiB
     📏 Total allocations: 15594623
     📊 Histogram of allocation sizes: |▂ █ ▂ ▁  |
     🥇 Biggest allocating functions:
        - _create_built_program_from_source_cached:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyopencl/cache.py:440 -> 456.4MiB
        - _create_built_program_from_source_cached:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyopencl/cache.py:440 -> 392.2MiB
        - _create_built_program_from_source_cached:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyopencl/cache.py:440 -> 328.3MiB
        - __init__:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyvkfft/opencl.py:117 -> 269.4MiB
        - _create_built_program_from_source_cached:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyopencl/cache.py:440 -> 264.3MiB

Allocations results for test/test_layer_pot_eigenvalues.py::test_ellipse_eigenvalues[<PyOpenCLArrayContext for <pyopencl.Device 'cpu-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz' on 'Portable Computing Language'>>-2-5-3-False]

     📦 Total memory allocated: 2.9GiB
     📏 Total allocations: 7504599
     📊 Histogram of allocation sizes: |▂ █ ▃ ▁  |
     🥇 Biggest allocating functions:
        - __init__:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyvkfft/opencl.py:117 -> 538.5MiB
        - __init__:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyvkfft/opencl.py:117 -> 474.5MiB
        - _create_built_program_from_source_cached:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyopencl/cache.py:440 -> 328.5MiB
        - __init__:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyvkfft/opencl.py:117 -> 90.5MiB
        - __init__:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyvkfft/opencl.py:117 -> 90.5MiB

Not quite sure what to believe :(
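
One way to make the local and CI reports easier to compare is to sum the entries that point at the same source location, since the same function and line shows up several times in each top-five list. The following is a rough sketch of that, not part of pytest-memray: the regular expression and the `total_by_location` helper are assumptions based only on the line format pasted above, and the totals cover just the five listed entries, not the whole test.

```python
import re
from collections import defaultdict

UNITS_IN_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}

# Matches lines such as "- eye:/path/to/twodim_base.py:211 -> 274.7MiB"
ENTRY = re.compile(r"-\s+(\w+):(\S+):(\d+)\s+->\s+([\d.]+)(KiB|MiB|GiB)")


def total_by_location(report: str) -> dict[str, float]:
    """Sum the reported sizes (in MiB) per 'function (file:line)'."""
    totals: dict[str, float] = defaultdict(float)
    for line in report.splitlines():
        match = ENTRY.search(line)
        if match is None:
            continue  # skip headers, histograms and blank lines
        func, path, lineno, value, unit = match.groups()
        filename = path.rsplit("/", 1)[-1]
        totals[f"{func} ({filename}:{lineno})"] += float(value) * UNITS_IN_MIB[unit]
    return dict(totals)


# The five entries reported by the CI for test_integral_equation[...-case9].
CASE9 = """
- __init__:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyvkfft/opencl.py:117 -> 1.4GiB
- _create_built_program_from_source_cached:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyopencl/cache.py:440 -> 646.5MiB
- __init__:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyvkfft/opencl.py:117 -> 617.8MiB
- __init__:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyvkfft/opencl.py:117 -> 439.5MiB
- _create_built_program_from_source_cached:/home/runner/work/pytential/pytential/.conda-root/envs/testing/lib/python3.11/site-packages/pyopencl/cache.py:440 -> 390.4MiB
"""

for location, mib in sorted(total_by_location(CASE9).items(), key=lambda kv: -kv[1]):
    print(f"{location}: {mib:.1f}MiB")
```

For case9 this gives about 2.4GiB for pyvkfft/opencl.py:117 and about 1.0GiB for pyopencl/cache.py:440, so within the listed entries the pyvkfft allocations dominate on the CI, while they do not appear at all in the local skeletonization runs.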

alexfikl commented 1 year ago

https://github.com/inducer/pytential/actions/runs/5714290658/job/15481331710?pr=218

Ran this with the attrs branch of pymbolic and CISUPPORT_PARALLEL_PYTEST=no, and it failed at test_scalar_int_eq, as prophesied by https://github.com/inducer/pytential/pull/218#issuecomment-1657233095

test_beltrami.py ....                                                    [  1%]
test_cost_model.py ................                                      [  8%]
test_global_qbx.py ........                                              [ 12%]
test_layer_pot.py ..........                                             [ 16%]
test_layer_pot_eigenvalues.py .........                                  [ 20%]
test_layer_pot_identity.py ..
Error: The operation was canceled.

@inducer So the current guess: the test_scalar_int_eq test goes from 6.2GiB on main to something larger on attrs and crashes the CI? Does that match anything you've been seeing?