bloomberg / memray

Memray is a memory profiler for Python
https://bloomberg.github.io/memray/
Apache License 2.0
13.36k stars 397 forks

%%memray_flamegraph magic options #519

Closed lcrmorin closed 6 months ago

lcrmorin commented 10 months ago

Is there an existing proposal for this?

Is your feature request related to a problem?

I am trying to personalise memray for more advanced usage. With the %%memray magic I can't (1) specify the output path (--output option) or (2) specify that I am mainly interested in the plots (--temporal?).

Describe the solution you'd like

I'd like the %%memray magic to have the same options as the command line tool.

Alternatives you considered

No response

godlygeek commented 10 months ago

I'd happily take a PR that adds --temporal support. %%memray_flamegraph has its own copy of the argument parsing from src/memray/commands/common.py, so this probably would just require copy-pasting the code for handling temporal mode from there into src/memray/_ipython/flamegraph.py, and seeing if we can add a test for it in tests/integration/test_ipython.py.
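For readers unfamiliar with how a cell magic can parse CLI-style options, here is a minimal sketch using only the standard library. It is not memray's actual code: `build_parser` and `parse_magic_line` are hypothetical names, and the only options modelled are two flags mentioned in this thread.

```python
import argparse
import shlex


def build_parser():
    # Hypothetical parser; the option names mirror flags discussed in this
    # thread and are not copied from memray's source.
    parser = argparse.ArgumentParser(prog="%%memray_flamegraph", add_help=False)
    parser.add_argument("--leaks", action="store_true")
    parser.add_argument("--temporal", action="store_true")
    return parser


def parse_magic_line(line):
    """Split the text after the magic name like a shell would, then parse it."""
    return build_parser().parse_args(shlex.split(line))


options = parse_magic_line("--temporal --leaks")
print(options.temporal, options.leaks)
```

Keeping the parser in one small function is what makes it easy to add a flag and cover it with a test, which is roughly the shape of change described above.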

--output-path might be a little bit trickier, though. Currently %%memray_flamegraph is designed under the assumption that you want to display the generated flamegraph inside the Jupyter notebook. Are you saying that you'd like to have an option where instead it writes that generated flame graph as a .html file on disk?

lcrmorin commented 10 months ago

Thanks, I'll look into this. For the path, maybe the way to go is to allow customizing the task id; right now I get random strings (the cell id, maybe?). Now I have another problem: I ran code overnight and memray seems to have bloated the disk. Is there any option to reduce the disk usage (typically the sampling frequency)? Best,

godlygeek commented 10 months ago

Memray is a tracing profiler, not a sampling profiler, so there's no concept of "sampling frequency" that you can apply. memray run itself supports a --aggregate mode that performs in-memory aggregation, resulting in higher memory usage and lower disk usage, but it's incompatible with the --temporal report you said you're interested in. Temporal reports inherently need point-in-time data about what is allocated at each moment in time, while the aggregated capture file format only retains information about what's on the heap at the high water mark and at the point when tracking is terminated.
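The trade-off described above can be illustrated with a toy model. This is only a sketch of the idea, not memray's capture format or implementation: the event tuples, `aggregate`, and `usage_over_time` are all made up for illustration. The aggregated summary keeps two snapshots (the heap at the high water mark and at the end), while a temporal view needs every event.

```python
from collections import defaultdict

# Toy event stream: (operation, allocation id, size in bytes, location).
EVENTS = [
    ("alloc", 1, 100, "load_data"),
    ("alloc", 2, 300, "build_index"),
    ("free", 1, 100, "load_data"),
    ("alloc", 3, 50, "cache"),
    ("free", 2, 300, "build_index"),
]


def totals_by_location(live):
    """Sum the sizes of the currently live allocations per location."""
    totals = defaultdict(int)
    for size, location in live.values():
        totals[location] += size
    return dict(totals)


def aggregate(events):
    """Keep only two snapshots: the heap at its peak and the heap at the end."""
    live = {}
    current = peak = 0
    at_peak = {}
    for op, alloc_id, size, location in events:
        if op == "alloc":
            live[alloc_id] = (size, location)
            current += size
            if current > peak:
                peak = current
                at_peak = totals_by_location(live)
        else:
            del live[alloc_id]
            current -= size
    return {
        "peak_bytes": peak,
        "at_peak": at_peak,
        "at_exit": totals_by_location(live),
    }


def usage_over_time(events):
    """A temporal view needs every event, which the aggregate discards."""
    current = 0
    timeline = []
    for op, _alloc_id, size, _location in events:
        current += size if op == "alloc" else -size
        timeline.append(current)
    return timeline


summary = aggregate(EVENTS)
timeline = usage_over_time(EVENTS)
```

The summary stays the same size no matter how long the program runs, whereas the timeline grows with the number of events; that is why the aggregated form saves disk space but cannot answer "what was allocated at time t".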

In any event, it doesn't look like we added support for --aggregate to %%memray_flamegraph. That's an oversight - it probably should be the default. Maybe it shouldn't even be possible to turn it off... as things stand today, %%memray_flamegraph doesn't retain the capture file, only the flamegraph, and it can generate the flamegraph, both in --leaks mode and the default high water mark mode, with the smaller --aggregate capture file.

godlygeek commented 6 months ago

#520 and #538 addressed the feature gap between the %%memray_flamegraph IPython magic and the memray flamegraph CLI, and helped to reduce the amount of disk space used by the magic.