Closed sarahmonod closed 5 months ago
@gusmonod I am gonna pick this up adding some docs here in the issue; then we can think of the best place to commit them
When a process runs out of memory, it is common for it to be killed by the operating system. Under an orchestrator like Kubernetes, the termination of the container might cause the loss of the files that memray uses to collect its results. More specifically, when running memray run myprogram.py
the memray Tracer creates a capture file on the file system, but that file system is thrown away as soon as the orchestrator cleans up the container.
This condition might be particularly common as memray is often used to chase memory leaks.
Memray itself cannot post-process the file (sending it over the network for example) because any work it does will be interrupted when the process crashes.
Instead of calling memray run myprogram.py directly,
we are going to call a script that runs memray, handles its OOM termination, and then runs post-processing operations on the capture file.
memray run --output /tmp/capture.bin myprogram.py
echo "Program finished"
# do your post processing here
memray summary /tmp/capture.bin
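A slightly fuller sketch of the same wrapper, in case it helps the docs. It assumes the OOM killer terminates only the memray child process while the wrapper shell survives, and that PERSIST_DIR points somewhere that outlives the container (for example a mounted volume). The helper names, the paths, and the choice of gzip are all illustrative assumptions, not part of memray.

```shell
#!/bin/sh
# Hypothetical wrapper; CAPTURE, PERSIST_DIR and both helpers are illustrative.
CAPTURE="${CAPTURE:-/tmp/capture.bin}"
# Assumption: this directory survives container cleanup (e.g. a mounted volume).
PERSIST_DIR="${PERSIST_DIR:-/tmp/memray-results}"

# Describe how the traced program ended, based on its shell exit status.
# A child killed by SIGKILL (signal 9) is reported by the shell as 128 + 9 = 137.
describe_exit() {
    case "$1" in
        0)   echo "finished normally" ;;
        137) echo "killed by SIGKILL (often the OOM killer)" ;;
        *)   echo "exited with status $1" ;;
    esac
}

# Compress the capture file ($1) into the persistent directory ($2).
save_capture() {
    [ -f "$1" ] || { echo "no capture file at $1" >&2; return 1; }
    mkdir -p "$2" && gzip -c "$1" > "$2/$(basename "$1").gz"
}

# Run memray as a child process so that this script can keep going
# after the child has been killed.
if command -v memray >/dev/null 2>&1; then
    memray run --output "$CAPTURE" myprogram.py
    status=$?
    echo "Program $(describe_exit "$status")"
    # do your post processing here
    save_capture "$CAPTURE" "$PERSIST_DIR"
fi
```

Sending the compressed file over the network would replace or follow save_capture; the destination depends on the deployment, so it is left out here.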
Addressed by #605
Is there an existing proposal for this?
Is your feature request related to a problem?
When a process runs out of memory in a container, it is common for it to be killed by its orchestrator, e.g. Kubernetes. This means that running
memray run ...
would create a file on the file system, which will be thrown away as soon as the process crashes with an OOM (Out Of Memory) error.
Memray itself cannot post-process the file (sending it over the network for example) because any work it does will be interrupted when the process crashes. So even though we already have a part of our code that compresses the capture file after it's finished writing, memray couldn't call that code because the process would be interrupted already.
Describe the solution you'd like
Adding documentation on how to wrap
memray run ...
with a shell script to (for example) compress and send the capture file over the network for later analysis would be helpful, given that we have at least two different people who have come to us with this problem already.
Alternatives you considered
No response