🔨Work Item

IMPORTANT:
This template is only for the dev team to track project progress. For feature requests or bug reports, please use the corresponding issue templates.
DO NOT create a new work item if the purpose is to fix an existing issue or feature request. We will directly use the issue in the project tracker.
Project tracker: https://github.com/orgs/dmlc/projects/2

Description
This issue will be used to track the discussion of memory-related issues of a Python process, in the context of the distributed graph-partitioning pipeline.
The goal of this effort is to handle a very large number of nodes/edges per graph partition on each EC2 instance during the pipeline's normal processing.
Currently we use "/proc/[pid]/status" to print the pipeline's memory usage over the course of its execution. We take a snapshot of the following items (see the sketch after this list):
VmPeak - Peak virtual memory size reached so far during the execution of the pipeline
VmSize - Total size of the virtual address space currently used by the process
VmRSS - Size of the memory that is resident in RAM at the moment
VmData - Size of the data segment of the process
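For reference, below is a minimal sketch of how such a snapshot can be taken. It is Linux-only (it reads procfs), and the helper name `print_memory_snapshot` is illustrative rather than part of the pipeline's actual code.

```python
import os

# Fields we snapshot from /proc/[pid]/status; the kernel reports them in kB.
_FIELDS = ("VmPeak", "VmSize", "VmRSS", "VmData")

def print_memory_snapshot(pid=None):
    """Print the selected Vm* fields from /proc/[pid]/status (Linux only)."""
    pid = os.getpid() if pid is None else pid
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            key, _, value = line.partition(":")
            if key in _FIELDS:
                print(f"{key}: {value.strip()}")

print_memory_snapshot()  # e.g. "VmPeak: 123456 kB", one line per field
```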
Depending work items or issues