I have built a workflow that just calls APIs and does very little processing, on tiny JSON documents. The workflow runs in waves, where one wave cannot run until all previous waves have completed. There are around 10 waves and each wave comprises around 1400 tasks.
Some of the tasks in each wave also depend on tasks that run at the very beginning, before even the first wave runs.
This all runs with the local scheduler in a controlled environment.
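To make the shape of the graph concrete, here is a minimal sketch of what I mean. The task names (`Setup`, `ApiCallTask`, `Wave`) and outputs are made up for illustration; the real workflow has ~10 waves of ~1400 tasks each:

```python
import luigi


class Setup(luigi.Task):
    """Runs once, before even the first wave."""

    def output(self):
        return luigi.LocalTarget("out/setup.json")

    def run(self):
        with self.output().open("w") as f:
            f.write("{}")


class ApiCallTask(luigi.Task):
    """One of the ~1400 tasks in a wave: calls an API and writes a tiny JSON document."""

    wave = luigi.IntParameter()
    index = luigi.IntParameter()

    def requires(self):
        deps = [Setup()]
        if self.wave > 0:
            # a wave cannot run until the whole previous wave has completed
            deps.append(Wave(wave=self.wave - 1))
        return deps

    def output(self):
        return luigi.LocalTarget("out/wave-{}/task-{}.json".format(self.wave, self.index))

    def run(self):
        with self.output().open("w") as f:
            f.write("{}")


class Wave(luigi.WrapperTask):
    """Barrier task: complete only once every task in this wave is complete."""

    wave = luigi.IntParameter()

    def requires(self):
        return [ApiCallTask(wave=self.wave, index=i) for i in range(1400)]
```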
I am seeing very high CPU and memory usage. BTW, it is even higher in the latest version.
I looked into the code previously, when this was a problem, and noticed that the workers are forked processes, which appeared to be causing a large spike.
I noticed that switching over to the shared scheduler eased this problem significantly.
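For reference, this is roughly how I am kicking things off (again just a sketch, reusing the hypothetical `Wave` task from above). As I understand it, with the local scheduler all the scheduling state lives in this one process, and each task runs in a forked child process when there is more than one worker:

```python
import luigi

from pipeline import Wave  # hypothetical module containing the Wave task sketched above

if __name__ == "__main__":
    # Local scheduler: the scheduler state, the ~14000 task objects and the
    # worker bookkeeping all live in this single process.
    luigi.build([Wave(wave=9)], local_scheduler=True, workers=8)

    # What I switched to: point the same workers at a long-running shared
    # (central) scheduler, i.e. luigid, which keeps the scheduling state out
    # of this process:
    # luigi.build([Wave(wave=9)], local_scheduler=False, workers=8)
```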
Are there any strategies or guidance for keeping memory usage low? For example, should I be using static dependencies in `requires()`, or would dynamic dependencies (yielded from `run()`) be better?
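To make that second question concrete, here is a sketch of the two styles I am comparing, with a hypothetical `TinyApiCall` leaf task standing in for one of the per-wave API calls:

```python
import luigi


class TinyApiCall(luigi.Task):
    """Stand-in for one of the ~1400 per-wave API-calling tasks."""

    index = luigi.IntParameter()

    def output(self):
        return luigi.LocalTarget("out/call-{}.json".format(self.index))

    def run(self):
        with self.output().open("w") as f:
            f.write("{}")


class StaticWave(luigi.Task):
    """Static style: every dependency is declared up front in requires(),
    so the scheduler sees the whole fan-out before anything runs."""

    def requires(self):
        return [TinyApiCall(index=i) for i in range(1400)]

    def output(self):
        return luigi.LocalTarget("out/static-wave.done")

    def run(self):
        with self.output().open("w") as f:
            f.write("done")


class DynamicWave(luigi.Task):
    """Dynamic style: dependencies are yielded from run(), so they are only
    materialised once this task actually starts executing."""

    def output(self):
        return luigi.LocalTarget("out/dynamic-wave.done")

    def run(self):
        # yielding a list suspends this task until every yielded task is complete
        yield [TinyApiCall(index=i) for i in range(1400)]
        with self.output().open("w") as f:
            f.write("done")
```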