Description:
Running inference with the enable_workflow (Ray Workflow) option causes all processes to be pinned to a single core.
Steps to Reproduce:
Follow the conda or container installation instructions and run inference with the --enable_workflow option
Expected Behavior:
It is expected that the workload would be spread across the resources available in the Ray cluster, i.e. processes should run on different cores
Actual Behavior:
All processes and Ray workers have the same CPU/core affinity:
In top, notice the P column is all 0 for FastFold processes
This is also confirmed with taskset (output is truncated)
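The same check can be done from inside Python. This sketch queries the affinity mask of the current process with `os.sched_getaffinity` (Linux-only); a pinned process reports a single core, e.g. `{0}`, matching the behavior described above:

```python
import os

# Query the set of cores this process is allowed to run on (Linux).
# When the bug manifests, every FastFold/Ray worker reports a
# single-element set such as {0} instead of all available cores.
affinity = os.sched_getaffinity(0)
print(f"process may run on {len(affinity)} core(s): {sorted(affinity)}")
```

Running this inside each Ray worker (e.g. via a remote task) is equivalent to inspecting the P column in top or the taskset output.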
Environment:
Steps Taken to Resolve: This appears to be a torch issue; see ray-project/ray/issues/34201 and pytorch/pytorch/issues/99625. One workaround is to set KMP_AFFINITY to disabled before running inference:
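A minimal sketch of the workaround, assuming inference is launched from a Python entry point: set the variable before torch (and hence the Intel OpenMP runtime, which reads `KMP_AFFINITY` at initialization) is imported:

```python
import os

# Workaround: disable libiomp's thread-affinity pinning so Ray workers
# are not all bound to core 0. This must happen before `import torch`,
# because KMP_AFFINITY is read when the OpenMP runtime initializes.
os.environ["KMP_AFFINITY"] = "disabled"

# ... then import torch / launch FastFold inference as usual.
```

Equivalently, `export KMP_AFFINITY=disabled` in the shell before invoking the inference script has the same effect.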