epam / cloud-pipeline

Cloud agnostic genomics analysis, scientific computation and storage platform
https://cloud-pipeline.com
Apache License 2.0

Clusters - enable GPU scaling #2555

Open sidoruka opened 2 years ago

sidoruka commented 2 years ago

Introduce GUI support for #1521

  1. Add "Enable GPU scaling" (below "Enable Hybrid cluster")
  2. Help icon - text TBD
  3. Add ge.autoscaling.scale.multi.queues.template preference
    • InstanceTypeCPU - allow only non-GPU instances
    • InstanceTypeGPU - allow only GPU instances
    • No <cloud> section, or it is empty: do not show the "Enable GPU scaling" checkbox
    • No Hybrid/General section, or it is empty: do not show the "Enable GPU scaling" checkbox
    • No FamilyTypeCPU/FamilyTypeGPU/InstanceTypeCPU/InstanceTypeGPU: do not show the respective controls (DDLs or TBs)
{
    '<cloud>': {
        'Hybrid': {
            'FamilyTypeCPU': {
                'Param': 'CP_CAP_AUTOSCALE_HYBRID_FAMILY_1',
                'DefaultValue': 'm5'
            },
            'FamilyTypeGPU': {
                'Param': 'CP_CAP_AUTOSCALE_HYBRID_FAMILY_2',
                'DefaultValue': 'p2'
            },
            'CP_CAP_AUTOSCALE_1': 'true',
            'CP_CAP_AUTOSCALE_2': 'true',
            'CP_CAP_SGE_QUEUE_NAME_1': 'cpu.q',
            'CP_CAP_SGE_QUEUE_NAME_2': 'gpu.q'
        },
        'General': {
            'InstanceTypeCPU': {
                'Param': 'CP_CAP_AUTOSCALE_INSTANCE_TYPE_1',
                'DefaultValue': 'm5.large'
            },
            'InstanceTypeGPU': {
                'Param': 'CP_CAP_AUTOSCALE_INSTANCE_TYPE_2',
                'DefaultValue': 'p2.xlarge'
            },
            'CP_CAP_AUTOSCALE_1': 'true',
            'CP_CAP_AUTOSCALE_2': 'true',
            'CP_CAP_SGE_QUEUE_NAME_1': 'cpu.q',
            'CP_CAP_SGE_QUEUE_NAME_2': 'gpu.q'
        }
    }
}
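A rough sketch of how the rules above might be evaluated against the template, in case it helps to discuss the expected behaviour. This is not the actual GUI or backend code. The function names, the 'aws' cloud key and the assumption that the Hybrid section applies when "Enable Hybrid cluster" is checked (General otherwise) are all illustrative guesses.

```python
# Control entries that may appear in a template section (taken from the template in this issue).
CONTROL_KEYS = ("FamilyTypeCPU", "FamilyTypeGPU", "InstanceTypeCPU", "InstanceTypeGPU")


def resolve_section(template, cloud, hybrid):
    """Return the template section for the given cloud and cluster mode.

    Assumption: 'Hybrid' is used for hybrid clusters and 'General' otherwise.
    A missing or empty template, cloud entry or section yields an empty dict.
    """
    cloud_section = (template or {}).get(cloud) or {}
    return cloud_section.get("Hybrid" if hybrid else "General") or {}


def show_gpu_scaling_checkbox(section):
    """The checkbox is shown only if the resolved section is non-empty."""
    return bool(section)


def visible_controls(section):
    """List the control entries present in the section; absent ones are not rendered."""
    return [key for key in CONTROL_KEYS if key in section]


def run_parameters(section, selections=None):
    """Flatten a section into run parameters.

    Control entries (dicts with 'Param' and 'DefaultValue') take the user's
    selection or fall back to the default; plain entries are passed through.
    """
    selections = selections or {}
    params = {}
    for key, value in section.items():
        if isinstance(value, dict):
            params[value["Param"]] = selections.get(key, value["DefaultValue"])
        else:
            params[key] = value
    return params


if __name__ == "__main__":
    # Hypothetical preference value; 'aws' stands in for the <cloud> placeholder.
    template = {
        "aws": {
            "General": {
                "InstanceTypeCPU": {
                    "Param": "CP_CAP_AUTOSCALE_INSTANCE_TYPE_1",
                    "DefaultValue": "m5.large",
                },
                "InstanceTypeGPU": {
                    "Param": "CP_CAP_AUTOSCALE_INSTANCE_TYPE_2",
                    "DefaultValue": "p2.xlarge",
                },
                "CP_CAP_AUTOSCALE_1": "true",
                "CP_CAP_AUTOSCALE_2": "true",
                "CP_CAP_SGE_QUEUE_NAME_1": "cpu.q",
                "CP_CAP_SGE_QUEUE_NAME_2": "gpu.q",
            }
        }
    }
    section = resolve_section(template, "aws", hybrid=False)
    print(show_gpu_scaling_checkbox(section))
    print(visible_controls(section))
    print(run_parameters(section, {"InstanceTypeGPU": "p2.8xlarge"}))
```

With this reading, a template that has only a General section would hide the checkbox for hybrid clusters. If the intent is to show it whenever either section exists, resolve_section would need to change accordingly.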
rodichenko commented 2 years ago

@sidoruka GUI part implemented (9e9ea43daf565b93361647ba2636d4c71f6aecf6), backported to release/0.16 (389fbf5783c280cdfdd662ed646a7e20b379802a)

tcibinan commented 2 years ago

Same for the backend changes. Cherry-picked to release/0.16 via e491b767158004fdd8b2f86a9f0d616237be6a8f.