Cloudac7 opened 4 days ago
If it is convenient for you, could you please provide the QoS information in the configuration of the partitions related to the QoS above via scontrol show partition, as well as the partition configuration of each QoS via sacctmgr show qos format=Name,Partition, to help us investigate this problem further?
The output of scontrol show partition is as follows:
# scontrol show partition
PartitionName=cpu
AllowGroups=ALL AllowAccounts=ai4ecaig,ai4ecailoc,ai4ecall,ai4ecccg,ai4ecctmig,ai4ececg,ai4eceeg,ai4ecepg,ai4ecmig,ai4ecnimte,baoljgroup,bnulizdgroup,brengroup,caogroup,caoshgroup,caoxrgroup,caoxygroup,caozxgroup,cfdai,chenggroup,chengjungroup,chenhygroup,chenlingroup,chxgroup,cpddai,csygroup,dengxianming,dicpyuliang,dpikkem,duanamgroup,dwzhougroup,fangngroup,fengmingbaogroup,gonglgroup,gxpgroup,hciscgroup,houxugroup,hthiumtest,huanghlgroup,huangjlgroup,huangqlgroup,huangweigroup,huangwengroup,hujungroup,hwjgroup,jfligroup,jinyugroup,kechgroup,lichgroup,lijinggroup,lintianweigroup,liswgroup,liugkgroup,liuhygroup,liyegroup,luoyuanronggroup,luweihuagroup,lvtygroup,maruigroup,maslgroup,mengcgroup,mslgroup,nfanggroup,nfzhenggroup,pavlogroup,qgzhanggroup,qikaigroup,rjxiegroup,shuaiwanggroup,songkaixingroup,sungroup,test,test1,test2,tianxingwugroup,tianygroup,tuzhangroup,txionggroup,ustbhushuxian,wangcgroup,wangjgroup,wangslgroup,wangtinggroup,wbjgroup,wcgroup,wenyhgroup,wucxgroup,wusqgroup,xinlugroup,xmuchemcamp,xmuewccgroup,xmuldk,xuehuijiegroup,yigroup,yijgroup,yixiaodonggroup,youycgroup,yuhrgroup,yushilingroup,ywjianggroup,zenghuabingroup,zhandpgroup,zhanghcgroup,zhangqianggroup,zhangyygroup,zhangzengkaigroup,zhangzhgroup,zhangzhongnangroup,zhaohonggroup,zhaoyungroup,zhengqianggroup,zhengxhgroup,zhouweigroup,zhujungroup,zhuzzgroup,zlonggroup,zpmaogroup,ai4ecxmri,ai4ecgeely,zhanghuiminggroup,ai4ec,ai4ecspectr,sunyfgroup,lishaobingroup,huangkgroup,fugroup AllowQos=normal,long
AllocNodes=ALL Default=NO QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
Nodes=cu[001-389]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=24896 TotalNodes=389 SelectTypeParameters=NONE
JobDefaults=(null)
DefMemPerCPU=1024 MaxMemPerCPU=4096
PartitionName=gpu
AllowGroups=ALL AllowAccounts=ai4ecaig,ai4ecailoc,ai4ecall,ai4ecccg,ai4ecctmig,ai4ececg,ai4eceeg,ai4ecepg,ai4ecmig,ai4ecnimte,baoljgroup,bnulizdgroup,brengroup,caogroup,caoshgroup,caoxrgroup,caoxygroup,caozxgroup,cfdai,chenggroup,chengjungroup,chenhygroup,chenlingroup,chxgroup,cpddai,csygroup,dengxianming,dicpyuliang,dpikkem,duanamgroup,dwzhougroup,fangngroup,fengmingbaogroup,gonglgroup,gxpgroup,hciscgroup,houxugroup,hthiumtest,huanghlgroup,huangjlgroup,huangqlgroup,huangweigroup,huangwengroup,hujungroup,hwjgroup,jfligroup,jinyugroup,kechgroup,lichgroup,lijinggroup,lintianweigroup,liswgroup,liugkgroup,liuhygroup,liyegroup,luoyuanronggroup,luweihuagroup,lvtygroup,maruigroup,maslgroup,mengcgroup,mslgroup,nfanggroup,nfzhenggroup,pavlogroup,qgzhanggroup,qikaigroup,rjxiegroup,shuaiwanggroup,songkaixingroup,sungroup,test,test1,test2,tianxingwugroup,tianygroup,tuzhangroup,txionggroup,ustbhushuxian,wangcgroup,wangjgroup,wangslgroup,wangtinggroup,wbjgroup,wcgroup,wenyhgroup,wucxgroup,wusqgroup,xinlugroup,xmuchemcamp,xmuewccgroup,xmuldk,xuehuijiegroup,yigroup,yijgroup,yixiaodonggroup,youycgroup,yuhrgroup,yushilingroup,ywjianggroup,zenghuabingroup,zhandpgroup,zhanghcgroup,zhangqianggroup,zhangyygroup,zhangzengkaigroup,zhangzhgroup,zhangzhongnangroup,zhaohonggroup,zhaoyungroup,zhengqianggroup,zhengxhgroup,zhouweigroup,zhujungroup,zhuzzgroup,zlonggroup,zpmaogroup,ai4ecxmri,ai4ecgeely,zhanghuiminggroup,ai4ec,ai4ecspectr,sunyfgroup,lishaobingroup,huangkgroup,fugroup AllowQos=normal,long
AllocNodes=ALL Default=NO QoS=gpu_qos
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
Nodes=gpu[001-006]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=384 TotalNodes=6 SelectTypeParameters=NONE
JobDefaults=(null)
DefMemPerCPU=16384 MaxMemPerCPU=24576
PartitionName=fat
AllowGroups=ALL AllowAccounts=ai4ecaig,ai4ecailoc,ai4ecall,ai4ecccg,ai4ecctmig,ai4ececg,ai4eceeg,ai4ecepg,ai4ecmig,ai4ecnimte,baoljgroup,bnulizdgroup,brengroup,caogroup,caoshgroup,caoxrgroup,caoxygroup,caozxgroup,cfdai,chenggroup,chengjungroup,chenhygroup,chenlingroup,chxgroup,cpddai,csygroup,dengxianming,dicpyuliang,dpikkem,duanamgroup,dwzhougroup,fangngroup,fengmingbaogroup,gonglgroup,gxpgroup,hciscgroup,houxugroup,hthiumtest,huanghlgroup,huangjlgroup,huangqlgroup,huangweigroup,huangwengroup,hujungroup,hwjgroup,jfligroup,jinyugroup,kechgroup,lichgroup,lijinggroup,lintianweigroup,liswgroup,liugkgroup,liuhygroup,liyegroup,luoyuanronggroup,luweihuagroup,lvtygroup,maruigroup,maslgroup,mengcgroup,mslgroup,nfanggroup,nfzhenggroup,pavlogroup,qgzhanggroup,qikaigroup,rjxiegroup,shuaiwanggroup,songkaixingroup,sungroup,test,test1,test2,tianxingwugroup,tianygroup,tuzhangroup,txionggroup,ustbhushuxian,wangcgroup,wangjgroup,wangslgroup,wangtinggroup,wbjgroup,wcgroup,wenyhgroup,wucxgroup,wusqgroup,xinlugroup,xmuchemcamp,xmuewccgroup,xmuldk,xuehuijiegroup,yigroup,yijgroup,yixiaodonggroup,youycgroup,yuhrgroup,yushilingroup,ywjianggroup,zenghuabingroup,zhandpgroup,zhanghcgroup,zhangqianggroup,zhangyygroup,zhangzengkaigroup,zhangzhgroup,zhangzhongnangroup,zhaohonggroup,zhaoyungroup,zhengqianggroup,zhengxhgroup,zhouweigroup,zhujungroup,zhuzzgroup,zlonggroup,zpmaogroup,ai4ecxmri,ai4ecgeely,zhanghuiminggroup,ai4ec,ai4ecspectr,sunyfgroup,lishaobingroup,huangkgroup,fugroup AllowQos=normal,long
AllocNodes=ALL Default=YES QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
Nodes=fat[001-002]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=128 TotalNodes=2 SelectTypeParameters=NONE
JobDefaults=(null)
DefMemPerCPU=8192 MaxMemPerCPU=32768
PartitionName=dpcpu
AllowGroups=dpikkem AllowAccounts=ai4ecaig,ai4ecailoc,ai4ecall,ai4ecccg,ai4ecctmig,ai4ececg,ai4eceeg,ai4ecepg,ai4ecmig,ai4ecnimte,baoljgroup,bnulizdgroup,brengroup,caogroup,caoshgroup,caoxrgroup,caoxygroup,caozxgroup,cfdai,chenggroup,chengjungroup,chenhygroup,chenlingroup,chxgroup,cpddai,csygroup,dengxianming,dicpyuliang,dpikkem,duanamgroup,dwzhougroup,fangngroup,fengmingbaogroup,gonglgroup,gxpgroup,hciscgroup,houxugroup,hthiumtest,huanghlgroup,huangjlgroup,huangqlgroup,huangweigroup,huangwengroup,hujungroup,hwjgroup,jfligroup,jinyugroup,kechgroup,lichgroup,lijinggroup,lintianweigroup,liswgroup,liugkgroup,liuhygroup,liyegroup,luoyuanronggroup,luweihuagroup,lvtygroup,maruigroup,maslgroup,mengcgroup,mslgroup,nfanggroup,nfzhenggroup,pavlogroup,qgzhanggroup,qikaigroup,rjxiegroup,shuaiwanggroup,songkaixingroup,sungroup,test,test1,test2,tianxingwugroup,tianygroup,tuzhangroup,txionggroup,ustbhushuxian,wangcgroup,wangjgroup,wangslgroup,wangtinggroup,wbjgroup,wcgroup,wenyhgroup,wucxgroup,wusqgroup,xinlugroup,xmuchemcamp,xmuewccgroup,xmuldk,xuehuijiegroup,yigroup,yijgroup,yixiaodonggroup,youycgroup,yuhrgroup,yushilingroup,ywjianggroup,zenghuabingroup,zhandpgroup,zhanghcgroup,zhangqianggroup,zhangyygroup,zhangzengkaigroup,zhangzhgroup,zhangzhongnangroup,zhaohonggroup,zhaoyungroup,zhengqianggroup,zhengxhgroup,zhouweigroup,zhujungroup,zhuzzgroup,zlonggroup,zpmaogroup,ai4ecxmri,ai4ecgeely,zhanghuiminggroup,ai4ec,ai4ecspectr,sunyfgroup,lishaobingroup,huangkgroup,fugroup AllowQos=unlimit
AllocNodes=ALL Default=NO QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
Nodes=cu[001-300]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=19200 TotalNodes=300 SelectTypeParameters=NONE
JobDefaults=(null)
DefMemPerCPU=1024 MaxMemPerCPU=4096
PartitionName=dpgpu
AllowGroups=dpikkem AllowAccounts=ai4ecaig,ai4ecailoc,ai4ecall,ai4ecccg,ai4ecctmig,ai4ececg,ai4eceeg,ai4ecepg,ai4ecmig,ai4ecnimte,baoljgroup,bnulizdgroup,brengroup,caogroup,caoshgroup,caoxrgroup,caoxygroup,caozxgroup,cfdai,chenggroup,chengjungroup,chenhygroup,chenlingroup,chxgroup,cpddai,csygroup,dengxianming,dicpyuliang,dpikkem,duanamgroup,dwzhougroup,fangngroup,fengmingbaogroup,gonglgroup,gxpgroup,hciscgroup,houxugroup,hthiumtest,huanghlgroup,huangjlgroup,huangqlgroup,huangweigroup,huangwengroup,hujungroup,hwjgroup,jfligroup,jinyugroup,kechgroup,lichgroup,lijinggroup,lintianweigroup,liswgroup,liugkgroup,liuhygroup,liyegroup,luoyuanronggroup,luweihuagroup,lvtygroup,maruigroup,maslgroup,mengcgroup,mslgroup,nfanggroup,nfzhenggroup,pavlogroup,qgzhanggroup,qikaigroup,rjxiegroup,shuaiwanggroup,songkaixingroup,sungroup,test,test1,test2,tianxingwugroup,tianygroup,tuzhangroup,txionggroup,ustbhushuxian,wangcgroup,wangjgroup,wangslgroup,wangtinggroup,wbjgroup,wcgroup,wenyhgroup,wucxgroup,wusqgroup,xinlugroup,xmuchemcamp,xmuewccgroup,xmuldk,xuehuijiegroup,yigroup,yijgroup,yixiaodonggroup,youycgroup,yuhrgroup,yushilingroup,ywjianggroup,zenghuabingroup,zhandpgroup,zhanghcgroup,zhangqianggroup,zhangyygroup,zhangzengkaigroup,zhangzhgroup,zhangzhongnangroup,zhaohonggroup,zhaoyungroup,zhengqianggroup,zhengxhgroup,zhouweigroup,zhujungroup,zhuzzgroup,zlonggroup,zpmaogroup,ai4ecxmri,ai4ecgeely,zhanghuiminggroup,ai4ec,ai4ecspectr,sunyfgroup,lishaobingroup,huangkgroup,fugroup AllowQos=unlimit
AllocNodes=ALL Default=NO QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
Nodes=gpu[001-006]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=384 TotalNodes=6 SelectTypeParameters=NONE
JobDefaults=(null)
DefMemPerCPU=16384 MaxMemPerCPU=24576
And the output of sacctmgr show qos format=Name,Partition (for easier reading I also exported MaxWall and MinTRES%20):
# sacctmgr show qos format=Name,Partition,MaxWall,MinTRES%20
Name Partition MaxWall MinTRES
---------- ---------- ----------- --------------------
normal 2-00:00:00
long 4-00:00:00
unlimit
gpu_qos gres/gpu:tesla=1
Thanks for your reply. From your partition information we can see that for PartitionName=gpu you have AllowQos=normal,long together with QoS=gpu_qos. In our view this is not a reasonable combination, and it causes errors when the partition QoS is read. To resolve your current problem, we suggest not specifying a value for QoS, i.e. keeping it as N/A, and setting AllowQos=normal,long,gpu_qos. Please confirm whether the partition QoS pricing problem or billing problem described above still occurs once QoS is configured this way. If you need to be able to specify a default QoS, we will consider improving support for that later.
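For reference, a minimal sketch of what the suggested change could look like, assuming the gpu partition is defined statically in slurm.conf (node list and other options abbreviated):

```bash
# slurm.conf: drop the QOS= option from the gpu partition and add gpu_qos to
# AllowQos, so gpu_qos becomes an ordinary job QOS that users can select:
#   PartitionName=gpu Nodes=gpu[001-006] AllowQos=normal,long,gpu_qos Default=NO State=UP

# Reload the configuration (or restart slurmctld) and verify:
scontrol reconfigure
scontrol show partition gpu | grep -E 'AllowQos|QoS='
```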
However, as explained in the introduction of this issue, the purpose of gpu_qos is to ensure that users request at least one GPU card when using the GPU queue. It is independent of the other two QOSes, and the CPU queues do not need this restriction. This is also how the relationship between a Partition QOS and a Job QOS is described in the official Slurm documentation.
What we need is not a default QoS, but a policy requirement for this queue that is independent of the other queues.
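For context, a minimal sketch of how a partition QOS of this kind is typically created (the gres/gpu:tesla name mirrors the MinTRES value shown in the sacctmgr output above; the exact limit option may be spelled MinTRES or MinTRESPerJob depending on the Slurm version):

```bash
# Create the QOS and require at least one tesla GPU per job:
sacctmgr add qos gpu_qos
sacctmgr modify qos gpu_qos set MinTRESPerJob=gres/gpu:tesla=1

# Attach it as the partition QOS in slurm.conf so the limit applies to every job
# in the gpu partition, regardless of which job QOS (normal/long) the user picks:
#   PartitionName=gpu Nodes=gpu[001-006] QOS=gpu_qos AllowQos=normal,long ...
scontrol reconfigure
```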
Thank you for the additional information.
First, in the current OpenSCOW, submitting a job or creating an interactive app on a GPU partition through the web UI already requires selecting at least one GPU card by default. When defining policies for a special partition, we also recommend configuring the partition as a whole, for example applying a MinTres restriction directly to the GPU partition you mentioned.
Second, regarding the problem you are running into: we currently do not support specifying AllowQos and QoS separately under a partition. If you want the GPU partition to have its own normal and long QOSes, independent of the global ones, we suggest giving that partition its own AllowQos entries, for example normal-gpu-qos and long-gpu-qos.
Please confirm whether the above resolves your problem.
Finally, you mentioned that jobs were still being charged in the background. Could you please confirm whether those charges occurred while no price had been set for any QoS of the GPU partition under tenant management and platform management? Also, with the partition configured as QoS=gpu_qos, AllowQos=normal,long, which QoS was recorded in the database and billed for the jobs in the GPU partition?
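If the dedicated-QOS route were taken, a rough sketch could look like this (normal-gpu-qos and long-gpu-qos are illustrative names only, not existing QOSes on this cluster; the walltimes mirror the global normal and long QOSes, and each new QOS carries the one-GPU minimum):

```bash
# GPU-only counterparts of the global QOSes, each carrying the one-GPU minimum
# (the limit option may be spelled MinTRES or MinTRESPerJob depending on version):
sacctmgr add qos normal-gpu-qos
sacctmgr modify qos normal-gpu-qos set MaxWall=2-00:00:00 MinTRESPerJob=gres/gpu:tesla=1
sacctmgr add qos long-gpu-qos
sacctmgr modify qos long-gpu-qos set MaxWall=4-00:00:00 MinTRESPerJob=gres/gpu:tesla=1

# slurm.conf: expose only these QOSes on the gpu partition and leave QOS= unset:
#   PartitionName=gpu Nodes=gpu[001-006] AllowQos=normal-gpu-qos,long-gpu-qos ...
scontrol reconfigure
```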
First of all, thank you for your answer.
On the first point: most of the time users still create jobs from the command line, so we need to enforce the restriction at the Slurm level. Also, Slurm does not support setting a MinTres limit on a partition directly; it can only be done through a partition QOS, which I suspect is exactly why the partition QOS feature exists.
Creating new QOSes separate from the existing setup would require users to change their habits. From an operations point of view we naturally want to affect users as little as possible, so a change like this needs further internal discussion before we decide. Technically speaking, since Slurm already offers a recommended way to configure this kind of policy, forcing an ill-fitting workaround may not be the best solution either.
On the second point: we have indeed not made the new configuration change yet, i.e. we have kept the original pricing for the normal and long QOSes of the GPU partition. In fact, we discovered this problem precisely when we were about to make that change.
You are right that partitions do not directly support a MinTres setting; thanks for pointing that out.
As for the billing, our guess is that OpenSCOW was not restarted when the QoS configuration was changed, so the database kept the original billing rules. As a result, as long as jobs could still be submitted with the normal and long QoS, they were still charged under the original rules.
The current strategy in the Slurm adapter is: if a PartitionQos is set, users are by default only allowed to submit jobs with that PartitionQos, so in that case billing rules can only be set for the PartitionQos; only when no PartitionQos is set can billing rules be set for all the QoSes under AllowQos.
If the need is urgent, you can modify the open-source Slurm adapter we provide to meet your requirements. We also greatly appreciate your valuable feedback and will discuss internally how to support a more complete range of Slurm configurations based on it.
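As a quick cross-check on the billing question, the Slurm accounting database records which QOS each job actually ran under; a query along these lines (adjust the time window as needed) would show what the adapter should have seen for the GPU partition:

```bash
# Recent gpu-partition jobs together with the QOS recorded in slurmdbd,
# which is the QOS the billing rules should be keyed on:
sacct --allusers --partition=gpu --starttime=now-7days \
      --format=JobID,User,Account,Partition,QOS,Elapsed,State
```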
Is there an existing issue / discussion for this?
What happened
For internal management purposes, we used Slurm to set a partition QOS (gpu_qos) on the GPU partition, independent of the global job QOSes (normal and long). Its main purpose is to require users to request at least one GPU card for every GPU job they submit, to keep efficiency as high as possible. Because the global QOSes also apply to the CPU partitions, this restriction cannot be expressed separately in the job QOSes, so in practice the approach above is the only option. However, as shown in the screenshot, after this QOS was set, the SCOW system can no longer configure GPU-partition billing separately for normal and long, and checking the backend shows that jobs are still being charged as usual.
What did you expect to happen
Billing can be configured correctly and separately for the normal and long QOS.
Did this work before?
It worked correctly in v1.6.3.
Steps To Reproduce
Go to the job price table settings in SCOW.
Environment
Anything else?
No response