Project-HAMi / HAMi

Heterogeneous AI Computing Virtualization Middleware
http://project-hami.io/
Apache License 2.0

Is it compatible with different driver versions and CUDA versions? #482

Open 15929482853 opened 2 months ago

15929482853 commented 2 months ago

What happened: All existing GPUs in the cluster used driver version 515 with CUDA 11.7. Recently I added a machine with an L20 GPU (which requires at least driver 535 and CUDA 12), and then ran into a problem: the GPUs were not recognized correctly. Screenshots (WeCom): 企业微信截图_17258626616133, 企业微信截图_17258627199966

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

archlitchi commented 2 months ago

Can you re-submit the task with the environment variable CUDA_DISABLE_CONTROL=true and see whether the error reproduces?
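For anyone trying this, one way to set the variable is on the container in the Pod spec. The sketch below is only an illustration, not something posted in this thread: it builds a minimal Pod manifest in Python and prints it as JSON, which kubectl accepts. The pod name, the image tag, the `nvidia-smi` command, and the `nvidia.com/gpu` resource name are all assumptions; replace them with whatever your task actually uses.

```python
import json


def build_test_pod(name="gpu-test", image="nvidia/cuda:12.2.0-base-ubuntu22.04"):
    """Return a Pod manifest (as a dict) with CUDA_DISABLE_CONTROL set.

    The name, image, command, and resource name are placeholders
    (assumptions), not values taken from the issue.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "cuda",
                    "image": image,
                    # Placeholder workload: just list the visible GPUs.
                    "command": ["nvidia-smi"],
                    # The variable suggested above; env values must be strings.
                    "env": [{"name": "CUDA_DISABLE_CONTROL", "value": "true"}],
                    # Assumed GPU resource name; adjust to your cluster.
                    "resources": {"limits": {"nvidia.com/gpu": 1}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Print the manifest as JSON so it can be piped to kubectl.
    print(json.dumps(build_test_pod(), indent=2))
```

If saved as, say, make_pod.py (a hypothetical file name), the output can be applied with `python make_pod.py | kubectl apply -f -`, and the result compared against a run without the variable.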