cBio / cbio-cluster

MSKCC cBio cluster documentation
Cuda compute capability Torque property #416

Closed tatarsky closed 8 years ago

tatarsky commented 8 years ago

This has come up tangentially in a few matters, I believe. We currently mark GPU system Torque properties by the type of card (gtx680, gtxtitan, gtx780ti, gtx980, tesla, gtxtitanx).

It probably makes sense to also add a property for the card's MAXIMUM CUDA compute level, using some easy-to-use scheme.

This is based on data I'm grabbing from Nvidia, which is nicely summarized here:

https://en.wikipedia.org/wiki/CUDA#GPUs_supported

My proposed method would be to add the following Torque resource to ALL cards. I'm removing the "." because I don't want it to confuse the resource parser.

cuda30 -> Supports up to compute capability 3.0 (all our cards I believe do)

Then for all the cards EXCEPT the gtx680 we would add this additional resource:

cuda35 -> Supports up to compute capability 3.5

And then, for (I believe) the gtx980 and gtxtitanx, under this scheme we would add:

cuda52 -> Supports up to compute capability 5.2
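The scheme above is cumulative: a card carries the tag for its maximum level plus every lower tag. A minimal sketch of that mapping follows, assuming a hypothetical helper named `cuda_props_for_card` (not part of Torque or of the cluster configuration). The catch-all branch covers "all the cards EXCEPT the gtx680" from the list above, so it would also accept card names that are not in that list.

```shell
# Hypothetical helper: print the cumulative CUDA properties for a card-type
# property, following the scheme proposed in this comment.
cuda_props_for_card() {
    case "$1" in
        gtx680)
            # Compute capability 3.0 only.
            echo "cuda30" ;;
        gtx980|gtxtitanx)
            # Believed to support up to 5.2, so they carry the lower tags too.
            echo "cuda30 cuda35 cuda52" ;;
        *)
            # Every other card type: up to 3.5 (assumption: no input validation).
            echo "cuda30 cuda35" ;;
    esac
}
```

For example, `cuda_props_for_card gtx780ti` prints `cuda30 cuda35`.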

This is a lower-priority matter, which I am using to train local resources in Torque, but if you comment I will take the comments and place them in the hpc-request ticket for that training.

jchodera commented 8 years ago

+1, except I wouldn't use the format cudaXY so as to avoid confusion with CUDA version X.Y (e.g. we're on CUDA 7.5 now). How about cudacomputeXY or just cudaccXY?

tatarsky commented 8 years ago

Ah, valid point. How about cudaccXY? Is "Compute Capability" the actual name of this concept?

jchodera commented 8 years ago

Yep, "CUDA compute capability X.Y" is the right term. Sounds great!
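With names of the form cudaccXY, a job could ask for a minimum compute capability by appending the property to its Torque node specification. The sketch below is only illustrative: `gpu_nodespec` is a hypothetical helper, `job.sh` is a placeholder, and the exact resource string accepted on this cluster (for example whether `ppn` or other properties are also needed) is an assumption.

```shell
# Hypothetical helper: build a Torque node specification asking for one GPU on
# a node tagged with the given compute capability (for example "3.5").
gpu_nodespec() {
    # Drop the "." to match the cudaccXY naming: 3.5 -> cudacc35.
    tag="cudacc$(printf '%s' "$1" | tr -d '.')"
    printf 'nodes=1:gpus=1:%s\n' "$tag"
}

# Example submission (job.sh is a placeholder script name):
#   qsub -l "$(gpu_nodespec 3.5)" job.sh
```

Because the tags are cumulative, a request for cudacc35 would match any node whose card supports compute capability 3.5 or higher, not only cards whose maximum is exactly 3.5.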

tatarsky commented 8 years ago

I have added these properties. If you wish to validate, you can use this rather ugly grep, which keeps the output fairly readable, to make sure I've done what I said. Or just review all the pbsnodes lines containing "properties" to make sure the cudaccXY properties match the card type property.

pbsnodes |egrep "^\w+|properties"
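The grep above keeps only the node-name lines (those starting with a word character) and the "properties" lines. The same review could be automated roughly as follows. This is a sketch under assumptions, not the check actually used here: `check_cudacc` is a hypothetical function, it expects the usual `pbsnodes` layout (node name in the first column, indented `properties = a,b,c` lines), and it only tests the rules stated in this thread (cumulative tags, gtx680 limited to 3.0, gtx980 and gtxtitanx at 5.2).

```shell
# Hypothetical check: read pbsnodes output on stdin and print each node whose
# cudaccXY properties break the scheme described in this thread.
check_cudacc() {
    awk '
        # True when the comma-separated list contains the exact property name.
        function has(list, name) {
            return index("," list ",", "," name ",") > 0
        }

        # Node names start in the first column; attribute lines are indented.
        /^[^ \t]/ { node = $1 }

        $1 == "properties" && $2 == "=" {
            p = $3
            bad = ""
            # Tags are cumulative: a higher tag implies every lower one.
            if (has(p, "cudacc52") && !has(p, "cudacc35"))
                bad = bad " cudacc52-without-cudacc35"
            if (has(p, "cudacc35") && !has(p, "cudacc30"))
                bad = bad " cudacc35-without-cudacc30"
            # Card-specific expectations stated in the thread.
            if (has(p, "gtx680") && (!has(p, "cudacc30") || has(p, "cudacc35")))
                bad = bad " gtx680-tags-wrong"
            if ((has(p, "gtx980") || has(p, "gtxtitanx")) && !has(p, "cudacc52"))
                bad = bad " missing-cudacc52"
            if (bad != "") print node ":" bad
        }
    '
}
```

It would be run as `pbsnodes | check_cudacc`; no output means none of these particular rules were violated. Nodes with other card types are only checked for the cumulative-tag rule.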