mlcommons / ck

Collective Knowledge (CK) is an educational community project to learn how to run AI, ML and other emerging workloads in the most efficient and cost-effective way across diverse models, data sets, software and hardware using MLCommons CM (Collective Mind workflow automation framework).
https://cKnowledge.org
Apache License 2.0

get-cuda is not respecting requested version #873

Closed arjunsuresh closed 9 months ago

arjunsuresh commented 1 year ago

get,cuda does not respect the requested version when a different CUDA version has already been detected in the workflow. In the log below, version 11.8.0 is requested but 12.2 is detected:

* cm run script "get cuda"
      - Number of scripts found: 1
      - Searching for cached script outputs with the following tags: -tmp,get,cuda
        - Number of cached script outputs found: 2
      - Found script::get-cuda,46d133d9ef92422d in /home/cmuser/CM/repos/mlcommons@ck/cm-mlops/script/get-cuda
        Prepared variations: _toolkit
        - Requested version:  == 11.8.0   >= 11.8.0   <= 11.8.0
      - Checking if script execution is already cached ...
        - Prepared variations: _toolkit
        - Searching for cached script outputs with the following tags: -tmp,get,cuda,cuda-compiler,cuda-lib,toolkit,lib,nvcc,get-nvcc,get-cuda,_toolkit,version-11.8.0
      - Creating new "cache" script artifact in the CM local repository ...
        - Tags: tmp,get,cuda,cuda-compiler,cuda-lib,toolkit,lib,nvcc,get-nvcc,get-cuda,_toolkit,version-11.8.0,script-artifact-46d133d9ef92422d
      - Changing to /home/cmuser/CM/repos/local/cache/b22714fa300c45ce
        # potential PIP version string (if needed): ==11.8.0
      - Running preprocess ...
      - Checking prehook dependencies on other CM scripts:

      - Running native script "/home/cmuser/CM/repos/mlcommons@ck/cm-mlops/script/get-cuda/run.sh" from temporal script "tmp-run.sh" in "/home/cmuser/CM/repos/local/cache/b22714fa300c45ce" ...

      - Running postprocess ...
        Detected version: 12.2
      - Removing tmp tag in the script cached output b22714fa300c45ce ...
      - cache UID: b22714fa300c45ce
      - running time of script "get,cuda,cuda-compiler,cuda-lib,toolkit,lib,nvcc,get-nvcc,get-cuda": 0.17 sec.
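
The expected behavior can be sketched as a check of the detected version against the requested constraints, run right after detection. This is only a minimal illustration, not CM's actual implementation: the function name version_satisfies and its parameters exact, minimum and maximum are hypothetical, and it assumes purely numeric dotted versions such as 11.8.0 or 12.2.

```python
def version_satisfies(detected, exact=None, minimum=None, maximum=None):
    """Return True if the detected version meets every given constraint.

    Hypothetical helper for illustration; assumes numeric dotted versions.
    """
    def key(text):
        # "11.8.0" -> (11, 8); trailing zeros are dropped so that
        # "11.8" and "11.8.0" compare as equal.
        parts = [int(p) for p in text.strip().split(".")]
        while parts and parts[-1] == 0:
            parts.pop()
        return tuple(parts)

    found = key(detected)
    if exact is not None and found != key(exact):
        return False
    if minimum is not None and found < key(minimum):
        return False
    if maximum is not None and found > key(maximum):
        return False
    return True


# Values taken from the log above: 11.8.0 requested, 12.2 detected.
mismatch_ok = version_satisfies("12.2", exact="11.8.0",
                                minimum="11.8.0", maximum="11.8.0")
match_ok = version_satisfies("11.8", exact="11.8.0")
```

With a check of this kind, the 12.2 detection in the log would be rejected (mismatch_ok is False) instead of being cached under the version-11.8.0 tag.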

arjunsuresh commented 9 months ago

I believe this is fixed now.