mlcommons / inference

Reference implementations of MLPerf™ inference benchmarks
https://mlcommons.org/en/groups/inference
Apache License 2.0

sudo access needed to run cm run script "get sys-utils-cm" #1846

Open writingindy opened 1 month ago

writingindy commented 1 month ago

Hi,

I've attached my cm-repro file: cm-repro.zip

I'm trying to run the MLPerf Reference Implementation for bert-large, following https://docs.mlcommons.org/inference/benchmarks/language/bert/, and I'm running into issues with the following command:

cm run script --tags=run-mlperf,inference,_find-performance,_full,_r4.1-dev \
   --model=bert-99 \
   --implementation=reference \
   --framework=pytorch \
   --category=edge \
   --scenario=Offline \
   --execution_mode=test \
   --device=cpu  \
   --quiet \
   --test_query_count=100

I see in the terminal that it's asking me for my sudo password, but I don't have sudo access on this cluster. Here's the output:

INFO:root:* cm run script "run-mlperf inference _find-performance _full _r4.1-dev"
INFO:root:  * cm run script "detect os"
INFO:root:         ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:         ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:  * cm run script "detect cpu"
INFO:root:    * cm run script "detect os"
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:         ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:         ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py
INFO:root:  * cm run script "get python3"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:  * cm run script "get mlcommons inference src"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:  * cm run script "get sut description"
INFO:root:    * cm run script "detect os"
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:    * cm run script "detect cpu"
INFO:root:      * cm run script "detect os"
INFO:root:             ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:             ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:             ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py
INFO:root:    * cm run script "get python3"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:    * cm run script "get compiler"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/6e0d265390494dc3/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _package.dmiparser"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/1404f366b8494c6e/cm-cached-state.json
INFO:root:    * cm run script "get cache dir _name.mlperf-inference-sut-descriptions"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/b0193ec77b0842dc/cm-cached-state.json
Generating SUT description file for watgpu408-pytorch
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/get-mlperf-inference-sut-description/customize.py
INFO:root:  * cm run script "get mlperf inference results dir"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/087330972af64098/cm-cached-state.json
INFO:root:  * cm run script "install pip-package for-cmind-python _package.tabulate"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/5ae9aab0c8ea46dd/cm-cached-state.json
INFO:root:  * cm run script "get mlperf inference utils"
INFO:root:    * cm run script "get mlperf inference src"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/get-mlperf-inference-utils/customize.py
Using MLCommons Inference source from /u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference

Running loadgen scenario: Offline and mode: performance
INFO:root:* cm run script "app mlperf inference generic _reference _bert-99 _pytorch _cpu _test _r4.1-dev_default _offline"
INFO:root:  * cm run script "detect os"
INFO:root:         ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:         ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:  * cm run script "get sys-utils-cm"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/69c01310f61e445c/cm-cached-state.json
INFO:root:  * cm run script "get python"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:  * cm run script "get mlcommons inference src _deeplearningexamples"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:  * cm run script "get mlperf inference utils"
INFO:root:    * cm run script "get mlperf inference src _deeplearningexamples"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/get-mlperf-inference-utils/customize.py
INFO:root:  * cm run script "get dataset squad language-processing"
INFO:root:    * cm run script "get sys-utils-cm"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/69c01310f61e445c/cm-cached-state.json
INFO:root:    * cm run script "get sys-util generic generic-sys-util _wget"
INFO:root:      * cm run script "detect os"
INFO:root:             ! cd /u1/wk5ng/CM/repos/local/cache/ad9e50b30233426f
INFO:root:             ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:             ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:           ! cd /u1/wk5ng/CM/repos/local/cache/ad9e50b30233426f
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/get-generic-sys-util/run.sh from tmp-run.sh
sudo DEBIAN_FRONTEND=noninteractive apt-get install -y wget
[sudo] password for wk5ng: 

Sorry, try again.
[sudo] password for wk5ng: 

Sorry, try again.
[sudo] password for wk5ng: 

sudo: 3 incorrect password attempts

CM error: Portable CM script failed (name = get-generic-sys-util, return code = 256)

I see that it's trying to install wget, but when I run wget -h I can see that it's already installed on the system. Is there any way to bypass this step?

Thank you so much,
Indy Ng

arjunsuresh commented 1 month ago

We'll add a cleaner way to handle this, but for now can you please comment out the line that needs sudo in the file below?

/u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/get-generic-sys-util/run.sh
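
For reference, the log above shows the exact command that triggers the prompt, so the edit amounts to disabling that apt-get call. A minimal sketch of the relevant line commented out (the script's actual contents may differ, e.g. the package name may come from an environment variable rather than being the literal "wget"):

# in script/get-generic-sys-util/run.sh
# sudo DEBIAN_FRONTEND=noninteractive apt-get install -y wget

Since wget is already present on the system, skipping the install should be harmless here.
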
writingindy commented 1 month ago

Commenting out the line makes the run complete, but when I look at the accuracy file I don't see any results.

(cm-venv) wk5ng@watgpu408:~/finetune-inference-experiments$ cat /u1/wk5ng/CM/repos/local/cache/087330972af64098/valid_results/watgpu408-reference-cpu-pytorch-v2.4.1-default_config/bert-99/offline/accuracy/accuracy.txt
Reading examples...
No cached features at '/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference/language/bert/eval_features.pickle'... converting from examples...
Creating tokenizer...
Converting examples to features...
Caching features at '/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference/language/bert/eval_features.pickle'...
Loading LoadGen logs...

Additionally, I was expecting some benchmark results, but the output of the run doesn't seem to contain any:

(cm-venv) wk5ng@watgpu408:~/finetune-inference-experiments$ cm run script --tags=run-mlperf,inference,_r4.1-dev    --model=bert-99    --implementation=reference    --framework=pytorch    --category=datacenter    --scenario=Offline    --execution_mode=valid    --device=cpu
INFO:root:* cm run script "run-mlperf inference _r4.1-dev"
INFO:root:  * cm run script "detect os"
INFO:root:         ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:         ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:  * cm run script "detect cpu"
INFO:root:    * cm run script "detect os"
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:         ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:         ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py
INFO:root:  * cm run script "get python3"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:  * cm run script "get mlcommons inference src"
INFO:root:      - More than 1 cached script output found for "get,mlcommons,inference,src":
INFO:root:        0) /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2 (get,mlcommons,inference,src,source,inference-src,inference-source,mlperf,_deeplearningexamples,script-artifact-4b57186581024797,version-master-git-81c2de69de4af90410cd1ba000fc5bd731bf6dee) (Version master-git-81c2de69de4af90410cd1ba000fc5bd731bf6dee)
INFO:root:        1) /u1/wk5ng/CM/repos/local/cache/d2080f3d33d84a96 (get,mlcommons,inference,src,source,inference-src,inference-source,mlperf,script-artifact-4b57186581024797,version-master-git-81c2de69de4af90410cd1ba000fc5bd731bf6dee) (Version master-git-81c2de69de4af90410cd1ba000fc5bd731bf6dee)
INFO:root:        2) /u1/wk5ng/CM/repos/local/cache/522e9cb8872245b0 (get,mlcommons,inference,src,source,inference-src,inference-source,mlperf,script-artifact-4b57186581024797,version-master-git-81c2de69de4af90410cd1ba000fc5bd731bf6dee) (Version master-git-81c2de69de4af90410cd1ba000fc5bd731bf6dee)
        Make your selection or press Enter for 0 or use -1 to skip: 
INFO:root:        Selected 0: /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:  * cm run script "get sut description"
INFO:root:    * cm run script "detect os"
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:    * cm run script "detect cpu"
INFO:root:      * cm run script "detect os"
INFO:root:             ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:             ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:             ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py
INFO:root:    * cm run script "get python3"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:    * cm run script "get compiler"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/6e0d265390494dc3/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _package.dmiparser"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/1404f366b8494c6e/cm-cached-state.json
INFO:root:    * cm run script "get cache dir _name.mlperf-inference-sut-descriptions"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/b0193ec77b0842dc/cm-cached-state.json
Generating SUT description file for watgpu408-pytorch
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/get-mlperf-inference-sut-description/customize.py
INFO:root:  * cm run script "get mlperf inference results dir"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/087330972af64098/cm-cached-state.json
INFO:root:  * cm run script "install pip-package for-cmind-python _package.tabulate"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/5ae9aab0c8ea46dd/cm-cached-state.json
INFO:root:  * cm run script "get mlperf inference utils"
INFO:root:    * cm run script "get mlperf inference src"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/get-mlperf-inference-utils/customize.py
Using MLCommons Inference source from /u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference

Running loadgen scenario: Offline and mode: performance
INFO:root:* cm run script "app mlperf inference generic _reference _bert-99 _pytorch _cpu _valid _r4.1-dev_default _offline"
INFO:root:  * cm run script "detect os"
INFO:root:         ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:         ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:  * cm run script "get sys-utils-cm"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/69c01310f61e445c/cm-cached-state.json
INFO:root:  * cm run script "get python"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:  * cm run script "get mlcommons inference src _deeplearningexamples"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:  * cm run script "get mlperf inference utils"
INFO:root:    * cm run script "get mlperf inference src _deeplearningexamples"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/get-mlperf-inference-utils/customize.py
INFO:root:  * cm run script "get dataset squad language-processing"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/c4b715419f674d1a/cm-cached-state.json
INFO:root:  * cm run script "get dataset-aux squad-vocab"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/a7a6322d10044b28/cm-cached-state.json
INFO:root:  * cm run script "app mlperf reference inference _offline _pytorch _bert-99 _cpu _fp32"
INFO:root:    * cm run script "detect os"
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:    * cm run script "detect cpu"
INFO:root:      * cm run script "detect os"
INFO:root:             ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:             ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:             ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py
INFO:root:    * cm run script "get sys-utils-cm"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/69c01310f61e445c/cm-cached-state.json
INFO:root:    * cm run script "get python"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:    * cm run script "get generic-python-lib _torch"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/cf7142057d454f49/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _torchvision"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/4c593ccb2d4445fd/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _transformers"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/49fb733a36a940d9/cm-cached-state.json
INFO:root:    * cm run script "get ml-model language-processing bert-large raw _pytorch _fp32"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/3f030dfd722f4422/cm-cached-state.json
INFO:root:Path to the ML model: /u1/wk5ng/CM/repos/local/cache/6549ad7fb82a4355/model.pytorch
INFO:root:    * cm run script "get dataset squad original"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/c4b715419f674d1a/cm-cached-state.json
INFO:root:    * cm run script "get dataset-aux squad-vocab"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/a7a6322d10044b28/cm-cached-state.json
INFO:root:    * cm run script "generate user-conf mlperf inference"
INFO:root:      * cm run script "detect os"
INFO:root:             ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:             ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:             ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:      * cm run script "detect cpu"
INFO:root:        * cm run script "detect os"
INFO:root:               ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:               ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:               ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:             ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:             ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
INFO:root:             ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py
INFO:root:      * cm run script "get python"
INFO:root:           ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:      * cm run script "get mlcommons inference src _deeplearningexamples"
INFO:root:           ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:      * cm run script "get sut configs"
INFO:root:        * cm run script "get cache dir _name.mlperf-inference-sut-configs"
INFO:root:             ! load /u1/wk5ng/CM/repos/local/cache/182783ed080d4d5b/cm-cached-state.json
INFO:root:             ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/get-mlperf-inference-sut-configs/customize.py
Using MLCommons Inference source from '/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference'
Original configuration value 1.0 target_qps
Adjusted configuration value 1.01 target_qps
Output Dir: '/u1/wk5ng/CM/repos/local/cache/087330972af64098/valid_results/watgpu408-reference-cpu-pytorch-v2.4.1-default_config/bert-99/offline/performance/run_1'
bert.Offline.target_qps = 1.01

INFO:root:    * cm run script "get loadgen"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/23782424b9244dec/cm-cached-state.json
INFO:root:Path to the tool: /u1/wk5ng/CM/repos/local/cache/23782424b9244dec/install
INFO:root:    * cm run script "get mlcommons inference src _deeplearningexamples"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:    * cm run script "get mlcommons inference src"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _package.psutil"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/a86f70abbbe34f04/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _package.pydantic"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/8d44830a5f9f441b/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _tokenization"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/8f488225204f40d0/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _six"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/ab8328266eb24e53/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _package.absl-py"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/0140da6782fc414b/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _boto3"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/12e06773410c4db3/cm-cached-state.json
Using MLCommons Inference source from '/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference'
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/app-mlperf-inference-mlcommons-python/customize.py
INFO:root:  * cm run script "benchmark-mlperf"
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/benchmark-program-mlperf/customize.py
INFO:root:  * cm run script "benchmark-program program"
INFO:root:    * cm run script "detect cpu"
INFO:root:      * cm run script "detect os"
INFO:root:             ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:             ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:             ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py
***************************************************************************
CM script::benchmark-program/run.sh

Run Directory: /u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference/language/bert

CMD: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3 run.py --backend=pytorch --scenario=Offline   --mlperf_conf '/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference/mlperf.conf' --user_conf '/u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/d19e81be3a6c4a3b8481abb19ac3cc41.conf' 2>&1 ; echo \$? > exitstatus | tee '/u1/wk5ng/CM/repos/local/cache/087330972af64098/valid_results/watgpu408-reference-cpu-pytorch-v2.4.1-default_config/bert-99/offline/performance/run_1/console.out'

INFO:root:         ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:         ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/benchmark-program/run-ubuntu.sh from tmp-run.sh

/u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3 run.py --backend=pytorch --scenario=Offline   --mlperf_conf '/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference/mlperf.conf' --user_conf '/u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/d19e81be3a6c4a3b8481abb19ac3cc41.conf' 2>&1 ; echo $? > exitstatus | tee '/u1/wk5ng/CM/repos/local/cache/087330972af64098/valid_results/watgpu408-reference-cpu-pytorch-v2.4.1-default_config/bert-99/offline/performance/run_1/console.out'
Traceback (most recent call last):
  File "/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference/language/bert/run.py", line 19, in <module>
    import mlperf_loadgen as lg
ImportError: /opt/anaconda3/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /u1/wk5ng/CM/repos/local/cache/23782424b9244dec/install/python/mlperf_loadgen.cpython-311-x86_64-linux-gnu.so)
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/benchmark-program/customize.py
INFO:root:  * cm run script "save mlperf inference state"
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/save-mlperf-inference-implementation-state/customize.py
INFO:root:       ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:       ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/app-mlperf-inference/run.sh from tmp-run.sh
INFO:root:       ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/app-mlperf-inference/customize.py
INFO:root:* cm run script "get mlperf sut description"
INFO:root:  * cm run script "detect os"
INFO:root:         ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:         ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:  * cm run script "detect cpu"
INFO:root:    * cm run script "detect os"
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:         ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:         ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py
INFO:root:  * cm run script "get python3"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:  * cm run script "get compiler"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/6e0d265390494dc3/cm-cached-state.json
INFO:root:  * cm run script "get generic-python-lib _package.dmiparser"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/1404f366b8494c6e/cm-cached-state.json
INFO:root:  * cm run script "get cache dir _name.mlperf-inference-sut-descriptions"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/b0193ec77b0842dc/cm-cached-state.json
Generating SUT description file for watgpu408-pytorch-2.4.1
INFO:root:       ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/get-mlperf-inference-sut-description/customize.py

Running loadgen scenario: Offline and mode: accuracy
INFO:root:* cm run script "app mlperf inference generic _reference _bert-99 _pytorch _cpu _valid _r4.1-dev_default _offline"
INFO:root:  * cm run script "detect os"
INFO:root:         ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:         ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:  * cm run script "get sys-utils-cm"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/69c01310f61e445c/cm-cached-state.json
INFO:root:  * cm run script "get python"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:  * cm run script "get mlcommons inference src _deeplearningexamples"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:  * cm run script "get mlperf inference utils"
INFO:root:    * cm run script "get mlperf inference src _deeplearningexamples"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/get-mlperf-inference-utils/customize.py
INFO:root:  * cm run script "get dataset squad language-processing"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/c4b715419f674d1a/cm-cached-state.json
INFO:root:  * cm run script "get dataset-aux squad-vocab"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/a7a6322d10044b28/cm-cached-state.json
INFO:root:  * cm run script "app mlperf reference inference _offline _pytorch _bert-99 _cpu _fp32"
INFO:root:    * cm run script "detect os"
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:    * cm run script "detect cpu"
INFO:root:      * cm run script "detect os"
INFO:root:             ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:             ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:             ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py
INFO:root:    * cm run script "get sys-utils-cm"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/69c01310f61e445c/cm-cached-state.json
INFO:root:    * cm run script "get python"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:    * cm run script "get generic-python-lib _torch"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/cf7142057d454f49/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _torchvision"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/4c593ccb2d4445fd/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _transformers"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/49fb733a36a940d9/cm-cached-state.json
INFO:root:    * cm run script "get ml-model language-processing bert-large raw _pytorch _fp32"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/3f030dfd722f4422/cm-cached-state.json
INFO:root:Path to the ML model: /u1/wk5ng/CM/repos/local/cache/6549ad7fb82a4355/model.pytorch
INFO:root:    * cm run script "get dataset squad original"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/c4b715419f674d1a/cm-cached-state.json
INFO:root:    * cm run script "get dataset-aux squad-vocab"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/a7a6322d10044b28/cm-cached-state.json
INFO:root:    * cm run script "generate user-conf mlperf inference"
INFO:root:      * cm run script "detect os"
INFO:root:             ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:             ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:             ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:      * cm run script "detect cpu"
INFO:root:        * cm run script "detect os"
INFO:root:               ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:               ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:               ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:             ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:             ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
INFO:root:             ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py
INFO:root:      * cm run script "get python"
INFO:root:           ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:      * cm run script "get mlcommons inference src _deeplearningexamples"
INFO:root:           ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:      * cm run script "get sut configs"
INFO:root:        * cm run script "get cache dir _name.mlperf-inference-sut-configs"
INFO:root:             ! load /u1/wk5ng/CM/repos/local/cache/182783ed080d4d5b/cm-cached-state.json
INFO:root:             ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/get-mlperf-inference-sut-configs/customize.py
Using MLCommons Inference source from '/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference'
Original configuration value 1.0 target_qps
Adjusted configuration value 1.01 target_qps
Output Dir: '/u1/wk5ng/CM/repos/local/cache/087330972af64098/valid_results/watgpu408-reference-cpu-pytorch-v2.4.1-default_config/bert-99/offline/accuracy'
bert.Offline.target_qps = 1.01

INFO:root:    * cm run script "get loadgen"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/23782424b9244dec/cm-cached-state.json
INFO:root:Path to the tool: /u1/wk5ng/CM/repos/local/cache/23782424b9244dec/install
INFO:root:    * cm run script "get mlcommons inference src _deeplearningexamples"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:    * cm run script "get mlcommons inference src"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _package.psutil"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/a86f70abbbe34f04/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _package.pydantic"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/8d44830a5f9f441b/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _tokenization"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/8f488225204f40d0/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _six"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/ab8328266eb24e53/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _package.absl-py"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/0140da6782fc414b/cm-cached-state.json
INFO:root:    * cm run script "get generic-python-lib _boto3"
INFO:root:         ! load /u1/wk5ng/CM/repos/local/cache/12e06773410c4db3/cm-cached-state.json
Using MLCommons Inference source from '/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference'
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/app-mlperf-inference-mlcommons-python/customize.py
INFO:root:  * cm run script "benchmark-mlperf"
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/benchmark-program-mlperf/customize.py
INFO:root:  * cm run script "benchmark-program program"
INFO:root:    * cm run script "detect cpu"
INFO:root:      * cm run script "detect os"
INFO:root:             ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:             ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:             ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py
***************************************************************************
CM script::benchmark-program/run.sh

Run Directory: /u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference/language/bert

CMD: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3 run.py --backend=pytorch --scenario=Offline   --mlperf_conf '/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference/mlperf.conf' --user_conf '/u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/8de097a9657244f6b779f0466be62ddb.conf' --accuracy 2>&1 ; echo \$? > exitstatus | tee '/u1/wk5ng/CM/repos/local/cache/087330972af64098/valid_results/watgpu408-reference-cpu-pytorch-v2.4.1-default_config/bert-99/offline/accuracy/console.out'

INFO:root:         ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:         ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/benchmark-program/run-ubuntu.sh from tmp-run.sh

/u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3 run.py --backend=pytorch --scenario=Offline   --mlperf_conf '/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference/mlperf.conf' --user_conf '/u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/8de097a9657244f6b779f0466be62ddb.conf' --accuracy 2>&1 ; echo $? > exitstatus | tee '/u1/wk5ng/CM/repos/local/cache/087330972af64098/valid_results/watgpu408-reference-cpu-pytorch-v2.4.1-default_config/bert-99/offline/accuracy/console.out'
Traceback (most recent call last):
  File "/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference/language/bert/run.py", line 19, in <module>
    import mlperf_loadgen as lg
ImportError: /opt/anaconda3/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /u1/wk5ng/CM/repos/local/cache/23782424b9244dec/install/python/mlperf_loadgen.cpython-311-x86_64-linux-gnu.so)
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/benchmark-program/customize.py
INFO:root:  * cm run script "save mlperf inference state"
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/save-mlperf-inference-implementation-state/customize.py
INFO:root:       ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:       ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/app-mlperf-inference/run.sh from tmp-run.sh
INFO:root:       ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/app-mlperf-inference/customize.py
INFO:root:* cm run script "get mlperf sut description"
INFO:root:  * cm run script "detect os"
INFO:root:         ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:         ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:  * cm run script "detect cpu"
INFO:root:    * cm run script "detect os"
INFO:root:           ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:           ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/run.sh from tmp-run.sh
INFO:root:           ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-os/customize.py
INFO:root:         ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:         ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/run.sh from tmp-run.sh
INFO:root:         ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/detect-cpu/customize.py
INFO:root:  * cm run script "get python3"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:  * cm run script "get compiler"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/6e0d265390494dc3/cm-cached-state.json
INFO:root:  * cm run script "get generic-python-lib _package.dmiparser"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/1404f366b8494c6e/cm-cached-state.json
INFO:root:  * cm run script "get cache dir _name.mlperf-inference-sut-descriptions"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/b0193ec77b0842dc/cm-cached-state.json
Generating SUT description file for watgpu408-pytorch-2.4.1
INFO:root:       ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/get-mlperf-inference-sut-description/customize.py
INFO:root:* cm run script "run accuracy mlperf _squad _float32"
INFO:root:  * cm run script "get python3"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/4f9e6fe06cd94940/cm-cached-state.json
INFO:root:Path to Python: /u1/wk5ng/CM/repos/local/cache/7de49cf355794755/mlperf/bin/python3
INFO:root:Python version: 3.11.4
INFO:root:  * cm run script "get mlcommons inference src _deeplearningexamples"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/639bf66ba64d46e2/cm-cached-state.json
INFO:root:  * cm run script "get generic-python-lib _boto3"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/12e06773410c4db3/cm-cached-state.json
INFO:root:  * cm run script "get generic-python-lib _package.transformers"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/3a1e6077c1b545df/cm-cached-state.json
INFO:root:  * cm run script "get dataset squad language-processing"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/c4b715419f674d1a/cm-cached-state.json
INFO:root:  * cm run script "get generic-python-lib _torch"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/cf7142057d454f49/cm-cached-state.json
INFO:root:  * cm run script "get generic-python-lib _tokenization"
INFO:root:       ! load /u1/wk5ng/CM/repos/local/cache/8f488225204f40d0/cm-cached-state.json
INFO:root:       ! cd /u1/wk5ng/finetune-inference-experiments
INFO:root:       ! call /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/process-mlperf-accuracy/run.sh from tmp-run.sh
INFO:root:       ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/process-mlperf-accuracy/customize.py

Accuracy file: /u1/wk5ng/CM/repos/local/cache/087330972af64098/valid_results/watgpu408-reference-cpu-pytorch-v2.4.1-default_config/bert-99/offline/accuracy/accuracy.txt

Reading examples...
No cached features at '/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference/language/bert/eval_features.pickle'... converting from examples...
Creating tokenizer...
Converting examples to features...
Caching features at '/u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference/language/bert/eval_features.pickle'...
Loading LoadGen logs...

INFO:root:       ! call "postprocess" from /u1/wk5ng/CM/repos/mlcommons@cm4mlops/script/run-mlperf-inference-app/customize.py

Path to the MLPerf inference benchmark reference sources: /u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference
Path to the MLPerf inference reference configuration file: /u1/wk5ng/CM/repos/local/cache/5584207b30dd4fc1/inference/mlperf.conf
writingindy commented 1 month ago

Oh, copy-pasting the logs I see that there's an import error towards the end, hmm.

That would explain the lack of results.
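
For context on the ImportError above: mlperf_loadgen was compiled against a newer libstdc++ than the one the Anaconda install provides. A quick way to compare the two libraries (a diagnostic sketch; the second path is a typical Ubuntu location and may differ on your system):

strings /opt/anaconda3/lib/libstdc++.so.6 | grep GLIBCXX_3.4.30
strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX_3.4.30

If only the second command prints a match, the conda libstdc++ is the one missing the required symbol version.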

arjunsuresh commented 1 month ago

Oh. Can you please try to use a non-conda Python? cm rm cache --tags=get,python -f will remove the currently registered Python in CM. After this, cm run script --tags=get,python will show all Pythons available to CM.

If conda Python is used, it'll probably need a compatible libstdc++ installed in the conda env. If one is already installed, adding the option below to the run command might help: --env.+LD_LIBRARY_PATH,=/opt/anaconda/lib
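
As a concrete illustration, the flag would be appended to the run command used earlier (a sketch; adjust the path to wherever your conda lib directory actually lives, e.g. /opt/anaconda3/lib per the traceback above):

cm run script --tags=run-mlperf,inference,_r4.1-dev \
   --model=bert-99 \
   --implementation=reference \
   --framework=pytorch \
   --category=datacenter \
   --scenario=Offline \
   --execution_mode=valid \
   --device=cpu \
   --env.+LD_LIBRARY_PATH,=/opt/anaconda/lib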

arjunsuresh commented 1 month ago

We have now removed the sudo requirement in CM if no installation is needed.
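
For anyone on an older checkout, the idea of the fix is to probe for the utility before escalating. A hypothetical sketch of such a guard (not the actual CM code, just the shape of the check):

if ! command -v wget > /dev/null 2>&1; then
    sudo DEBIAN_FRONTEND=noninteractive apt-get install -y wget
fi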