mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference$ git clone --recurse-submodules https://github.com/mlcommons/inference --depth 1
...
Receiving objects: 100% (27459/27459), 12.81 MiB | 20.96 MiB/s, done.
Resolving deltas: 100% (20403/20403), done.
Submodule path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass/tools/external/googletest': checked out '9077ec7efe5b652468ab051e93c67589d5cb8f85'
Submodule path 'vision/medical_imaging/3d-unet-brats19/nnUnet': checked out 'b38c69b345b2f60cd0d053039669e8f988b0c0af'
Directory with the reference code:
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ ls -la
total 100
drwxr-xr-x 2 mmirkina users 4096 Jun 19 06:56 .
drwxr-xr-x 6 mmirkina users 4096 Jun 19 06:56 ..
-rw-r--r-- 1 mmirkina users 342 Jun 19 06:56 build.sh
-rw-r--r-- 1 mmirkina users 3818 Jun 19 06:56 dataset.py
-rw-r--r-- 1 mmirkina users 1907 Jun 19 06:56 Dockerfile
-rw-r--r-- 1 mmirkina users 1811 Jun 19 06:56 Dockerfile.eval
-rw-r--r-- 1 mmirkina users 6565 Jun 19 06:56 evaluate-accuracy.py
-rw-r--r-- 1 mmirkina users 4445 Jun 19 06:56 evaluate_mbxp.py
-rw-r--r-- 1 mmirkina users 1085 Jun 19 06:56 launch.sh
-rw-r--r-- 1 mmirkina users 4490 Jun 19 06:56 main.py
-rw-r--r-- 1 mmirkina users 9328 Jun 19 06:56 README.md
-rw-r--r-- 1 mmirkina users 874 Jun 19 06:56 run_accuracy.sh
-rw-r--r-- 1 mmirkina users 382 Jun 19 06:56 run_offline.sh
-rw-r--r-- 1 mmirkina users 383 Jun 19 06:56 run_server.sh
-rw-r--r-- 1 mmirkina users 16660 Jun 19 06:56 SUT.py
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ cp ../../mlperf.conf .
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ ls -la
total 104
drwxr-xr-x 2 mmirkina users 4096 Jun 19 07:03 .
drwxr-xr-x 6 mmirkina users 4096 Jun 19 06:56 ..
-rw-r--r-- 1 mmirkina users 342 Jun 19 06:56 build.sh
-rw-r--r-- 1 mmirkina users 3818 Jun 19 06:56 dataset.py
-rw-r--r-- 1 mmirkina users 1907 Jun 19 06:56 Dockerfile
-rw-r--r-- 1 mmirkina users 1811 Jun 19 06:56 Dockerfile.eval
-rw-r--r-- 1 mmirkina users 6565 Jun 19 06:56 evaluate-accuracy.py
-rw-r--r-- 1 mmirkina users 4445 Jun 19 06:56 evaluate_mbxp.py
-rw-r--r-- 1 mmirkina users 1085 Jun 19 06:56 launch.sh
-rw-r--r-- 1 mmirkina users 4490 Jun 19 06:56 main.py
-rw-r--r-- 1 mmirkina users 3996 Jun 19 07:03 mlperf.conf
-rw-r--r-- 1 mmirkina users 9328 Jun 19 06:56 README.md
-rw-r--r-- 1 mmirkina users 874 Jun 19 06:56 run_accuracy.sh
-rw-r--r-- 1 mmirkina users 382 Jun 19 06:56 run_offline.sh
-rw-r--r-- 1 mmirkina users 383 Jun 19 06:56 run_server.sh
-rw-r--r-- 1 mmirkina users 16660 Jun 19 06:56 SUT.py
-rw-r--r-- 1 mmirkina users 234 Jun 19 06:56 user.conf
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 --version
Python 3.8.19
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3.9 --version
Python 3.9.19
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 1
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ sudo update-alternatives --set python3 /usr/bin/python3.9
update-alternatives: using /usr/bin/python3.9 to provide /usr/bin/python3 (python3) in manual mode
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 --version
Python 3.9.19
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install pybind11==2.10.4
Defaulting to user installation because normal site-packages is not writeable
Collecting pybind11==2.10.4
Downloading pybind11-2.10.4-py3-none-any.whl (222 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 222.3/222.3 KB 1.3 MB/s eta 0:00:00
Installing collected packages: pybind11
WARNING: The script pybind11-config is installed in '/local/mnt/workspace/mmirkina/.local/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pybind11-2.10.4
If you want to use the CPU:
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install torch==2.2.0.dev20231006+cpu --index-url https://download.pytorch.org/whl/nightly/cpu
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://download.pytorch.org/whl/nightly/cpu
ERROR: Could not find a version that satisfies the requirement torch==2.2.0.dev20231006+cpu (from versions: 2.2.0.dev20231010+cpu, 2.4.0.dev20240421+cpu, 2.4.0.dev20240422+cpu, 2.4.0.dev20240423+cpu, 2.4.0.dev20240424+cpu, 2.4.0.dev20240425+cpu, 2.4.0.dev20240426+cpu, 2.4.0.dev20240427+cpu, 2.4.0.dev20240428+cpu, 2.4.0.dev20240429+cpu, 2.4.0.dev20240430+cpu, 2.4.0.dev20240501+cpu, 2.4.0.dev20240502+cpu, 2.4.0.dev20240503+cpu, 2.4.0.dev20240504+cpu, 2.4.0.dev20240505+cpu, 2.4.0.dev20240506+cpu, 2.4.0.dev20240507+cpu, 2.4.0.dev20240508+cpu, 2.4.0.dev20240509+cpu, 2.4.0.dev20240510+cpu, 2.4.0.dev20240511+cpu, 2.4.0.dev20240512+cpu, 2.4.0.dev20240513+cpu, 2.4.0.dev20240514+cpu, 2.4.0.dev20240515+cpu, 2.4.0.dev20240516+cpu, 2.4.0.dev20240517+cpu, 2.4.0.dev20240518+cpu, 2.4.0.dev20240519+cpu, 2.4.0.dev20240520+cpu, 2.4.0.dev20240521+cpu, 2.4.0.dev20240522+cpu, 2.4.0.dev20240523+cpu, 2.4.0.dev20240524+cpu, 2.4.0.dev20240525+cpu, 2.4.0.dev20240526+cpu, 2.4.0.dev20240527+cpu, 2.4.0.dev20240528+cpu, 2.4.0.dev20240529+cpu, 2.4.0.dev20240530+cpu, 2.4.0.dev20240531+cpu, 2.4.0.dev20240601+cpu, 2.4.0.dev20240602+cpu, 2.4.0.dev20240603+cpu, 2.4.0.dev20240604+cpu, 2.4.0.dev20240605+cpu, 2.4.0.dev20240606+cpu, 2.4.0.dev20240607+cpu, 2.4.0.dev20240608+cpu, 2.4.0.dev20240609+cpu, 2.4.0.dev20240610+cpu, 2.4.0.dev20240611+cpu, 2.4.0.dev20240612+cpu, 2.5.0.dev20240613+cpu, 2.5.0.dev20240614+cpu, 2.5.0.dev20240615+cpu, 2.5.0.dev20240616+cpu, 2.5.0.dev20240617+cpu, 2.5.0.dev20240618+cpu, 2.5.0.dev20240619+cpu)
ERROR: No matching distribution found for torch==2.2.0.dev20231006+cpu
The requested version torch==2.2.0.dev20231006+cpu is no longer available. Only these builds are now present at https://download.pytorch.org/whl/nightly/torch/:
torch-2.2.0.dev20231010+cpu.cxx11.abi-cp310-cp310-linux_x86_64.whl
torch-2.2.0.dev20231010+cpu.cxx11.abi-cp311-cp311-linux_x86_64.whl
torch-2.2.0.dev20231010+cpu.cxx11.abi-cp38-cp38-linux_x86_64.whl
torch-2.2.0.dev20231010+cpu.cxx11.abi-cp39-cp39-linux_x86_64.whl
So we use torch==2.2.0.dev20231010+cpu instead.
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install torch==2.2.0.dev20231010+cpu --index-url https://download.pytorch.org/whl/nightly/cpu
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://download.pytorch.org/whl/nightly/cpu
Collecting torch==2.2.0.dev20231010+cpu
Downloading https://download.pytorch.org/whl/nightly/cpu/torch-2.2.0.dev20231010%2Bcpu-cp39-cp39-linux_x86_64.whl (185.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 185.1/185.1 MB 10.5 MB/s eta 0:00:00
Requirement already satisfied: typing-extensions in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from torch==2.2.0.dev20231010+cpu) (4.4.0)
Requirement already satisfied: networkx in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from torch==2.2.0.dev20231010+cpu) (3.0)
Requirement already satisfied: filelock in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from torch==2.2.0.dev20231010+cpu) (3.9.0)
Requirement already satisfied: sympy in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from torch==2.2.0.dev20231010+cpu) (1.12)
Requirement already satisfied: fsspec in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from torch==2.2.0.dev20231010+cpu) (2023.12.2)
Requirement already satisfied: jinja2 in /usr/lib/python3/dist-packages (from torch==2.2.0.dev20231010+cpu) (3.0.3)
Requirement already satisfied: mpmath>=0.19 in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from sympy->torch==2.2.0.dev20231010+cpu) (1.3.0)
Installing collected packages: torch
Attempting uninstall: torch
Found existing installation: torch 2.1.2+cpu
Uninstalling torch-2.1.2+cpu:
Successfully uninstalled torch-2.1.2+cpu
WARNING: The scripts convert-caffe2-to-onnx, convert-onnx-to-caffe2 and torchrun are installed in '/local/mnt/workspace/mmirkina/.local/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchvision 0.16.2+cpu requires torch==2.1.2, but you have torch 2.2.0.dev20231010+cpu which is incompatible.
torchaudio 2.1.2+cpu requires torch==2.1.2, but you have torch 2.2.0.dev20231010+cpu which is incompatible.
Successfully installed torch-2.2.0.dev20231010+cpu
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install transformers==4.31.0 nltk==3.8.1 evaluate==0.4.0 absl-py==1.4.0 rouge-score==0.1.2 sentencepiece==0.1.99 accelerate==0.21.0
...
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchvision 0.16.2+cpu requires torch==2.1.2, but you have torch 2.2.0.dev20231010+cpu which is incompatible.
Successfully installed absl-py-1.4.0 accelerate-0.21.0 aiohttp-3.9.5 aiosignal-1.3.1 async-timeout-4.0.3 charset-normalizer-3.3.2 datasets-2.20.0 dill-0.3.8 evaluate-0.4.0 frozenlist-1.4.1 huggingface-hub-0.23.4 joblib-1.4.2 multidict-6.0.5 multiprocess-0.70.16 nltk-3.8.1 pyarrow-16.1.0 pyarrow-hotfix-0.6 requests-2.32.3 responses-0.18.0 rouge-score-0.1.2 sentencepiece-0.1.99 tokenizers-0.13.3 tqdm-4.66.4 transformers-4.31.0 xxhash-3.4.1 yarl-1.9.4
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install git+https://github.com/amazon-science/mxeval.git@e09974f990eeaf0c0e8f2b5eaff4be66effb2c86
...
Successfully built mxeval fire
Installing collected packages: termcolor, fire, mxeval
ERROR: For req: mxeval==1.0. Invalid script entry point: <ExportEntry evaluate_functional_correctness = mxeval.evaluate_functional_correctness:None []> - A callable suffix is required. Cf https://packaging.python.org/specifications/entry-points/#use-for-scripts for more information.
If running on CPU:
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip show torch
Name: torch
Version: 2.2.0.dev20231010+cpu
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages
Requires: filelock, fsspec, jinja2, networkx, sympy, typing-extensions
Required-by: accelerate, torchaudio, torchvision
If running on GPU:
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install torch
...
Successfully installed nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.5.40 nvidia-nvtx-cu12-12.1.105 torch-2.3.1 triton-2.3.1 typing-extensions-4.12.2
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip show torch
Name: torch
Version: 2.3.1
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-nccl-cu12, nvidia-nvtx-cu12, sympy, triton, typing-extensions
Required-by: accelerate, torchaudio, torchvision
We also need pandas to run the experiments:
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip show pandas
WARNING: Package(s) not found: pandas
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install pandas
Defaulting to user installation because normal site-packages is not writeable
Collecting pandas
Downloading pandas-2.2.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.1/13.1 MB 21.2 MB/s eta 0:00:00
Requirement already satisfied: pytz>=2020.1 in /usr/lib/python3/dist-packages (from pandas) (2022.1)
Collecting tzdata>=2022.7
Downloading tzdata-2024.1-py2.py3-none-any.whl (345 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 345.4/345.4 KB 36.8 MB/s eta 0:00:00
Requirement already satisfied: numpy>=1.22.4 in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from pandas) (1.24.1)
Collecting python-dateutil>=2.8.2
Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 229.9/229.9 KB 19.8 MB/s eta 0:00:00
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.8.2->pandas) (1.16.0)
Installing collected packages: tzdata, python-dateutil, pandas
Successfully installed pandas-2.2.2 python-dateutil-2.9.0.post0 tzdata-2024.1
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install transformers
...
Successfully installed tokenizers-0.19.1 transformers-4.41.2
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ export CUR_DIR=${PWD}
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ cd ../../loadgen/
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/loadgen$ python3 -m pip install .
Defaulting to user installation because normal site-packages is not writeable
Processing /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/loadgen
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: mlperf_loadgen
Building wheel for mlperf_loadgen (pyproject.toml) ... done
Created wheel for mlperf_loadgen: filename=mlperf_loadgen-4.0-cp39-cp39-linux_x86_64.whl size=418285 sha256=714f5348ab9db3d520b72bf3c333a038787394cd31586e4b77a77f3a065f9e16
Stored in directory: /tmp/pip-ephem-wheel-cache-uo3qp3wa/wheels/35/c2/51/339102eab2197cf953ad0a1e30c6fca1f22390f8702f2e0b21
Successfully built mlperf_loadgen
Installing collected packages: mlperf_loadgen
Successfully installed mlperf_loadgen-4.0
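A quick sanity check that the freshly built wheel is importable (hypothetical, not part of the transcript above):

# Hypothetical sanity check: the wheel built above installs the
# mlperf_loadgen module; this fails with ImportError if it did not.
import mlperf_loadgen
print("mlperf_loadgen imported OK")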
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ sudo -v ; curl https://rclone.org/install.sh | sudo bash
Enter password for mmirkina (QUALPASS):
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 4734 100 4734 0 0 5488 0 --:--:-- --:--:-- --:--:-- 5485
The latest version of rclone rclone v1.67.0 is already installed.
- run the following command to authenticate with the bucket
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ rclone config create mlc-inference s3 provider=Cloudflare access_key_id=f65ba5eef400db161ea49967de89f47b secret_access_key=fbea333914c292b854f14d3fe232bad6c5407bf0ab1bebf78833c2b359bdfd2b endpoint=https://c2686074cb2caf5cbaf6d134bdba8b47.r2.cloudflarestorage.com
[mlc-inference]
type = s3
access_key_id = f65ba5eef400db161ea49967de89f47b
secret_access_key = fbea333914c292b854f14d3fe232bad6c5407bf0ab1bebf78833c2b359bdfd2b
endpoint = https://c2686074cb2caf5cbaf6d134bdba8b47.r2.cloudflarestorage.com
provider = Cloudflare
- download the model checkpoint
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624$ time rclone copy mlc-inference:mlcommons-inference-wg-public/mixtral_8x7b/mixtral-8x7b-instruct-v0.1 ./mixtral-8x7b-instruct-v0.1 -P
Transferred: 173.982 GiB / 173.982 GiB, 100%, 18.288 MiB/s, ETA 0s
Transferred: 42 / 42, 100%
Elapsed time: 36m6.7s
real 36m6.834s
user 11m57.130s
sys 8m50.341s
Results:
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624$ ls -la mixtral-8x7b-instruct-v0.1/
total 182433204
drwxr-xr-x 2 mmirkina docker 4096 Jun 27 09:38 .
drwxr-xr-x 3 mmirkina docker 4096 Jun 27 09:02 ..
-rw-r--r-- 1 mmirkina docker 803 Jun 24 17:04 config.json
-rw-r--r-- 1 mmirkina docker 111 Jun 24 17:04 generation_config.json
-rw-r--r-- 1 mmirkina docker 4920052720 Jun 24 17:04 model-00001-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00002-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00003-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00004-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00005-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504264 Jun 24 17:05 model-00006-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559912 Jun 24 17:05 model-00007-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:05 model-00008-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:05 model-00009-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:06 model-00010-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:06 model-00011-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4999646240 Jun 24 17:06 model-00012-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4798417968 Jun 24 17:06 model-00013-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00014-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00015-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00016-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00017-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:08 model-00018-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504280 Jun 24 17:08 model-00019-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:08 model-00020-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:08 model-00021-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:09 model-00022-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:09 model-00023-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:09 model-00024-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504280 Jun 24 17:09 model-00025-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00026-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00027-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00028-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00029-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:11 model-00030-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504280 Jun 24 17:11 model-00031-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:11 model-00032-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:11 model-00033-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:12 model-00034-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:12 model-00035-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:12 model-00036-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4999646264 Jun 24 17:12 model-00037-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4798417968 Jun 24 17:13 model-00038-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 1463862216 Jun 24 17:13 model-00039-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 92659 Jun 24 17:13 model.safetensors.index.json
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference$ mkdir dataset
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference$ chmod 775 dataset
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference$ cd dataset/
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset$ sudo -v ; curl https://rclone.org/install.sh | sudo bash
Enter password for mmirkina (QUALPASS):
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 4734 100 4734 0 0 6199 0 --:--:-- --:--:-- --:--:-- 6196
The latest version of rclone rclone v1.67.0 is already installed.
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset$ rclone copyurl https://inference.mlcommons-storage.org/mixtral_8x7b%2F2024.06.06_mixtral_15k_v4.pkl ./ -a -P
Transferred: 68.439 MiB / 68.439 MiB, 100%, 46.237 MiB/s, ETA 0s
Transferred: 1 / 1, 100%
Elapsed time: 1.8s
We don't need the calibration dataset for accuracy/performance runs.
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ time python3 -u main.py --scenario Offline --mlperf-conf mlperf.conf --user-conf user.conf --total-sample-count 15 --model-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --dataset-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset/2024.06.06_mixtral_15k_v4.pkl --output-log-dir offline-logs --dtype float32 --device cuda:0 2>&1 | tee offline_performance_log.log
Traceback (most recent call last):
File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/main.py", line 168, in <module>
main()
File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/main.py", line 135, in main
sut = sut_cls(
File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/SUT.py", line 152, in __init__
self.data_object = Dataset(self.model_path,
File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/dataset.py", line 31, in __init__
self.load_tokenizer()
File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/dataset.py", line 39, in load_tokenizer
self.tokenizer = AutoTokenizer.from_pretrained(
File "/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 902, in from_pretrained
return tokenizer_class_py.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2094, in from_pretrained
raise EnvironmentError(
OSError: Can't load tokenizer for '/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure '/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/' is the correct path to a directory containing all relevant files for a LlamaTokenizer tokenizer.
real 0m6.223s
user 0m3.351s
sys 0m9.660s
The cause of this issue is that the tokenizer files are missing from the downloaded model checkpoint:
tokenizer.json
tokenizer.model
tokenizer_config.json
These files are available at https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1/tree/main.
So log in to Hugging Face, download these files to a Windows machine, then open cmd and copy them to apollo using scp:
C:\Users\mmirkina\Downloads>scp tokenizer* mmirkina@aus655-apollo-0:
...
tokenizer.json 100% 1753KB 1.6MB/s 00:01
tokenizer.model 100% 482KB 3.4MB/s 00:00
tokenizer_config.json 100% 1466 12.4KB/s 00:00
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1$ cp /usr2/mmirkina/token* ./
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1$ ls -la
total 182435448
drwxr-xr-x 2 mmirkina docker 4096 Jun 27 17:41 .
drwxr-xr-x 4 mmirkina docker 4096 Jun 27 17:20 ..
-rw-r--r-- 1 mmirkina docker 803 Jun 24 17:04 config.json
-rw-r--r-- 1 mmirkina docker 111 Jun 24 17:04 generation_config.json
-rw-r--r-- 1 mmirkina docker 4920052720 Jun 24 17:04 model-00001-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00002-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00003-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00004-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00005-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504264 Jun 24 17:05 model-00006-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559912 Jun 24 17:05 model-00007-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:05 model-00008-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:05 model-00009-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:06 model-00010-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:06 model-00011-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4999646240 Jun 24 17:06 model-00012-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4798417968 Jun 24 17:06 model-00013-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00014-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00015-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00016-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00017-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:08 model-00018-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504280 Jun 24 17:08 model-00019-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:08 model-00020-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:08 model-00021-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:09 model-00022-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:09 model-00023-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:09 model-00024-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504280 Jun 24 17:09 model-00025-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00026-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00027-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00028-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00029-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:11 model-00030-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504280 Jun 24 17:11 model-00031-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:11 model-00032-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:11 model-00033-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:12 model-00034-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:12 model-00035-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:12 model-00036-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4999646264 Jun 24 17:12 model-00037-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4798417968 Jun 24 17:13 model-00038-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 1463862216 Jun 24 17:13 model-00039-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 92659 Jun 24 17:13 model.safetensors.index.json
-rw-r--r-- 1 mmirkina docker 1466 Jun 27 17:41 tokenizer_config.json
-rw-r--r-- 1 mmirkina docker 1795303 Jun 27 17:41 tokenizer.json
-rw-r--r-- 1 mmirkina docker 493443 Jun 27 17:41 tokenizer.model
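As an alternative to the Windows/scp detour above, the tokenizer files could be fetched directly on the host with huggingface_hub (a hypothetical sketch, not what was done here; the repository is gated, so a Hugging Face access token with accepted license terms is required):

# Hypothetical alternative: pull only the tokenizer files straight
# into the checkpoint directory, skipping the manual browser download.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",
    allow_patterns=["tokenizer*"],  # tokenizer.json, tokenizer.model, tokenizer_config.json
    local_dir="/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1",
    token="hf_...",  # placeholder for a real access token
)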
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ time python3 -u main.py --scenario Offline --mlperf-conf mlperf.conf --user-conf user.conf --total-sample-count 15 --model-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --dataset-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset/2024.06.06_mixtral_15k_v4.pkl --output-log-dir offline-logs --dtype float32 --device cuda:0 2>&1 | tee offline_performance_log.log
...
Loading dataset...
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Finished loading dataset.
Loading checkpoint shards: 100%|██████████| 39/39 [00:57<00:00, 1.47s/it]
Loaded model
Loaded tokenizer
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Starting Benchmark run
IssueQuery started with 15000 samples
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:563: UserWarning: `num_beams` is set to 1. However, `early_stopping` is set to `True` -- this flag
is only used in beam-based generation modes. You should set `num_beams>1` or unset `early_stopping`.
warnings.warn(
IssueQuery done
Saving outputs to run_outputs/q13.pkl
Samples run: 1
BatchMaker time: 0.031223773956298828
Inference time: 139.80554151535034
Postprocess time: 0.0006673336029052734
==== Total time: 139.83743262290955
Saving outputs to run_outputs/q11.pkl
Samples run: 2
BatchMaker time: 0.00020241737365722656
Inference time: 338.96012592315674
Postprocess time: 0.000946044921875
==== Total time: 338.96127438545227
Saving outputs to run_outputs/q10.pkl
Samples run: 3
BatchMaker time: 0.00020623207092285156
Inference time: 378.08090806007385
Postprocess time: 0.0004837512969970703
==== Total time: 378.0815980434418
Saving outputs to run_outputs/q7.pkl
Samples run: 4
BatchMaker time: 0.0001933574676513672
Inference time: 142.67389917373657
Postprocess time: 0.0005340576171875
==== Total time: 142.6746265888214
Saving outputs to run_outputs/q5.pkl
...
Samples run: 116
BatchMaker time: 0.0001952648162841797
Inference time: 138.5476894378662
Postprocess time: 0.0007069110870361328
==== Total time: 138.54859161376953
Saving outputs to run_outputs/q3.pkl
Samples run: 117
BatchMaker time: 0.00021195411682128906
Inference time: 255.1372139453888
Postprocess time: 0.0006382465362548828
==== Total time: 255.13806414604187
^C
^C
^C
^C
real 587m51.182s
user 585m56.346s
The run was interrupted because the full experiment takes a very long time.
Passing --total-sample-count 15 as an input parameter did not influence the number of samples in performance mode; see the comment at
https://github.com/mlcommons/inference/blob/master/language/mixtral-8x7b/main.py#L67
Setting --total-sample-count works correctly for accuracy experiments.
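In performance mode the sample count comes from loadgen's min_query_count setting rather than from --total-sample-count, so a user.conf override along these lines might shorten the run (an unverified assumption: loadgen restricts which keys user.conf may override, and a run shortened this way would not be valid for submission):

# Hypothetical user.conf override to shorten an Offline performance run;
# min_duration is in milliseconds.
mixtral-8x7b.Offline.min_query_count = 15
mixtral-8x7b.Offline.min_duration = 60000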
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ OUTPUT_LOG_DIR=offline-accuracy-logs
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ mkdir -p "run_outputs"
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ time python3 -u main.py --scenario Offline --model-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --accuracy --mlperf-conf mlperf.conf --user-conf user.conf --total-sample-count 10 --dataset-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset/2024.06.06_mixtral_15k_v4.pkl --output-log-dir ${OUTPUT_LOG_DIR} --device cuda:0
WARNING:Mixtral-8x7B-Instruct-v0.1-MAIN:Accuracy run will generate the accuracy logs, but the evaluation of the log is not completed yet
Loading dataset...
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Finished loading dataset.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 39/39 [00:19<00:00, 2.00it/s]
Loaded model
Loaded tokenizer
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Starting Benchmark run
IssueQuery started with 10 samples
IssueQuery done
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:563: UserWarning: `num_beams` is set to 1. However, `early_stopping` is set to `True` -- this flag
is only used in beam-based generation modes. You should set `num_beams>1` or unset `early_stopping`.
warnings.warn(
.Saving outputs to run_outputs/q8.pkl
Samples run: 1
BatchMaker time: 0.0005521774291992188
Inference time: 760.4167733192444
Postprocess time: 0.0010259151458740234
==== Total time: 760.4183514118195
Saving outputs to run_outputs/q9.pkl
Samples run: 2
BatchMaker time: 0.0002186298370361328
Inference time: 255.86129760742188
Postprocess time: 0.0005745887756347656
==== Total time: 255.86209082603455
Saving outputs to run_outputs/q7.pkl
Samples run: 3
BatchMaker time: 0.0002162456512451172
Inference time: 143.11561179161072
Postprocess time: 0.0005106925964355469
==== Total time: 143.1163387298584
Saving outputs to run_outputs/q6.pkl
Samples run: 4
BatchMaker time: 0.00020933151245117188
Inference time: 179.33479189872742
Postprocess time: 0.0005283355712890625
==== Total time: 179.33552956581116
Saving outputs to run_outputs/q0.pkl
Samples run: 5
BatchMaker time: 0.00020623207092285156
Inference time: 302.74939727783203
Postprocess time: 0.0005166530609130859
==== Total time: 302.75012016296387
Saving outputs to run_outputs/q1.pkl
Samples run: 6
BatchMaker time: 0.00020122528076171875
Inference time: 176.75450086593628
Postprocess time: 0.0005507469177246094
==== Total time: 176.75525283813477
Saving outputs to run_outputs/q4.pkl
Samples run: 7
BatchMaker time: 0.0002200603485107422
Inference time: 138.0684790611267
Postprocess time: 0.0005571842193603516
==== Total time: 138.06925630569458
Saving outputs to run_outputs/q5.pkl
Samples run: 8
BatchMaker time: 0.00020170211791992188
Inference time: 154.93840098381042
Postprocess time: 0.0005500316619873047
==== Total time: 154.93915271759033
Saving outputs to run_outputs/q3.pkl
Samples run: 9
BatchMaker time: 0.00019693374633789062
Inference time: 254.17931580543518
Postprocess time: 0.0005474090576171875
==== Total time: 254.18006014823914
Saving outputs to run_outputs/q2.pkl
Samples run: 10
BatchMaker time: 0.0002086162567138672
Inference time: 340.03823614120483
Postprocess time: 0.0005693435668945312
==== Total time: 340.03901410102844
No warnings encountered during test.
No errors encountered during test.
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Run Completed!
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Destroying SUT...
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Destroying QSL...
real 45m41.087s
user 45m36.972s
sys 0m37.635s
But we hit an issue when running evaluate-accuracy.py to compute the accuracy (some debug printing was added):
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ time python3 evaluate-accuracy.py --checkpoint-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --mlperf-accuracy-file ${ACCURACY_LOG_FILE} --dataset-file /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset/2024.06.06_mixtral_15k_v4.pkl --dtype int32
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
[nltk_data] Downloading package punkt to
[nltk_data] /local/mnt/workspace/mmirkina/nltk_data...
[nltk_data] Package punkt is already up-to-date!
DEBUG: data = dataset id question input ... stop_sequence tok_stop_sequence tok_input_len tok_ref_output_len
0 GSM8K train.548 Gary manages two Amazon distribution centers. ... <s> [INST] As an expert problem solver solve s... ... </s> [2] 657 174
1 GSM8K train.6592 The square footage of the two bedrooms in the ... <s> [INST] As an expert problem solver solve s... ... </s> [2] 657 118
2 GSM8K train.6644 Thomas, Toby, and Rebecca worked a total of 15... <s> [INST] As an expert problem solver solve s... ... </s> [2] 662 224
3 GSM8K train.3596 Two-thirds of the class have brown eyes. Half ... <s> [INST] As an expert problem solver solve s... ... </s> [2] 648 96
4 GSM8K train.5034 Jackie spends 8 hours working, 3 hours of exer... <s> [INST] As an expert problem solver solve s... ... </s> [2] 634 75
... ... ... ... ... ... ... ... ... ...
14995 MBXP javascript_sumDigitsTwoparts /**\n * * Write a function to divide a number ... <s> [INST] Complete the following code. Be con... ... \n```\n [13, 13940, 28832, 13] 137 284
14996 MBXP javascript_palindromeLambda /**\n * * Write a function to find palindromes... <s> [INST] Complete the following code. Be con... ... \n```\n [13, 13940, 28832, 13] 192 38
14997 MBXP javascript_removeTuples /**\n * * Write a function to remove all the t... <s> [INST] Complete the following code. Be con... ... \n```\n [13, 13940, 28832, 13] 282 35
14998 MBXP javascript_posNos /**\n * * Write a JavaScript function to print... <s> [INST] Complete the following code. Be con... ... \n```\n [13, 13940, 28832, 13] 142 31
14999 MBXP javascript_tupleToDict /**\n * * Write a function to convert the give... <s> [INST] Complete the following code. Be con... ... \n```\n [13, 13940, 28832, 13] 208 75
[15000 rows x 12 columns]
DEBUG: results(mlperf_accuracy_file) = [{'seq_id': 0, 'qsl_idx': 8, 'data': '610C00000000000046700000000000002970000000000000CF38000000000000100100000000000030100000000000002E010000000000000801000000000000D82C00000000000086010000000000003E0100000000000
03001000000000000100100000000000030100000000000002E0100000000000008010000000000004C170000000000002E01000000000000514100000000000086010000000000006F01000000000000337000000000000021700000000000000D000000000000000D00000000000000480D000000000000100100000000
00008C0A0000000000003570000000000000DE010000000000006903000000000000710100000000000021700000000000004E700000000000003F7000000000000088020000000000002170000000000000627000000000000051700000000000004701000000000000217000000000000044700000000000004E7000000
00000003E70000000000000337000000000000021700000000000000D000000000000000D00000000000000351D00000000000070010000000000009D1000000000000020020000000000002170000000000000627000000000000092310000000000002E0100000000000051410000000000003570000000000000700100
00000000008E020000000000008A4E0000000000001E0100000000000021700000000000004E700000000000006E7000000000000097700000000000002E01000000000000FF020000000000007001000000000000362A00000000000013170000000000006201000000000000C2020000000000003370000000000000530
3000000000000090B000000000000710100000000000021700000000000003E7000000000000033700000000000004E700000000000006E700000000000009E010000000000004070000000000000217000000000000062700000000000005170000000000000470100000000000021700000000000003E70000000000000
337000000000000073700000000000006E70000000000000517000000000000093010000000000008A4E0000000000001E01000000000000337000000000000021700000000000000D000000000000000D000000000000001B5E00000000000010010000000000001E0C000000000000E60D0000000000007001000000000
00013170000000000009301000000000000217000000000000044700000000000004E700000000000003E7000000000000035700000000000001001000000000000E60D000000000000700100000000000013170000000000006201000000000000100100000000000051410000000000002C06000000000000FA01000000
000000EE02000000000000217000000000000044700000000000004E700000000000003E70000000000000830100000000000021700000000000004E700000000000003F70000000000000337000000000000021700000000000000D000000000000000D00000000000000161400000000000035700000000000002170000
0000000003E70000000000000337000000000000073700000000000006E7000000000000051700000000000004701000000000000217000000000000044700000000000004E700000000000003E70000000000000830100000000000021700000000000004E700000000000003F7000000000000033700000000000002170
0000000000000D000000000000000D000000000000001F21000000000000DE01000000000000FA01000000000000DD03000000000000D5300000000000008B01000000000000DD03000000000000DD220000000000004B700000000000000D000000000000000D000000000000004E700000000000003F700000000000008
8020000000000002170000000000000627000000000000051700000000000004701000000000000217000000000000044700000000000004E700000000000003E700000000000000D000000000000003E70000000000000337000000000000073700000000000006E70000000000000517000000000000047010000000000
00217000000000000044700000000000004E700000000000003E70000000000000830100000000000021700000000000004E700000000000003F700000000000000D000000000000000D0000000000000014090000000000001D02000000000000112F000000000000C80100000000000033060000000000002E010000000
00000D5300000000000002A0100000000000014050000000000001001000000000000FD0B0000000000002E010000000000003E0100000000000030010000000000006F01000000000000337000000000000021700000000000000D000000000000000D00000000000000411D000000000000357000000000000042050000
000000004670000000000000297000000000000031260000000000007C010000000000003E01000000000000290100000000000010010000000000008C0600000000000049220000000000004B700000000000000D000000000000000D000000000000003F700000000000004701000000000000450100000000000044700
000000000004E700000000000003E70000000000000830100000000000021700000000000003E70000000000000337000000000000073700000000000006E7000000000000051700000000000003B70000000000000DC0200000000000021700000000000004E700000000000000D000000000000000D000000000000001F
210000000000003570000000000000435D000000000000C801000000000000961600000000000062010000000000003E010000000000000A0300000000000010010000000000008B0300000000000049220000000000004B700000000000000D000000000000000D000000000000004E70000000000000580700000000000
044700000000000004E700000000000003E70000000000000830100000000000021700000000000003E70000000000000337000000000000073700000000000006E7000000000000051700000000000003B70000000000000DC0200000000000021700000000000004E700000000000003B70000000000000880200000000
00002170000000000000627000000000000051700000000000004701000000000000217000000000000044700000000000004E700000000000003E700000000000000D000000000000000D00000000000000821D000000000000C4010000000000002706000000000000C80100000000000049220000000000004B7000000
00000000D000000000000000D0000000000000044700000000000004E700000000000003E70000000000000830100000000000021700000000000003E70000000000000337000000000000073700000000000006E700000000000005170000000000000880200000000000021700000000000006270000000000000517000
00000000004701000000000000217000000000000044700000000000004E700000000000003E700000000000000D000000000000000D000000000000005159000000000000D901000000000000E1020000000000008F0D0000000000004B700000000000000D000000000000000D000000000000004E70000000000000337
00000000000004E700000000000006E700000000000005170000000000000470100000000000021700000000000003E700000000000000D000000000000000D00000000000000C331000000000000230200000000000018060000000000002021000000000000E60100000000000021700000000000004E70000000000000
33700000000000004E700000000000006E700000000000004B700000000000000D000000000000000D000000000000005170000000000000470100000000000021700000000000003E700000000000000D000000000000000D000000000000001F210000000000003570000000000000435D0000000000006F01000000000
000470100000000000021700000000000003E700000000000000A030000000000001001000000000000492200000000000062010000000000003E010000000000004B700000000000000D000000000000000D000000000000003F700000000000004701000000000000450100000000000044700000000000004E70000000
0000003E70000000000000830100000000000021700000000000003E70000000000000337000000000000073700000000000006E700000000000009E01000000000000407000000000000021700000000000003E700000000000003B70000000000000DC0200000000000021700000000000004E700000000000000D00000
0000000000D00000000000000821D000000000000C4010000000000002706000000000000C80100000000000049220000000000004B700000000000000D000000000000000D000000000000003F700000000000004701000000000000217000000000000044700000000000004E700000000000003E70000000000000DC02
00000000000021700000000000004E700000000000000D000000000000000D00000000000000821D000000000000C4010000000000002706000000000000C80100000000000049220000000000004B700000000000000D000000000000000D000000000000003F70000000000000470100000000000021700000000000007
0700000000000003E700000000000000D000000000000000D0000000000000016140000000000003570000000000000100100000000000030100000000000002E010000000000000801000000000000D82C0000000000005D010000000000003E010000000000004701000000000000217000000000000070700000000000
003E7000000000000033700000000000009F0100000000000014110000000000005D01000000000000217000000000000070700000000000003E7000000000000033700000000000000200000000000000', 'token_count': 448}, {'seq_id': 1, 'qsl_idx': 9, 'data': '
...
DEBUG: query_type = GSM8K
DEBUG: query_type = GSM8K
DEBUG: query_type = GSM8K
DEBUG: query_type = GSM8K
DEBUG: query_type = GSM8K
DEBUG: query_type = GSM8K
DEBUG: query_type = GSM8K
DEBUG: query_type = GSM8K
DEBUG: query_type = GSM8K
DEBUG: query_type = GSM8K
DEBUG: preds_token_OpenOrca = []
DEBUG: preds_decoded_text = []
DEBUG: target_required_OpenOrca = []
DEBUG: preds, targets = [] []
DEBUG: model: dict_items([('predictions', []), ('references', [])])
Traceback (most recent call last):
File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/evaluate-accuracy.py", line 221, in <module>
main()
File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/evaluate-accuracy.py", line 182, in main
result = metric.compute(
File "/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/evaluate/module.py", line 432, in compute
self.add_batch(**inputs)
File "/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/evaluate/module.py", line 480, in add_batch
self.selected_feature_format = self._infer_feature_from_batch(batch)
File "/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/evaluate/module.py", line 552, in _infer_feature_from_batch
example = dict([(k, v[0]) for k, v in batch.items()])
File "/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/evaluate/module.py", line 552, in <listcomp>
example = dict([(k, v[0]) for k, v in batch.items()])
IndexError: list index out of range
real 0m8.781s
user 0m5.032s
sys 0m9.812s
The cause of this issue is that evaluate-accuracy.py works over the full dataset, while we only have results for a short run (--total-sample-count 10). evaluate-accuracy.py evaluates all three dataset types (GSM8K, OpenOrca and MBXP), but we have results only for 10 GSM8K samples. The full dataset contains 5000 GSM8K samples, 5000 OpenOrca samples and 5000 MBXP samples.
If the code for the OpenOrca and MBXP parts is commented out, we get:
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 evaluate-accuracy.py --checkpoint-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --mlperf-accuracy-file ${ACCURACY_LOG_FILE} --dataset-file /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset/2024.06.06_mixtral_15k_v4.pkl --dtype int32
...
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
[nltk_data] Downloading package punkt to
[nltk_data] /local/mnt/workspace/mmirkina/nltk_data...
[nltk_data] Package punkt is already up-to-date!
DEBUG: data = dataset id question input ... stop_sequence tok_stop_sequence tok_input_len tok_ref_output_len
0 GSM8K train.548 Gary manages two Amazon distribution centers. ... <s> [INST] As an expert problem solver solve s... ... </s> [2] 657 174
1 GSM8K train.6592 The square footage of the two bedrooms in the ... <s> [INST] As an expert problem solver solve s... ... </s> [2] 657 118
2 GSM8K train.6644 Thomas, Toby, and Rebecca worked a total of 15... <s> [INST] As an expert problem solver solve s... ... </s> [2] 662 224
3 GSM8K train.3596 Two-thirds of the class have brown eyes. Half ... <s> [INST] As an expert problem solver solve s... ... </s> [2] 648 96
4 GSM8K train.5034 Jackie spends 8 hours working, 3 hours of exer... <s> [INST] As an expert problem solver solve s... ... </s> [2] 634 75
... ... ... ... ... ... ... ... ...
14995 MBXP javascript_sumDigitsTwoparts /**\n * * Write a function to divide a number ... <s> [INST] Complete the following code. Be con... ... \n```\n [13, 13940, 28832, 13] 137 284
14996 MBXP javascript_palindromeLambda /**\n * * Write a function to find palindromes... <s> [INST] Complete the following code. Be con... ... \n```\n [13, 13940, 28832, 13] 192 38
14997 MBXP javascript_removeTuples /**\n * * Write a function to remove all the t... <s> [INST] Complete the following code. Be con... ... \n```\n [13, 13940, 28832, 13] 282 35
14998 MBXP javascript_posNos /**\n * * Write a JavaScript function to print... <s> [INST] Complete the following code. Be con... ... \n```\n [13, 13940, 28832, 13] 142 31
14999 MBXP javascript_tupleToDict /**\n * * Write a function to convert the give... <s> [INST] Complete the following code. Be con... ... \n```\n [13, 13940, 28832, 13] 208 75
[15000 rows x 12 columns]
...
DEBUG: query_type = GSM8K
DEBUG 111
DEBUG: query_type = GSM8K
DEBUG 111
DEBUG: query_type = GSM8K
DEBUG 111
DEBUG: query_type = GSM8K
DEBUG 111
DEBUG: query_type = GSM8K
DEBUG 111
DEBUG: query_type = GSM8K
DEBUG 111
DEBUG: query_type = GSM8K
DEBUG 111
DEBUG: query_type = GSM8K
DEBUG 111
DEBUG: query_type = GSM8K
DEBUG 111
DEBUG: query_type = GSM8K
DEBUG 111
DEBUG: gsm8k_total = 10
DEBUG: tgt = 60.0
DEBUG: tgt = 36.0
DEBUG: tgt = 58.0
DEBUG: tgt = 4.0
DEBUG: tgt = 14000.0
DEBUG: tgt = 120.0
DEBUG: tgt = 5.0
DEBUG: tgt = 46.0
DEBUG: tgt = 18.0
DEBUG: tgt = 66.0
DEBUG: correct = 7
Results
{'gsm8k': 70.0, 'gen_len': 0.0, 'gen_num': 10, 'gen_tok_len': 3174, 'tokens_per_sample': 317.4}
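Here gsm8k = 100 * correct / gsm8k_total = 100 * 7 / 10 = 70.0, and tokens_per_sample = gen_tok_len / gen_num = 3174 / 10 = 317.4.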
Next, create a dataset with 15 samples: 5 GSM8K, 5 OpenOrca and 5 MBXP samples. The dataset file is named mixtral_15.pkl; a sketch of how it can be built follows below.
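A hypothetical sketch of building such a subset from the full pickle (the "dataset" column and the GSM8K/MBXP labels appear in the debug output above; the exact label for the OpenOrca rows is an assumption):

import pandas as pd

# Take the first 5 rows of each dataset type and save them as mixtral_15.pkl.
df = pd.read_pickle("2024.06.06_mixtral_15k_v4.pkl")
subset = pd.concat(
    [df[df["dataset"] == name].head(5) for name in ("GSM8K", "OpenOrca", "MBXP")]
)
subset.reset_index(drop=True).to_pickle("mixtral_15.pkl")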
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ time python3 -u main.py --scenario Offline --model-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --accuracy --mlperf-conf mlperf.conf --user-conf user.conf --total-sample-count 15 --dataset-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/download_dataset_15_samples/mixtral_15.pkl --output-log-dir ${OUTPUT_LOG_DIR} --device cuda:0
...
Samples run: 10
BatchMaker time: 0.0002014636993408203
Inference time: 218.228013753891
Postprocess time: 0.0005128383636474609
==== Total time: 218.22872805595398
Saving outputs to run_outputs/q4.pkl
Samples run: 11
BatchMaker time: 0.00020122528076171875
Inference time: 229.7228820323944
Postprocess time: 0.0005888938903808594
==== Total time: 229.72367215156555
Saving outputs to run_outputs/q3.pkl
Samples run: 12
BatchMaker time: 0.00020003318786621094
Inference time: 423.8593044281006
Postprocess time: 0.0005216598510742188
==== Total time: 423.8600261211395
Saving outputs to run_outputs/q9.pkl
Samples run: 13
BatchMaker time: 0.00019216537475585938
Inference time: 330.89380168914795
Postprocess time: 0.0005433559417724609
==== Total time: 330.8945372104645
Saving outputs to run_outputs/q12.pkl
Samples run: 14
BatchMaker time: 0.00019311904907226562
Inference time: 266.23928022384644
Postprocess time: 0.0004799365997314453
==== Total time: 266.23995327949524
Saving outputs to run_outputs/q0.pkl
Samples run: 15
BatchMaker time: 0.00022220611572265625
Inference time: 491.44899702072144
Postprocess time: 0.000576019287109375
==== Total time: 491.44979524612427
No warnings encountered during test.
No errors encountered during test.
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Run Completed!
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Destroying SUT...
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Destroying QSL...
real 103m14.302s
user 102m36.134s
sys 1m15.891s
That is roughly 6.9 minutes per sample.
The model runs on the CPU despite --device cuda:0 being set.
To run on the GPU, we need to pass --dtype float16 and add "cuda" as an allowed --device choice in main.py (https://github.com/mlcommons/inference/blob/master/language/mixtral-8x7b/main.py#L49); a sketch of such a change follows below.
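The change is along these lines (a hypothetical sketch, not the upstream code, which may differ in details):

import argparse

parser = argparse.ArgumentParser()
# "cuda" added as an allowed --device choice alongside the existing ones.
parser.add_argument(
    "--device",
    type=str,
    choices=["cpu", "cuda", "cuda:0"],
    default="cpu",
    help="Device to run the model on",
)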
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ time python3 -u main.py --scenario Offline --model-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --accuracy --mlperf-conf mlperf.conf --user-conf user.conf --total-sample-count 15 --dataset-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/download_dataset_15_samples/mixtral_15.pkl --output-log-dir ${OUTPUT_LOG_DIR} --device cuda --dtype float16
WARNING:Mixtral-8x7B-Instruct-v0.1-MAIN:Accuracy run will generate the accuracy logs, but the evaluation of the log is not completed yet
Loading dataset...
Finished loading dataset.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 39/39 [00:34<00:00, 1.12it/s]
Loaded model
Loaded tokenizer
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Starting Benchmark run
IssueQuery started with 15 samples
IssueQuery done
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:563: UserWarning: `num_beams` is set to 1. However, `early_stopping` is set to `True` -- this flag
is only used in beam-based generation modes. You should set `num_beams>1` or unset `early_stopping`.
warnings.warn(
Saving outputs to run_outputs/q13.pkl
Samples run: 1
BatchMaker time: 0.0005486011505126953
Inference time: 11.902294874191284
Postprocess time: 0.0007691383361816406
==== Total time: 11.903612613677979
...
Samples run: 14
BatchMaker time: 0.00011944770812988281
Inference time: 9.253939628601074
Postprocess time: 0.00037169456481933594
==== Total time: 9.254430770874023
Saving outputs to run_outputs/q0.pkl
Samples run: 15
BatchMaker time: 0.0001556873321533203
Inference time: 16.992162942886353
Postprocess time: 0.0005273818969726562
==== Total time: 16.99284601211548
No warnings encountered during test.
No errors encountered during test.
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Run Completed!
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Destroying SUT...
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Destroying QSL...
real 4m24.852s
user 22m44.229s
sys 7m23.684s
15 samples took 4 min 24 s, i.e. about 17.7 s per sample versus roughly 6.9 min per sample on the CPU. Evaluate the accuracy:
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 evaluate-accuracy.py --checkpoint-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --mlperf-accuracy-file ${ACCURACY_LOG_FILE} --dataset-file /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/download_dataset_15_samples/mixtral_15.pkl --dtype int32
...
Results
{'rouge1': 51.8093, 'rouge2': 23.1958, 'rougeL': 31.7219, 'rougeLsum': 48.2656, 'gsm8k': 80.0, 'mbxp': 20.0, 'gen_len': 4271, 'gen_num': 15, 'gen_tok_len': 4560, 'tokens_per_sample': 304.0}
To check that the run actually uses the GPU, use nvtop:
sudo apt install nvtop
nvtop
This completes running the reference code for mixtral-8x7b according to its README.md.