krai / axs2mlperf

Automated KRAI X workflows for reproducing MLPerf Inference submissions
https://krai.ai
MIT License

Run reference code for `mixtral-8x7b` #48

Closed by maria-18-git 2 months ago

maria-18-git commented 5 months ago

Run the reference code for mixtral-8x7b according to its README.md.

maria-18-git commented 5 months ago

1. Clone the repository with the reference code

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference$ git clone --recurse-submodules https://github.com/mlcommons/inference --depth 1
...
Receiving objects: 100% (27459/27459), 12.81 MiB | 20.96 MiB/s, done.
Resolving deltas: 100% (20403/20403), done.
Submodule path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass/tools/external/googletest': checked out '9077ec7efe5b652468ab051e93c67589d5cb8f85'
Submodule path 'vision/medical_imaging/3d-unet-brats19/nnUnet': checked out 'b38c69b345b2f60cd0d053039669e8f988b0c0af'

Directory with the reference code:

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ ls -la
total 100
drwxr-xr-x 2 mmirkina users  4096 Jun 19 06:56 .
drwxr-xr-x 6 mmirkina users  4096 Jun 19 06:56 ..
-rw-r--r-- 1 mmirkina users   342 Jun 19 06:56 build.sh
-rw-r--r-- 1 mmirkina users  3818 Jun 19 06:56 dataset.py
-rw-r--r-- 1 mmirkina users  1907 Jun 19 06:56 Dockerfile
-rw-r--r-- 1 mmirkina users  1811 Jun 19 06:56 Dockerfile.eval
-rw-r--r-- 1 mmirkina users  6565 Jun 19 06:56 evaluate-accuracy.py
-rw-r--r-- 1 mmirkina users  4445 Jun 19 06:56 evaluate_mbxp.py
-rw-r--r-- 1 mmirkina users  1085 Jun 19 06:56 launch.sh
-rw-r--r-- 1 mmirkina users  4490 Jun 19 06:56 main.py
-rw-r--r-- 1 mmirkina users  9328 Jun 19 06:56 README.md
-rw-r--r-- 1 mmirkina users   874 Jun 19 06:56 run_accuracy.sh
-rw-r--r-- 1 mmirkina users   382 Jun 19 06:56 run_offline.sh
-rw-r--r-- 1 mmirkina users   383 Jun 19 06:56 run_server.sh
-rw-r--r-- 1 mmirkina users 16660 Jun 19 06:56 SUT.py

2. Copy mlperf.conf to mixtral-8x7b

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ cp ../../mlperf.conf .

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ ls -la
total 104
drwxr-xr-x 2 mmirkina users  4096 Jun 19 07:03 .
drwxr-xr-x 6 mmirkina users  4096 Jun 19 06:56 ..
-rw-r--r-- 1 mmirkina users   342 Jun 19 06:56 build.sh
-rw-r--r-- 1 mmirkina users  3818 Jun 19 06:56 dataset.py
-rw-r--r-- 1 mmirkina users  1907 Jun 19 06:56 Dockerfile
-rw-r--r-- 1 mmirkina users  1811 Jun 19 06:56 Dockerfile.eval
-rw-r--r-- 1 mmirkina users  6565 Jun 19 06:56 evaluate-accuracy.py
-rw-r--r-- 1 mmirkina users  4445 Jun 19 06:56 evaluate_mbxp.py
-rw-r--r-- 1 mmirkina users  1085 Jun 19 06:56 launch.sh
-rw-r--r-- 1 mmirkina users  4490 Jun 19 06:56 main.py
-rw-r--r-- 1 mmirkina users  3996 Jun 19 07:03 mlperf.conf
-rw-r--r-- 1 mmirkina users  9328 Jun 19 06:56 README.md
-rw-r--r-- 1 mmirkina users   874 Jun 19 06:56 run_accuracy.sh
-rw-r--r-- 1 mmirkina users   382 Jun 19 06:56 run_offline.sh
-rw-r--r-- 1 mmirkina users   383 Jun 19 06:56 run_server.sh
-rw-r--r-- 1 mmirkina users 16660 Jun 19 06:56 SUT.py
-rw-r--r-- 1 mmirkina users   234 Jun 19 06:56 user.conf

3. Set the python3 version to 3.9

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 --version
Python 3.8.19
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3.9 --version
Python 3.9.19
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 1
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ sudo update-alternatives --set python3 /usr/bin/python3.9
update-alternatives: using /usr/bin/python3.9 to provide /usr/bin/python3 (python3) in manual mode
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 --version
Python 3.9.19
maria-18-git commented 5 months ago

4. Install Python packages

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install pybind11==2.10.4
Defaulting to user installation because normal site-packages is not writeable
Collecting pybind11==2.10.4
  Downloading pybind11-2.10.4-py3-none-any.whl (222 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 222.3/222.3 KB 1.3 MB/s eta 0:00:00
Installing collected packages: pybind11
  WARNING: The script pybind11-config is installed in '/local/mnt/workspace/mmirkina/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pybind11-2.10.4

If you want to use the CPU:

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install torch==2.2.0.dev20231006+cpu --index-url https://download.pytorch.org/whl/nightly/cpu
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://download.pytorch.org/whl/nightly/cpu
ERROR: Could not find a version that satisfies the requirement torch==2.2.0.dev20231006+cpu (from versions: 2.2.0.dev20231010+cpu, 2.4.0.dev20240421+cpu, 2.4.0.dev20240422+cpu, 2.4.0.dev20240423+cpu, 2.4.0.dev20240424+cpu, 2.4.0.dev20240425+cpu, 2.4.0.dev20240426+cpu, 2.4.0.dev20240427+cpu, 2.4.0.dev20240428+cpu, 2.4.0.dev20240429+cpu, 2.4.0.dev20240430+cpu, 2.4.0.dev20240501+cpu, 2.4.0.dev20240502+cpu, 2.4.0.dev20240503+cpu, 2.4.0.dev20240504+cpu, 2.4.0.dev20240505+cpu, 2.4.0.dev20240506+cpu, 2.4.0.dev20240507+cpu, 2.4.0.dev20240508+cpu, 2.4.0.dev20240509+cpu, 2.4.0.dev20240510+cpu, 2.4.0.dev20240511+cpu, 2.4.0.dev20240512+cpu, 2.4.0.dev20240513+cpu, 2.4.0.dev20240514+cpu, 2.4.0.dev20240515+cpu, 2.4.0.dev20240516+cpu, 2.4.0.dev20240517+cpu, 2.4.0.dev20240518+cpu, 2.4.0.dev20240519+cpu, 2.4.0.dev20240520+cpu, 2.4.0.dev20240521+cpu, 2.4.0.dev20240522+cpu, 2.4.0.dev20240523+cpu, 2.4.0.dev20240524+cpu, 2.4.0.dev20240525+cpu, 2.4.0.dev20240526+cpu, 2.4.0.dev20240527+cpu, 2.4.0.dev20240528+cpu, 2.4.0.dev20240529+cpu, 2.4.0.dev20240530+cpu, 2.4.0.dev20240531+cpu, 2.4.0.dev20240601+cpu, 2.4.0.dev20240602+cpu, 2.4.0.dev20240603+cpu, 2.4.0.dev20240604+cpu, 2.4.0.dev20240605+cpu, 2.4.0.dev20240606+cpu, 2.4.0.dev20240607+cpu, 2.4.0.dev20240608+cpu, 2.4.0.dev20240609+cpu, 2.4.0.dev20240610+cpu, 2.4.0.dev20240611+cpu, 2.4.0.dev20240612+cpu, 2.5.0.dev20240613+cpu, 2.5.0.dev20240614+cpu, 2.5.0.dev20240615+cpu, 2.5.0.dev20240616+cpu, 2.5.0.dev20240617+cpu, 2.5.0.dev20240618+cpu, 2.5.0.dev20240619+cpu)
ERROR: No matching distribution found for torch==2.2.0.dev20231006+cpu

This exact version is no longer available:

torch==2.2.0.dev20231006+cpu

The nightly index at https://download.pytorch.org/whl/nightly/torch/ currently offers only these builds:

torch-2.2.0.dev20231010+cpu.cxx11.abi-cp310-cp310-linux_x86_64.whl
torch-2.2.0.dev20231010+cpu.cxx11.abi-cp311-cp311-linux_x86_64.whl
torch-2.2.0.dev20231010+cpu.cxx11.abi-cp38-cp38-linux_x86_64.whl
torch-2.2.0.dev20231010+cpu.cxx11.abi-cp39-cp39-linux_x86_64.whl

So we use torch==2.2.0.dev20231010+cpu instead:

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install torch==2.2.0.dev20231010+cpu --index-url https://download.pytorch.org/whl/nightly/cpu
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://download.pytorch.org/whl/nightly/cpu
Collecting torch==2.2.0.dev20231010+cpu
  Downloading https://download.pytorch.org/whl/nightly/cpu/torch-2.2.0.dev20231010%2Bcpu-cp39-cp39-linux_x86_64.whl (185.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 185.1/185.1 MB 10.5 MB/s eta 0:00:00
Requirement already satisfied: typing-extensions in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from torch==2.2.0.dev20231010+cpu) (4.4.0)
Requirement already satisfied: networkx in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from torch==2.2.0.dev20231010+cpu) (3.0)
Requirement already satisfied: filelock in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from torch==2.2.0.dev20231010+cpu) (3.9.0)
Requirement already satisfied: sympy in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from torch==2.2.0.dev20231010+cpu) (1.12)
Requirement already satisfied: fsspec in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from torch==2.2.0.dev20231010+cpu) (2023.12.2)
Requirement already satisfied: jinja2 in /usr/lib/python3/dist-packages (from torch==2.2.0.dev20231010+cpu) (3.0.3)
Requirement already satisfied: mpmath>=0.19 in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from sympy->torch==2.2.0.dev20231010+cpu) (1.3.0)
Installing collected packages: torch
  Attempting uninstall: torch
    Found existing installation: torch 2.1.2+cpu
    Uninstalling torch-2.1.2+cpu:
      Successfully uninstalled torch-2.1.2+cpu
  WARNING: The scripts convert-caffe2-to-onnx, convert-onnx-to-caffe2 and torchrun are installed in '/local/mnt/workspace/mmirkina/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchvision 0.16.2+cpu requires torch==2.1.2, but you have torch 2.2.0.dev20231010+cpu which is incompatible.
torchaudio 2.1.2+cpu requires torch==2.1.2, but you have torch 2.2.0.dev20231010+cpu which is incompatible.
Successfully installed torch-2.2.0.dev20231010+cpu
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install transformers==4.31.0 nltk==3.8.1 evaluate==0.4.0 absl-py==1.4.0 rouge-score==0.1.2 sentencepiece==0.1.99 accelerate==0.21.0
...
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchvision 0.16.2+cpu requires torch==2.1.2, but you have torch 2.2.0.dev20231010+cpu which is incompatible.
Successfully installed absl-py-1.4.0 accelerate-0.21.0 aiohttp-3.9.5 aiosignal-1.3.1 async-timeout-4.0.3 charset-normalizer-3.3.2 datasets-2.20.0 dill-0.3.8 evaluate-0.4.0 frozenlist-1.4.1 huggingface-hub-0.23.4 joblib-1.4.2 multidict-6.0.5 multiprocess-0.70.16 nltk-3.8.1 pyarrow-16.1.0 pyarrow-hotfix-0.6 requests-2.32.3 responses-0.18.0 rouge-score-0.1.2 sentencepiece-0.1.99 tokenizers-0.13.3 tqdm-4.66.4 transformers-4.31.0 xxhash-3.4.1 yarl-1.9.4
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install git+https://github.com/amazon-science/mxeval.git@e09974f990eeaf0c0e8f2b5eaff4be66effb2c86
...
Successfully built mxeval fire
Installing collected packages: termcolor, fire, mxeval
ERROR: For req: mxeval==1.0. Invalid script entry point: <ExportEntry evaluate_functional_correctness = mxeval.evaluate_functional_correctness:None []> - A callable suffix is required. Cf https://packaging.python.org/specifications/entry-points/#use-for-scripts for more information.
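
Note: despite this entry-point error, pip may already have copied the mxeval package itself into site-packages. A quick hedged check (not from the README) of whether the module is importable:

# check whether mxeval is importable despite the failed script installation
import importlib.util
print(importlib.util.find_spec("mxeval") is not None)  # True means the package landed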
maria-18-git commented 5 months ago

If running on CPU:

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip show torch
Name: torch
Version: 2.2.0.dev20231010+cpu
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages
Requires: filelock, fsspec, jinja2, networkx, sympy, typing-extensions
Required-by: accelerate, torchaudio, torchvision
maria-18-git commented 5 months ago

If running on GPU:

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install torch
...
Successfully installed nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.5.40 nvidia-nvtx-cu12-12.1.105 torch-2.3.1 triton-2.3.1 typing-extensions-4.12.2
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip show torch
Name: torch
Version: 2.3.1
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-nccl-cu12, nvidia-nvtx-cu12, sympy, triton, typing-extensions
Required-by: accelerate, torchaudio, torchvision
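
Before launching a long run, it is worth checking that this CUDA build can actually see the GPUs; the "Can't initialize NVML" warnings later in this thread suggest the run can silently fall back to the CPU. A minimal check (not part of the README):

# verify that the installed torch build can use the GPUs
import torch

print(torch.__version__)            # expect 2.3.1 for the GPU build
print(torch.cuda.is_available())    # False means inference will run on the CPU
if torch.cuda.is_available():
    print(torch.cuda.device_count(), torch.cuda.get_device_name(0))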

To run the experiments we also need pandas:

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip show pandas                                                                          
WARNING: Package(s) not found: pandas
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 -m pip install pandas
Defaulting to user installation because normal site-packages is not writeable
Collecting pandas
  Downloading pandas-2.2.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.1/13.1 MB 21.2 MB/s eta 0:00:00
Requirement already satisfied: pytz>=2020.1 in /usr/lib/python3/dist-packages (from pandas) (2022.1)
Collecting tzdata>=2022.7
  Downloading tzdata-2024.1-py2.py3-none-any.whl (345 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 345.4/345.4 KB 36.8 MB/s eta 0:00:00
Requirement already satisfied: numpy>=1.22.4 in /local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages (from pandas) (1.24.1)
Collecting python-dateutil>=2.8.2
  Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 229.9/229.9 KB 19.8 MB/s eta 0:00:00
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.8.2->pandas) (1.16.0)
Installing collected packages: tzdata, python-dateutil, pandas
Successfully installed pandas-2.2.2 python-dateutil-2.9.0.post0 tzdata-2024.1
maria-18-git commented 5 months ago

5. Install loadgen

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ export CUR_DIR=${PWD}
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ cd ../../loadgen/
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/loadgen$ python3 -m pip install .
Defaulting to user installation because normal site-packages is not writeable
Processing /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/loadgen
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: mlperf_loadgen
  Building wheel for mlperf_loadgen (pyproject.toml) ... done
  Created wheel for mlperf_loadgen: filename=mlperf_loadgen-4.0-cp39-cp39-linux_x86_64.whl size=418285 sha256=714f5348ab9db3d520b72bf3c333a038787394cd31586e4b77a77f3a065f9e16
  Stored in directory: /tmp/pip-ephem-wheel-cache-uo3qp3wa/wheels/35/c2/51/339102eab2197cf953ad0a1e30c6fca1f22390f8702f2e0b21
Successfully built mlperf_loadgen
Installing collected packages: mlperf_loadgen
Successfully installed mlperf_loadgen-4.0
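
A minimal import check that the freshly built wheel works (the TestSettings names below are the standard mlperf_loadgen API used by main.py):

# smoke-test the loadgen wheel
import mlperf_loadgen as lg

settings = lg.TestSettings()
settings.scenario = lg.TestScenario.Offline
settings.mode = lg.TestMode.PerformanceOnly
print("mlperf_loadgen imported, Offline scenario available")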
maria-18-git commented 5 months ago

6. Get model (checkpoint)

The latest version of rclone rclone v1.67.0 is already installed.

- Run the following command to authenticate with the bucket:

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ rclone config create mlc-inference s3 provider=Cloudflare access_key_id=f65ba5eef400db161ea49967de89f47b secret_access_key=fbea333914c292b854f14d3fe232bad6c5407bf0ab1bebf78833c2b359bdfd2b endpoint=https://c2686074cb2caf5cbaf6d134bdba8b47.r2.cloudflarestorage.com
[mlc-inference]
type = s3
access_key_id = f65ba5eef400db161ea49967de89f47b
secret_access_key = fbea333914c292b854f14d3fe232bad6c5407bf0ab1bebf78833c2b359bdfd2b
endpoint = https://c2686074cb2caf5cbaf6d134bdba8b47.r2.cloudflarestorage.com
provider = Cloudflare


- Download the model checkpoint:

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624$ time rclone copy mlc-inference:mlcommons-inference-wg-public/mixtral_8x7b/mixtral-8x7b-instruct-v0.1 ./mixtral-8x7b-instruct-v0.1 -P
Transferred:      173.982 GiB / 173.982 GiB, 100%, 18.288 MiB/s, ETA 0s
Transferred:           42 / 42, 100%
Elapsed time:     36m6.7s

real    36m6.834s
user    11m57.130s
sys     8m50.341s

Results:

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624$ ls -la mixtral-8x7b-instruct-v0.1/
total 182433204
drwxr-xr-x 2 mmirkina docker       4096 Jun 27 09:38 .
drwxr-xr-x 3 mmirkina docker       4096 Jun 27 09:02 ..
-rw-r--r-- 1 mmirkina docker        803 Jun 24 17:04 config.json
-rw-r--r-- 1 mmirkina docker        111 Jun 24 17:04 generation_config.json
-rw-r--r-- 1 mmirkina docker 4920052720 Jun 24 17:04 model-00001-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00002-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00003-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00004-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00005-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504264 Jun 24 17:05 model-00006-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559912 Jun 24 17:05 model-00007-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:05 model-00008-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:05 model-00009-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:06 model-00010-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:06 model-00011-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4999646240 Jun 24 17:06 model-00012-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4798417968 Jun 24 17:06 model-00013-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00014-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00015-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00016-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00017-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:08 model-00018-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504280 Jun 24 17:08 model-00019-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:08 model-00020-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:08 model-00021-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:09 model-00022-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:09 model-00023-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:09 model-00024-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504280 Jun 24 17:09 model-00025-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00026-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00027-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00028-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00029-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:11 model-00030-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504280 Jun 24 17:11 model-00031-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:11 model-00032-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:11 model-00033-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:12 model-00034-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:12 model-00035-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:12 model-00036-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4999646264 Jun 24 17:12 model-00037-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4798417968 Jun 24 17:13 model-00038-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 1463862216 Jun 24 17:13 model-00039-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker      92659 Jun 24 17:13 model.safetensors.index.json
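
To confirm the 174 GiB download is complete, the shard list in model.safetensors.index.json can be checked against the files on disk. A small sketch, assuming the standard safetensors index layout (a "weight_map" from tensor name to shard file):

# verify that every shard referenced by the index was downloaded
import json
import os

ckpt = "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1"
with open(os.path.join(ckpt, "model.safetensors.index.json")) as f:
    index = json.load(f)
shards = sorted(set(index["weight_map"].values()))  # expect 39 files
missing = [s for s in shards if not os.path.exists(os.path.join(ckpt, s))]
print(len(shards), "shards referenced,", len(missing), "missing:", missing)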

maria-18-git commented 5 months ago

7. Download dataset

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference$ mkdir dataset
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference$ chmod 775 dataset
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference$ cd dataset/
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset$ sudo -v ; curl https://rclone.org/install.sh | sudo bash
Enter password for mmirkina (QUALPASS):
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4734  100  4734    0     0   6199      0 --:--:-- --:--:-- --:--:--  6196

The latest version of rclone rclone v1.67.0 is already installed.

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset$ rclone copyurl https://inference.mlcommons-storage.org/mixtral_8x7b%2F2024.06.06_mixtral_15k_v4.pkl ./ -a -P
Transferred:       68.439 MiB / 68.439 MiB, 100%, 46.237 MiB/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed time:         1.8s

We don't need the calibration dataset for accuracy or performance runs.
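
The downloaded .pkl is a pickled pandas DataFrame (as the DEBUG output further down shows: 15000 rows x 12 columns). A quick inspection sketch to confirm the 5000/5000/5000 split across GSM8K, Open Orca and MBXP:

# inspect the dataset pickle
import pandas as pd

df = pd.read_pickle("/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset/2024.06.06_mixtral_15k_v4.pkl")
print(df.shape)                      # expect (15000, 12)
print(df["dataset"].value_counts())  # expect 5000 samples per task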

maria-18-git commented 5 months ago

8. Run performance

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ time python3 -u main.py --scenario Offline   --mlperf-conf mlperf.conf --user-conf user.conf --total-sample-count 15 --model-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --dataset-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset/2024.06.06_mixtral_15k_v4.pkl --output-log-dir offline-logs --dtype float32 --device cuda:0 2>&1 | tee offline_performance_log.log
Traceback (most recent call last):
  File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/main.py", line 168, in <module>
    main()
  File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/main.py", line 135, in main
    sut = sut_cls(
  File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/SUT.py", line 152, in __init__
    self.data_object = Dataset(self.model_path,
  File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/dataset.py", line 31, in __init__
    self.load_tokenizer()
  File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/dataset.py", line 39, in load_tokenizer
    self.tokenizer = AutoTokenizer.from_pretrained(
  File "/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 902, in from_pretrained
    return tokenizer_class_py.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2094, in from_pretrained
    raise EnvironmentError(
OSError: Can't load tokenizer for '/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure '/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/' is the correct path to a directory containing all relevant files for a LlamaTokenizer tokenizer.

real    0m6.223s
user    0m3.351s
sys     0m9.660s
maria-18-git commented 5 months ago

The cause of this issue is that the downloaded model checkpoint is missing the tokenizer files:

tokenizer.json
tokenizer.model
tokenizer_config.json

These files are available at https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1/tree/main
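
Instead of the manual download via Windows below, the three files can also be fetched directly on the server with huggingface_hub; a hedged sketch, assuming the package is installed and the account has access to the (possibly gated) repo:

# download the missing tokenizer files straight into the checkpoint directory
from huggingface_hub import hf_hub_download

ckpt = "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1"
for fname in ("tokenizer.json", "tokenizer.model", "tokenizer_config.json"):
    hf_hub_download(repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",
                    filename=fname,
                    local_dir=ckpt)  # pass token="hf_..." if the repo is gated for you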

maria-18-git commented 5 months ago

So log in to Hugging Face (https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1/tree/main), download these files to a Windows machine, then open cmd and copy them to apollo using scp:

C:\Users\mmirkina\Downloads>scp tokenizer* mmirkina@aus655-apollo-0:
...
tokenizer.json                                                                        100% 1753KB   1.6MB/s   00:01
tokenizer.model                                                                       100%  482KB   3.4MB/s   00:00
tokenizer_config.json                                                                 100% 1466    12.4KB/s   00:00
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1$ cp /usr2/mmirkina/token* ./
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1$ ls -la
total 182435448
drwxr-xr-x 2 mmirkina docker       4096 Jun 27 17:41 .
drwxr-xr-x 4 mmirkina docker       4096 Jun 27 17:20 ..
-rw-r--r-- 1 mmirkina docker        803 Jun 24 17:04 config.json
-rw-r--r-- 1 mmirkina docker        111 Jun 24 17:04 generation_config.json
-rw-r--r-- 1 mmirkina docker 4920052720 Jun 24 17:04 model-00001-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00002-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00003-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00004-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:04 model-00005-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504264 Jun 24 17:05 model-00006-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559912 Jun 24 17:05 model-00007-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:05 model-00008-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:05 model-00009-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:06 model-00010-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559920 Jun 24 17:06 model-00011-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4999646240 Jun 24 17:06 model-00012-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4798417968 Jun 24 17:06 model-00013-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00014-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00015-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00016-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:07 model-00017-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:08 model-00018-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504280 Jun 24 17:08 model-00019-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:08 model-00020-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:08 model-00021-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:09 model-00022-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:09 model-00023-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:09 model-00024-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504280 Jun 24 17:09 model-00025-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00026-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00027-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00028-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:10 model-00029-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:11 model-00030-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4932504280 Jun 24 17:11 model-00031-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:11 model-00032-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:11 model-00033-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:12 model-00034-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:12 model-00035-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4865559944 Jun 24 17:12 model-00036-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4999646264 Jun 24 17:12 model-00037-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 4798417968 Jun 24 17:13 model-00038-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker 1463862216 Jun 24 17:13 model-00039-of-00039.safetensors
-rw-r--r-- 1 mmirkina docker      92659 Jun 24 17:13 model.safetensors.index.json
-rw-r--r-- 1 mmirkina docker       1466 Jun 27 17:41 tokenizer_config.json
-rw-r--r-- 1 mmirkina docker    1795303 Jun 27 17:41 tokenizer.json
-rw-r--r-- 1 mmirkina docker     493443 Jun 27 17:41 tokenizer.model
maria-18-git commented 5 months ago

Run performance again, with the tokenizer files in place:

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ time python3 -u main.py --scenario Offline   --mlperf-conf mlperf.conf --user-conf user.conf --total-sample-count 15 --model-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --dataset-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset/2024.06.06_mixtral_15k_v4.pkl --output-log-dir offline-logs --dtype float32 --device cuda:0 2>&1 | tee offline_performance_log.log
...
Loading dataset...
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Finished loading dataset.
Loading checkpoint shards: 100%|██████████| 39/39 [00:57<00:00,  1.47s/it]
Loaded model
Loaded tokenizer
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Starting Benchmark run
IssueQuery started with 15000 samples
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:563: UserWarning: `num_beams` is set to 1. However, `early_stopping` is set to `True` -- this flag
is only used in beam-based generation modes. You should set `num_beams>1` or unset `early_stopping`.
  warnings.warn(
IssueQuery done

Saving outputs to run_outputs/q13.pkl
Samples run: 1
        BatchMaker time: 0.031223773956298828
        Inference time: 139.80554151535034
        Postprocess time: 0.0006673336029052734
        ==== Total time: 139.83743262290955

Saving outputs to run_outputs/q11.pkl
Samples run: 2
        BatchMaker time: 0.00020241737365722656
        Inference time: 338.96012592315674
        Postprocess time: 0.000946044921875
        ==== Total time: 338.96127438545227
Saving outputs to run_outputs/q10.pkl
Samples run: 3
        BatchMaker time: 0.00020623207092285156
        Inference time: 378.08090806007385
        Postprocess time: 0.0004837512969970703
        ==== Total time: 378.0815980434418
Saving outputs to run_outputs/q7.pkl
Samples run: 4
        BatchMaker time: 0.0001933574676513672
        Inference time: 142.67389917373657
        Postprocess time: 0.0005340576171875
        ==== Total time: 142.6746265888214
Saving outputs to run_outputs/q5.pkl
...
Samples run: 116
        BatchMaker time: 0.0001952648162841797
        Inference time: 138.5476894378662
        Postprocess time: 0.0007069110870361328
        ==== Total time: 138.54859161376953
Saving outputs to run_outputs/q3.pkl
Samples run: 117
        BatchMaker time: 0.00021195411682128906
        Inference time: 255.1372139453888
        Postprocess time: 0.0006382465362548828
        ==== Total time: 255.13806414604187
^C
^C
^C
^C

real    587m51.182s
user    585m56.346s

The run was stopped because the full experiment takes a very long time. Setting --total-sample-count 15 as an input parameter did not affect the number of samples in performance mode; see the comment in main.py: https://github.com/mlcommons/inference/blob/master/language/mixtral-8x7b/main.py#L67

maria-18-git commented 5 months ago

Accuracy

Setting --total-sample-count works correctly for accuracy experiments.

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ OUTPUT_LOG_DIR=offline-accuracy-logs
mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ mkdir -p "run_outputs"

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ time python3 -u main.py --scenario Offline --model-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --accuracy --mlperf-conf mlperf.conf --user-conf user.conf --total-sample-count 10 --dataset-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset/2024.06.06_mixtral_15k_v4.pkl --output-log-dir ${OUTPUT_LOG_DIR} --device cuda:0
WARNING:Mixtral-8x7B-Instruct-v0.1-MAIN:Accuracy run will generate the accuracy logs, but the evaluation of the log is not completed yet
Loading dataset...
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
Finished loading dataset.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 39/39 [00:19<00:00,  2.00it/s]
Loaded model
Loaded tokenizer
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Starting Benchmark run
IssueQuery started with 10 samples
IssueQuery done
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:563: UserWarning: `num_beams` is set to 1. However, `early_stopping` is set to `True` -- this flag
is only used in beam-based generation modes. You should set `num_beams>1` or unset `early_stopping`.
  warnings.warn(
.Saving outputs to run_outputs/q8.pkl
Samples run: 1
        BatchMaker time: 0.0005521774291992188
        Inference time: 760.4167733192444
        Postprocess time: 0.0010259151458740234
        ==== Total time: 760.4183514118195
Saving outputs to run_outputs/q9.pkl
Samples run: 2
        BatchMaker time: 0.0002186298370361328
        Inference time: 255.86129760742188
        Postprocess time: 0.0005745887756347656
        ==== Total time: 255.86209082603455
Saving outputs to run_outputs/q7.pkl
Samples run: 3
        BatchMaker time: 0.0002162456512451172
        Inference time: 143.11561179161072
        Postprocess time: 0.0005106925964355469
        ==== Total time: 143.1163387298584
Saving outputs to run_outputs/q6.pkl
Samples run: 4
        BatchMaker time: 0.00020933151245117188
        Inference time: 179.33479189872742
        Postprocess time: 0.0005283355712890625
        ==== Total time: 179.33552956581116
Saving outputs to run_outputs/q0.pkl
Samples run: 5
        BatchMaker time: 0.00020623207092285156
        Inference time: 302.74939727783203
        Postprocess time: 0.0005166530609130859
        ==== Total time: 302.75012016296387
Saving outputs to run_outputs/q1.pkl
Samples run: 6
        BatchMaker time: 0.00020122528076171875
        Inference time: 176.75450086593628
        Postprocess time: 0.0005507469177246094
        ==== Total time: 176.75525283813477
Saving outputs to run_outputs/q4.pkl
Samples run: 7
        BatchMaker time: 0.0002200603485107422
        Inference time: 138.0684790611267
        Postprocess time: 0.0005571842193603516
        ==== Total time: 138.06925630569458
Saving outputs to run_outputs/q5.pkl
Samples run: 8
        BatchMaker time: 0.00020170211791992188
        Inference time: 154.93840098381042
        Postprocess time: 0.0005500316619873047
        ==== Total time: 154.93915271759033
Saving outputs to run_outputs/q3.pkl
Samples run: 9
        BatchMaker time: 0.00019693374633789062
        Inference time: 254.17931580543518
        Postprocess time: 0.0005474090576171875
        ==== Total time: 254.18006014823914
Saving outputs to run_outputs/q2.pkl
Samples run: 10
        BatchMaker time: 0.0002086162567138672
        Inference time: 340.03823614120483
        Postprocess time: 0.0005693435668945312
        ==== Total time: 340.03901410102844

No warnings encountered during test.

No errors encountered during test.
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Run Completed!
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Destroying SUT...
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Destroying QSL...

real    45m41.087s
user    45m36.972s
sys     0m37.635s
maria-18-git commented 5 months ago

But we hit an issue when running evaluate-accuracy.py to compute the accuracy (some debug printing was added):

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ time python3 evaluate-accuracy.py --checkpoint-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --mlperf-accuracy-file ${ACCURACY_LOG_FILE} --dataset-file /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset/2024.06.06_mixtral_15k_v4.pkl --dtype int32
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
[nltk_data] Downloading package punkt to
[nltk_data]     /local/mnt/workspace/mmirkina/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
DEBUG: data =        dataset                            id                                           question                                              input  ... stop_sequence       tok_stop_sequence tok_input_len tok_ref_output_len
0       GSM8K                     train.548  Gary manages two Amazon distribution centers. ...  <s> [INST] As an expert problem solver solve s...  ...          </s>                     [2]           657                174
1       GSM8K                    train.6592  The square footage of the two bedrooms in the ...  <s> [INST] As an expert problem solver solve s...  ...          </s>                     [2]           657                118
2       GSM8K                    train.6644  Thomas, Toby, and Rebecca worked a total of 15...  <s> [INST] As an expert problem solver solve s...  ...          </s>                     [2]           662                224
3       GSM8K                    train.3596  Two-thirds of the class have brown eyes. Half ...  <s> [INST] As an expert problem solver solve s...  ...          </s>                     [2]           648                 96
4       GSM8K                    train.5034  Jackie spends 8 hours working, 3 hours of exer...  <s> [INST] As an expert problem solver solve s...  ...          </s>                     [2]           634                 75
...       ...                           ...                                                ...                                                ...  ...           ...                     ...           ...                ...
14995    MBXP  javascript_sumDigitsTwoparts  /**\n * * Write a function to divide a number ...  <s> [INST] Complete the following code. Be con...  ...       \n```\n  [13, 13940, 28832, 13]           137                284
14996    MBXP   javascript_palindromeLambda  /**\n * * Write a function to find palindromes...  <s> [INST] Complete the following code. Be con...  ...       \n```\n  [13, 13940, 28832, 13]           192                 38
14997    MBXP       javascript_removeTuples  /**\n * * Write a function to remove all the t...  <s> [INST] Complete the following code. Be con...  ...       \n```\n  [13, 13940, 28832, 13]           282                 35
14998    MBXP             javascript_posNos  /**\n * * Write a JavaScript function to print...  <s> [INST] Complete the following code. Be con...  ...       \n```\n  [13, 13940, 28832, 13]           142                 31
14999    MBXP        javascript_tupleToDict  /**\n * * Write a function to convert the give...  <s> [INST] Complete the following code. Be con...  ...       \n```\n  [13, 13940, 28832, 13]           208                 75

[15000 rows x 12 columns]
DEBUG: results(mlperf_accuracy_file) =  [{'seq_id': 0, 'qsl_idx': 8, 'data': '610C00000000000046700000000000002970000000000000CF38000000000000100100000000000030100000000000002E010000000000000801000000000000D82C00000000000086010000000000003E0100000000000
03001000000000000100100000000000030100000000000002E0100000000000008010000000000004C170000000000002E01000000000000514100000000000086010000000000006F01000000000000337000000000000021700000000000000D000000000000000D00000000000000480D000000000000100100000000
00008C0A0000000000003570000000000000DE010000000000006903000000000000710100000000000021700000000000004E700000000000003F7000000000000088020000000000002170000000000000627000000000000051700000000000004701000000000000217000000000000044700000000000004E7000000
00000003E70000000000000337000000000000021700000000000000D000000000000000D00000000000000351D00000000000070010000000000009D1000000000000020020000000000002170000000000000627000000000000092310000000000002E0100000000000051410000000000003570000000000000700100
00000000008E020000000000008A4E0000000000001E0100000000000021700000000000004E700000000000006E7000000000000097700000000000002E01000000000000FF020000000000007001000000000000362A00000000000013170000000000006201000000000000C2020000000000003370000000000000530
3000000000000090B000000000000710100000000000021700000000000003E7000000000000033700000000000004E700000000000006E700000000000009E010000000000004070000000000000217000000000000062700000000000005170000000000000470100000000000021700000000000003E70000000000000
337000000000000073700000000000006E70000000000000517000000000000093010000000000008A4E0000000000001E01000000000000337000000000000021700000000000000D000000000000000D000000000000001B5E00000000000010010000000000001E0C000000000000E60D0000000000007001000000000
00013170000000000009301000000000000217000000000000044700000000000004E700000000000003E7000000000000035700000000000001001000000000000E60D000000000000700100000000000013170000000000006201000000000000100100000000000051410000000000002C06000000000000FA01000000
000000EE02000000000000217000000000000044700000000000004E700000000000003E70000000000000830100000000000021700000000000004E700000000000003F70000000000000337000000000000021700000000000000D000000000000000D00000000000000161400000000000035700000000000002170000
0000000003E70000000000000337000000000000073700000000000006E7000000000000051700000000000004701000000000000217000000000000044700000000000004E700000000000003E70000000000000830100000000000021700000000000004E700000000000003F7000000000000033700000000000002170
0000000000000D000000000000000D000000000000001F21000000000000DE01000000000000FA01000000000000DD03000000000000D5300000000000008B01000000000000DD03000000000000DD220000000000004B700000000000000D000000000000000D000000000000004E700000000000003F700000000000008
8020000000000002170000000000000627000000000000051700000000000004701000000000000217000000000000044700000000000004E700000000000003E700000000000000D000000000000003E70000000000000337000000000000073700000000000006E70000000000000517000000000000047010000000000
00217000000000000044700000000000004E700000000000003E70000000000000830100000000000021700000000000004E700000000000003F700000000000000D000000000000000D0000000000000014090000000000001D02000000000000112F000000000000C80100000000000033060000000000002E010000000
00000D5300000000000002A0100000000000014050000000000001001000000000000FD0B0000000000002E010000000000003E0100000000000030010000000000006F01000000000000337000000000000021700000000000000D000000000000000D00000000000000411D000000000000357000000000000042050000
000000004670000000000000297000000000000031260000000000007C010000000000003E01000000000000290100000000000010010000000000008C0600000000000049220000000000004B700000000000000D000000000000000D000000000000003F700000000000004701000000000000450100000000000044700
000000000004E700000000000003E70000000000000830100000000000021700000000000003E70000000000000337000000000000073700000000000006E7000000000000051700000000000003B70000000000000DC0200000000000021700000000000004E700000000000000D000000000000000D000000000000001F
210000000000003570000000000000435D000000000000C801000000000000961600000000000062010000000000003E010000000000000A0300000000000010010000000000008B0300000000000049220000000000004B700000000000000D000000000000000D000000000000004E70000000000000580700000000000
044700000000000004E700000000000003E70000000000000830100000000000021700000000000003E70000000000000337000000000000073700000000000006E7000000000000051700000000000003B70000000000000DC0200000000000021700000000000004E700000000000003B70000000000000880200000000
00002170000000000000627000000000000051700000000000004701000000000000217000000000000044700000000000004E700000000000003E700000000000000D000000000000000D00000000000000821D000000000000C4010000000000002706000000000000C80100000000000049220000000000004B7000000
00000000D000000000000000D0000000000000044700000000000004E700000000000003E70000000000000830100000000000021700000000000003E70000000000000337000000000000073700000000000006E700000000000005170000000000000880200000000000021700000000000006270000000000000517000
00000000004701000000000000217000000000000044700000000000004E700000000000003E700000000000000D000000000000000D000000000000005159000000000000D901000000000000E1020000000000008F0D0000000000004B700000000000000D000000000000000D000000000000004E70000000000000337
00000000000004E700000000000006E700000000000005170000000000000470100000000000021700000000000003E700000000000000D000000000000000D00000000000000C331000000000000230200000000000018060000000000002021000000000000E60100000000000021700000000000004E70000000000000
33700000000000004E700000000000006E700000000000004B700000000000000D000000000000000D000000000000005170000000000000470100000000000021700000000000003E700000000000000D000000000000000D000000000000001F210000000000003570000000000000435D0000000000006F01000000000
000470100000000000021700000000000003E700000000000000A030000000000001001000000000000492200000000000062010000000000003E010000000000004B700000000000000D000000000000000D000000000000003F700000000000004701000000000000450100000000000044700000000000004E70000000
0000003E70000000000000830100000000000021700000000000003E70000000000000337000000000000073700000000000006E700000000000009E01000000000000407000000000000021700000000000003E700000000000003B70000000000000DC0200000000000021700000000000004E700000000000000D00000
0000000000D00000000000000821D000000000000C4010000000000002706000000000000C80100000000000049220000000000004B700000000000000D000000000000000D000000000000003F700000000000004701000000000000217000000000000044700000000000004E700000000000003E70000000000000DC02
00000000000021700000000000004E700000000000000D000000000000000D00000000000000821D000000000000C4010000000000002706000000000000C80100000000000049220000000000004B700000000000000D000000000000000D000000000000003F70000000000000470100000000000021700000000000007
0700000000000003E700000000000000D000000000000000D0000000000000016140000000000003570000000000000100100000000000030100000000000002E010000000000000801000000000000D82C0000000000005D010000000000003E010000000000004701000000000000217000000000000070700000000000
003E7000000000000033700000000000009F0100000000000014110000000000005D01000000000000217000000000000070700000000000003E7000000000000033700000000000000200000000000000', 'token_count': 448}, {'seq_id': 1, 'qsl_idx': 9, 'data': '
...
DEBUG: query_type =  GSM8K
DEBUG: query_type =  GSM8K
DEBUG: query_type =  GSM8K
DEBUG: query_type =  GSM8K
DEBUG: query_type =  GSM8K
DEBUG: query_type =  GSM8K
DEBUG: query_type =  GSM8K
DEBUG: query_type =  GSM8K
DEBUG: query_type =  GSM8K
DEBUG: query_type =  GSM8K
DEBUG: preds_token_OpenOrca =  []
DEBUG: preds_decoded_text =  []
DEBUG: target_required_OpenOrca =  []
DEBUG: preds, targets =  [] []
DEBUG: model:  dict_items([('predictions', []), ('references', [])])
Traceback (most recent call last):
  File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/evaluate-accuracy.py", line 221, in <module>
    main()
  File "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b/evaluate-accuracy.py", line 182, in main
    result = metric.compute(
  File "/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/evaluate/module.py", line 432, in compute
    self.add_batch(**inputs)
  File "/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/evaluate/module.py", line 480, in add_batch
    self.selected_feature_format = self._infer_feature_from_batch(batch)
  File "/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/evaluate/module.py", line 552, in _infer_feature_from_batch
    example = dict([(k, v[0]) for k, v in batch.items()])
  File "/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/evaluate/module.py", line 552, in <listcomp>
    example = dict([(k, v[0]) for k, v in batch.items()])
IndexError: list index out of range

real    0m8.781s
user    0m5.032s
sys     0m9.812s

The cause of this issue is that evaluate-accuracy.py works over the full dataset, while the results come from a short run (--total-sample-count 10). The script expects results for all three dataset types (GSM8K, Open Orca, and MBXP), but we only have results for 10 GSM8K samples. The full dataset contains 5000 GSM8K samples, 5000 Open Orca samples, and 5000 MBXP samples.
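
Rather than commenting the Open Orca and MBXP code out by hand (as done in the next comment), a more general fix in evaluate-accuracy.py would be to skip a metric whose prediction list is empty. A hedged sketch around the metric.compute call from the traceback above (preds/targets are the names visible in the debug output; the exact compute arguments are an assumption):

# only compute ROUGE when the run actually produced Open Orca predictions
if preds:
    result = metric.compute(predictions=preds, references=targets,
                            use_stemmer=True, use_aggregator=False)
else:
    result = {}  # a 10-sample GSM8K-only run has no Open Orca outputs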

maria-18-git commented 5 months ago

If the code for the Open Orca and MBXP samples is commented out, we get:

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 evaluate-accuracy.py --checkpoint-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/  --mlperf-accuracy-file ${ACCURACY_LOG_FILE} --dataset-file /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset/2024.06.06_mixtral_15k_v4.pkl --dtype int32
...
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
[nltk_data] Downloading package punkt to
[nltk_data]     /local/mnt/workspace/mmirkina/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
DEBUG: data =        dataset                            id                                           question                                              input  ... stop_sequence       tok_stop_sequence tok_input_len tok_ref_output_len
0       GSM8K                     train.548  Gary manages two Amazon distribution centers. ...  <s> [INST] As an expert problem solver solve s...  ...          </s>                     [2]           657                174
1       GSM8K                    train.6592  The square footage of the two bedrooms in the ...  <s> [INST] As an expert problem solver solve s...  ...          </s>                     [2]           657                118
2       GSM8K                    train.6644  Thomas, Toby, and Rebecca worked a total of 15...  <s> [INST] As an expert problem solver solve s...  ...          </s>                     [2]           662                224
3       GSM8K                    train.3596  Two-thirds of the class have brown eyes. Half ...  <s> [INST] As an expert problem solver solve s...  ...          </s>                     [2]           648                 96
4       GSM8K                    train.5034  Jackie spends 8 hours working, 3 hours of exer...  <s> [INST] As an expert problem solver solve s...  ...          </s>                     [2]           634                 75
...       ...                           ...                                                ...                                                ...  ...           ...                     ...           ...                ...
14995    MBXP  javascript_sumDigitsTwoparts  /**\n * * Write a function to divide a number ...  <s> [INST] Complete the following code. Be con...  ...       \n```\n  [13, 13940, 28832, 13]           137                284
14996    MBXP   javascript_palindromeLambda  /**\n * * Write a function to find palindromes...  <s> [INST] Complete the following code. Be con...  ...       \n```\n  [13, 13940, 28832, 13]           192                 38
14997    MBXP       javascript_removeTuples  /**\n * * Write a function to remove all the t...  <s> [INST] Complete the following code. Be con...  ...       \n```\n  [13, 13940, 28832, 13]           282                 35
14998    MBXP             javascript_posNos  /**\n * * Write a JavaScript function to print...  <s> [INST] Complete the following code. Be con...  ...       \n```\n  [13, 13940, 28832, 13]           142                 31
14999    MBXP        javascript_tupleToDict  /**\n * * Write a function to convert the give...  <s> [INST] Complete the following code. Be con...  ...       \n```\n  [13, 13940, 28832, 13]           208                 75

[15000 rows x 12 columns]
...
DEBUG: query_type =  GSM8K
DEBUG 111
DEBUG: query_type =  GSM8K
DEBUG 111
DEBUG: query_type =  GSM8K
DEBUG 111
DEBUG: query_type =  GSM8K
DEBUG 111
DEBUG: query_type =  GSM8K
DEBUG 111
DEBUG: query_type =  GSM8K
DEBUG 111
DEBUG: query_type =  GSM8K
DEBUG 111
DEBUG: query_type =  GSM8K
DEBUG 111
DEBUG: query_type =  GSM8K
DEBUG 111
DEBUG: query_type =  GSM8K
DEBUG 111
DEBUG:  gsm8k_total =  10
DEBUG: tgt =  60.0
DEBUG: tgt =  36.0
DEBUG: tgt =  58.0
DEBUG: tgt =  4.0
DEBUG: tgt =  14000.0
DEBUG: tgt =  120.0
DEBUG: tgt =  5.0
DEBUG: tgt =  46.0
DEBUG: tgt =  18.0
DEBUG: tgt =  66.0
DEBUG:  correct =  7

Results

{'gsm8k': 70.0, 'gen_len': 0.0, 'gen_num': 10, 'gen_tok_len': 3174, 'tokens_per_sample': 317.4}
maria-18-git commented 5 months ago

Solution:

Create a dataset with 15 samples: 5 GSM8K, 5 Open Orca, and 5 MBXP.

mixtral_15.pkl is the name of this dataset file.
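
A sketch of how such a balanced subset can be built with pandas (the groupby/head approach is an assumption about how mixtral_15.pkl was actually produced; the 5/5/5 split is from above):

# build a 15-sample dataset: the first 5 samples of each task
import pandas as pd

full = pd.read_pickle("/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/dataset/2024.06.06_mixtral_15k_v4.pkl")
subset = full.groupby("dataset", sort=False).head(5).reset_index(drop=True)
assert len(subset) == 15
subset.to_pickle("mixtral_15.pkl")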

Accuracy

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ time python3 -u main.py --scenario Offline --model-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --accuracy --mlperf-conf mlperf.conf --user-conf user.conf --total-sample-count 15 --dataset-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/download_dataset_15_samples/mixtral_15.pkl --output-log-dir ${OUTPUT_LOG_DIR} --device cuda:0
...
Samples run: 10
        BatchMaker time: 0.0002014636993408203
        Inference time: 218.228013753891
        Postprocess time: 0.0005128383636474609
        ==== Total time: 218.22872805595398
Saving outputs to run_outputs/q4.pkl
Samples run: 11
        BatchMaker time: 0.00020122528076171875
        Inference time: 229.7228820323944
        Postprocess time: 0.0005888938903808594
        ==== Total time: 229.72367215156555
Saving outputs to run_outputs/q3.pkl
Samples run: 12
        BatchMaker time: 0.00020003318786621094
        Inference time: 423.8593044281006
        Postprocess time: 0.0005216598510742188
        ==== Total time: 423.8600261211395
Saving outputs to run_outputs/q9.pkl
Samples run: 13
        BatchMaker time: 0.00019216537475585938
        Inference time: 330.89380168914795
        Postprocess time: 0.0005433559417724609
        ==== Total time: 330.8945372104645
Saving outputs to run_outputs/q12.pkl
Samples run: 14
        BatchMaker time: 0.00019311904907226562
        Inference time: 266.23928022384644
        Postprocess time: 0.0004799365997314453
        ==== Total time: 266.23995327949524
Saving outputs to run_outputs/q0.pkl
Samples run: 15
        BatchMaker time: 0.00022220611572265625
        Inference time: 491.44899702072144
        Postprocess time: 0.000576019287109375
        ==== Total time: 491.44979524612427

No warnings encountered during test.

No errors encountered during test.
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Run Completed!
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Destroying SUT...
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Destroying QSL...

real    103m14.302s
user    102m36.134s
sys     1m15.891s

About 6.9 minutes per sample. The run stays on the CPU despite --device cuda:0 being set. To run on the GPU, we need to set --dtype float16 and add cuda as an allowed --device choice in main.py (https://github.com/mlcommons/inference/blob/master/language/mixtral-8x7b/main.py#L49).
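
For reference, the effect of those two flags on the model load is roughly the following transformers call (a sketch, not the exact SUT.py code):

# load Mixtral in fp16 spread across the available GPUs
import torch
from transformers import AutoModelForCausalLM

model_path = "/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1"
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # fp32 weights (~174 GiB) would not fit on two GPUs
    device_map="auto")          # needs accelerate; shards the model across both GPUs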

maria-18-git commented 5 months ago

Accuracy on GPU (all 2 GPUs on apollo):

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ time python3 -u main.py --scenario Offline --model-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --accuracy --mlperf-conf mlperf.conf --user-conf user.conf --total-sample-count 15 --dataset-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/download_dataset_15_samples/mixtral_15.pkl --output-log-dir ${OUTPUT_LOG_DIR} --device cuda --dtype float16
WARNING:Mixtral-8x7B-Instruct-v0.1-MAIN:Accuracy run will generate the accuracy logs, but the evaluation of the log is not completed yet
Loading dataset...
Finished loading dataset.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 39/39 [00:34<00:00,  1.12it/s]
Loaded model
Loaded tokenizer
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Starting Benchmark run
IssueQuery started with 15 samples
IssueQuery done
/local/mnt/workspace/mmirkina/.local/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:563: UserWarning: `num_beams` is set to 1. However, `early_stopping` is set to `True` -- this flag
is only used in beam-based generation modes. You should set `num_beams>1` or unset `early_stopping`.
  warnings.warn(
Saving outputs to run_outputs/q13.pkl
Samples run: 1
        BatchMaker time: 0.0005486011505126953
        Inference time: 11.902294874191284
        Postprocess time: 0.0007691383361816406
        ==== Total time: 11.903612613677979
...
Samples run: 14
        BatchMaker time: 0.00011944770812988281
        Inference time: 9.253939628601074
        Postprocess time: 0.00037169456481933594
        ==== Total time: 9.254430770874023
Saving outputs to run_outputs/q0.pkl
Samples run: 15
        BatchMaker time: 0.0001556873321533203
        Inference time: 16.992162942886353
        Postprocess time: 0.0005273818969726562
        ==== Total time: 16.99284601211548

No warnings encountered during test.

No errors encountered during test.
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Run Completed!
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Destroying SUT...
INFO:Mixtral-8x7B-Instruct-v0.1-MAIN:Destroying QSL...

real    4m24.852s
user    22m44.229s
sys     7m23.684s

15 samples took 4 min 24 sec. Evaluate the accuracy:

mmirkina@aus655-apollo-0:/local/mnt/workspace/mmirkina/mixtral_8x7b_reference/inference/language/mixtral-8x7b$ python3 evaluate-accuracy.py --checkpoint-path /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/downloaded_model_checkpoint_270624/mixtral-8x7b-instruct-v0.1/ --mlperf-accuracy-file ${ACCURACY_LOG_FILE} --dataset-file /local/mnt/workspace/mmirkina/mixtral_8x7b_reference/download_dataset_15_samples/mixtral_15.pkl --dtype int32
...
Results

{'rouge1': 51.8093, 'rouge2': 23.1958, 'rougeL': 31.7219, 'rougeLsum': 48.2656, 'gsm8k': 80.0, 'mbxp': 20.0, 'gen_len': 4271, 'gen_num': 15, 'gen_tok_len': 4560, 'tokens_per_sample': 304.0}
maria-18-git commented 5 months ago

To check GPU utilization while the benchmark runs, use nvtop:

sudo apt install nvtop
nvtop