krai / axs2mlperf

Automated KRAI X workflows for reproducing MLPerf Inference submissions
https://krai.ai
MIT License

Add recipe for downloading llama2 model and dataset for training #60

Open maria-18-git opened 1 week ago

maria-18-git commented 1 week ago

Instruction from MLCommons: https://github.com/mlcommons/training/tree/master/llama2_70b_lora#download-data-and-model

maria-18-git commented 1 week ago

# 1. Getting access to the MLCommons repo for downloading the llama2 model (authenticate Rclone)

    y) Yes (default)
    n) No
    y/n> Y

    2024/09/12 11:44:59 NOTICE: If your browser doesn't open automatically go to the following link: http://127.0.0.1:53682/auth?state=WlWOINK50y1TlAAhRHP--g
    2024/09/12 11:44:59 NOTICE: Log in and authorize rclone for access
    2024/09/12 11:44:59 NOTICE: Waiting for code...
    2024/09/12 11:45:33 NOTICE: Got code
    channel 5: open failed: connect failed: Connection refused
    Configure this as a Shared Drive (Team Drive)?

    y) Yes
    n) No (default)
    y/n> N

While this is running, copy the link that rclone printed above:

http://127.0.0.1:53682/auth?state=WlWOINK50y1TlAAhRHP--g

into a browser on your laptop/local machine, enter your email, and authorize rclone.
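The link points at `127.0.0.1` on the machine where rclone is running, so a browser on a laptop can only reach it if that port is forwarded. The `channel 5: open failed` line in the log looks like output from such an SSH forward, but that is an assumption. A minimal sketch of one way to set it up, where `print_forward_cmd` and `user@remote-host` are hypothetical names used only for illustration:

```shell
# Hypothetical helper (not part of rclone or axs): print the ssh command that
# forwards rclone's local auth port from the remote machine to your laptop.
# The function body runs in a subshell so its variables do not leak.
print_forward_cmd() (
    remote="$1"            # placeholder, e.g. user@remote-host
    port="${2:-53682}"     # port taken from the rclone NOTICE line
    # -N: no remote command, -L: forward local port to 127.0.0.1:port on the remote
    printf 'ssh -N -L %s:127.0.0.1:%s %s\n' "$port" "$port" "$remote"
)

# Usage (run on the laptop, then open the rclone link in a local browser):
#   print_forward_cmd user@remote-host
```

Start the forward on the laptop before rclone reaches the "Waiting for code..." step, and stop it once authorization completes.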

# 2. Download the model without `axs` (to check access)
Run without `axs` (it is important to run with `sudo`):

    maria@xe9680-1:/mnt/llm_data/krai/maria$ sudo rclone copy mlc-llama2:Llama2-70b-fused-qkv-mlperf ./Llama2-70b-fused-qkv-mlperf -P
    [sudo] password for maria:
    Transferred: 266.828 GiB / 266.828 GiB, 100%, 41.095 MiB/s, ETA 0s
    Transferred: 166 / 166, 100%
    Elapsed time: 1h3m8.2s

Results:

    maria@xe9680-1:/mnt/llm_data/krai/maria$ ls -la Llama2-70b-fused-qkv-mlperf
    total 134720280
    drwxr-sr-x 3 root  krai       4096 Sep 12 12:27 .
    drwxr-sr-x 3 maria krai         49 Sep 12 11:54 ..
    -rw-r--r-- 1 root  krai        846 Apr  9 21:25 config.json
    -rw-r--r-- 1 root  krai       1208 Apr  9 21:25 convert.py
    -rw-r--r-- 1 root  krai        240 Apr  9 21:25 generation_config.json
    drwxr-sr-x 8 root  krai       4096 Apr  9 22:56 .git
    -rw-r--r-- 1 root  krai       1519 Apr  9 21:25 .gitattributes
    -rw-r--r-- 1 root  krai 4718659640 Apr  9 22:05 model-00001-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664166688 Apr  9 22:14 model-00002-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4966156576 Apr  9 22:07 model-00003-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4999711016 Apr  9 22:00 model-00004-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664133712 Apr  9 22:19 model-00005-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664166712 Apr  9 22:02 model-00006-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664166704 Apr  9 22:13 model-00007-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4966156592 Apr  9 21:53 model-00008-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4999711032 Apr  9 21:55 model-00009-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664133712 Apr  9 22:17 model-00010-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664166712 Apr  9 22:10 model-00011-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664166704 Apr  9 22:12 model-00012-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4966156592 Apr  9 21:37 model-00013-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4999711032 Apr  9 21:34 model-00014-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664133712 Apr  9 22:15 model-00015-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664166712 Apr  9 22:06 model-00016-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664166704 Apr  9 22:09 model-00017-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4966156592 Apr  9 22:06 model-00018-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4999711032 Apr  9 21:39 model-00019-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664133712 Apr  9 22:20 model-00020-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664166712 Apr  9 22:04 model-00021-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664166704 Apr  9 22:14 model-00022-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4966156592 Apr  9 22:08 model-00023-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4999711032 Apr  9 21:59 model-00024-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664133712 Apr  9 22:18 model-00025-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664166712 Apr  9 22:03 model-00026-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4664166704 Apr  9 22:11 model-00027-of-00029.safetensors
    -rw-r--r-- 1 root  krai 4966156592 Apr  9 21:50 model-00028-of-00029.safetensors
    -rw-r--r-- 1 root  krai 3812705984 Apr  9 22:16 model-00029-of-00029.safetensors
    -rw-r--r-- 1 root  krai      71606 Apr  9 21:25 modeling_llama.py
    -rw-r--r-- 1 root  krai      46515 Apr  9 21:25 model.safetensors.index.json
    -rw-r--r-- 1 root  krai         24 Apr  9 21:25 README.md
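Checking a listing like this by eye is error-prone, so a small script can confirm that every shard is present. This is only a sketch: `check_shards` is a hypothetical helper, and the `model-NNNNN-of-NNNNN.safetensors` naming pattern is taken from the listing above rather than from any MLCommons specification. It checks that the files exist, not that their contents are intact.

```shell
# Hypothetical helper: verify that all numbered safetensors shards exist.
# The function body runs in a subshell so its variables do not leak.
check_shards() (
    dir="$1"      # directory that holds the downloaded model
    total="$2"    # expected number of shards, e.g. 29
    missing=0
    i=1
    while [ "$i" -le "$total" ]; do
        # Build the expected file name, zero-padded to five digits
        f=$(printf '%s/model-%05d-of-%05d.safetensors' "$dir" "$i" "$total")
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            missing=$((missing + 1))
        fi
        i=$((i + 1))
    done
    # Succeed only if no shard is missing
    [ "$missing" -eq 0 ]
)

# Usage:
#   check_shards ./Llama2-70b-fused-qkv-mlperf 29 && echo "all shards present"
```

For a content-level comparison against the remote, `rclone check` with the same source and destination as the `rclone copy` command is the more thorough option.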