ggerganov / ggml

Tensor library for machine learning
MIT License

Use custom GPT-J checkpoint #488

Open mariecwhite opened 1 year ago

mariecwhite commented 1 year ago

I would like to run the ggml gpt-j example on the MLPerf benchmark. Is it possible to use one of the fine-tuned GPT-J checkpoints listed here: https://github.com/mlcommons/inference/blob/master/language/gpt-j/README.md#download-gpt-j-model? The pre-trained version used in MLPerf is EleutherAI/gpt-j-6B, which is the same one used in ggml.

maxng07 commented 1 year ago

Hi, have you tried this out on your end? I was looking at doing a benchmarking test last week after I converted and quantized the 7B Code Llama for accuracy, so your post definitely interests me.

What I did is download the PyTorch checkpoint, which is approx 24G, from the link you shared. I used one of the older convert-h5-to-ggml.py scripts (https://github.com/ggerganov/ggml/blob/master/examples/gpt-j/convert-h5-to-ggml.py); the newer GGUF convert tool produces an int error. I was able to run it without problems. However, because my machine only has 16G of RAM, I ran out of memory and the job was killed. I'm pretty positive that it will work if you have a machine with more than 24G of RAM.
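For reference, the conversion call is just the script followed by the checkpoint directory and an ftype flag (I believe 1 = f16 and 0 = f32 in that script, but double-check; the paths below are just examples):

```sh
# grab the ggml repo to get the older GPT-J conversion script
git clone https://github.com/ggerganov/ggml
cd ggml

# convert the downloaded HF checkpoint directory to ggml format
# usage: convert-h5-to-ggml.py <dir-model> <ftype>
python3 examples/gpt-j/convert-h5-to-ggml.py /path/to/checkpoint-final 1
```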

Please keep me posted on the status of the conversion and of the MLPerf testing. I'm interested in MLPerf too, as I am going to run something similar. Either post back here or DM me privately.

See below:

root@master:~/oldllama.cpp/models/gpt-j/checkpoint-final# python3 convert.py ./ 1
Loading checkpoint shards:  33%|██████████████████████████ | 1/3 [00:24<00:49, 24.94s/it]
Killed

root@master:~/oldllama.cpp/models/gpt-j/checkpoint-final# ls -l
total 23759020
-rw-r--r-- 1 root root        3110 Jul 20 22:11 README.md
-rw-r--r-- 1 root root        4346 Jun 30 22:06 added_tokens.json
-rw-r--r-- 1 root root        1000 Jun 30 22:06 config.json
-rw-r--r-- 1 root root        5509 Sep  6 08:11 convert.py
-rw-r--r-- 1 root root         124 Jun 30 22:06 generation_config.json
-rw-r--r-- 1 root root      456318 Jun 30 22:06 merges.txt
-rw-r--r-- 1 root root 10004248818 Jun 30 22:22 pytorch_model-00001-of-00003.bin
-rw-r--r-- 1 root root  9983934481 Jun 30 22:22 pytorch_model-00002-of-00003.bin
-rw-r--r-- 1 root root  4332935279 Jun 30 22:14 pytorch_model-00003-of-00003.bin
-rw-r--r-- 1 root root       25834 Jun 30 22:06 pytorch_model.bin.index.json
-rw-r--r-- 1 root root         462 Jun 30 22:06 special_tokens_map.json
-rw-r--r-- 1 root root         810 Jun 30 22:06 tokenizer_config.json
-rw-r--r-- 1 root root     6571300 Jun 30 22:06 trainer_state.json
-rw-r--r-- 1 root root      999186 Jun 30 22:06 vocab.json

root@master:~/oldllama.cpp/models/gpt-j/checkpoint-final# python3 convert.py ./ 1 pytorch_model-00001-of-00003.bin
Loading checkpoint shards:  33%|██████████████████████████ | 1/3 [00:20<00:41, 20.91s/it]
Killed
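One thing that might get past the "Killed" on a 16G machine is adding temporary swap before running the conversion; loading the shards will be slow, but it should be able to finish. A rough sketch (32G is an arbitrary size):

```sh
# create a temporary swap file so loading the ~24G of shards isn't OOM-killed
fallocate -l 32G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile

# re-run the conversion, then clean up
python3 convert.py ./ 1
swapoff /swapfile
rm /swapfile
```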