Closed xhluca closed 2 years ago
@hunterlang have you run into this issue?
I just ran md5sum on the files and got this:
$ md5sum gpt2*
75a37753dd7a28a2c5df80c28bf06e4e gpt2-merges.txt
d9f1a1235d1390093ade7a7e05d11190 gpt2-vocab.json
However, it wasn't what I expected: https://github.com/facebookresearch/metaseq/issues/81#issuecomment-1122617595
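For anyone who wants to script this comparison, here is a minimal Python sketch of the same check. The helper names `md5_of_file` and `find_mismatches` are made up for illustration, and the reference digests are deliberately not filled in: take them from the linked comment rather than from this sketch.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(expected, directory="."):
    """Map each file that is missing or has the wrong digest to its actual digest.

    `expected` maps file names to reference hex digests. A missing file is
    reported with the value None.
    """
    mismatches = {}
    for name, want in expected.items():
        path = Path(directory) / name
        got = md5_of_file(path) if path.is_file() else None
        if got != want.lower():
            mismatches[name] = got
    return mismatches
```

Calling `find_mismatches({"gpt2-merges.txt": "<reference digest>", "gpt2-vocab.json": "<reference digest>"})` returns an empty dict when both files match, so a non-empty result means at least one file came from the wrong source.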
I obtained the gpt2-vocab.json file from here: https://github.com/huggingface/swift-coreml-transformers/blob/master/Resources/gpt2-vocab.json
So I'm guessing I should be getting it from somewhere else. Is there any reason why the gpt2-vocab.json file is not included in this repo? It would make the process easier.
Do not use the gpt2-vocab.json from there.
You can find the correct vocab and merges here: https://github.com/facebookresearch/metaseq/tree/main/projects/OPT/assets
Thank you @Skyy93 .
@stephenroller @hunterlang Would it be possible to edit #19's issue body with the correct link? This would save hours of debugging for those who find the gpt2-vocab.json link in #19.
EDIT: So I read the instructions again and it was indeed mentioned:
Note that the gpt2-merges.txt and gpt2-vocab.json files in projects/OPT/assets/ will need to be moved to the corresponding directories defined in the constants.py file.
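The move step quoted above could be scripted roughly like this. It is only a sketch: `copy_tokenizer_files` is a hypothetical helper, and `target_dir` stands for whatever directory your constants.py defines, which this thread does not show. It copies rather than moves, so the originals in the assets folder stay in place.

```python
import shutil
from pathlib import Path

# File names as given in the instructions quoted above.
TOKENIZER_FILES = ("gpt2-merges.txt", "gpt2-vocab.json")


def copy_tokenizer_files(assets_dir, target_dir):
    """Copy the tokenizer files from assets_dir into target_dir.

    target_dir is an assumption: pass the directory that your own
    constants.py points at. Returns the paths of the copied files.
    """
    assets_dir, target_dir = Path(assets_dir), Path(target_dir)
    missing = [n for n in TOKENIZER_FILES if not (assets_dir / n).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {assets_dir}: {missing}")
    target_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(assets_dir / n, target_dir / n)) for n in TOKENIZER_FILES]
```

For example, `copy_tokenizer_files("projects/OPT/assets", "<directory from constants.py>")` raises `FileNotFoundError` up front if either file is absent, instead of copying only one of them.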
I've created a PR that adds instructions on downloading them.
Thank you both: @Skyy93 for the community help, and @xhlulu for improving the documentation in response.
Following #19, I was able to download the correct files and tried to run OPT-2.7b. I tried the following command to boot the API:
However, I ran into the following problem when trying to run the API: