However, the command fails because the GPT-2 vocab file is missing:
Traceback (most recent call last):
  File "generate.py", line 104, in <module>
    main()
  File "generate.py", line 33, in main
    model, neox_args = setup_for_inference_or_eval(use_cache=True)
  File "/home/m/polycoder/gpt-neox/megatron/utils.py", line 424, in setup_for_inference_or_eval
    neox_args.build_tokenizer()
  File "/home/m/polycoder/gpt-neox/megatron/neox_arguments/arguments.py", line 121, in build_tokenizer
    self.tokenizer = build_tokenizer(self)
  File "/home/m/polycoder/gpt-neox/megatron/tokenizer/tokenizer.py", line 40, in build_tokenizer
    tokenizer = _GPT2BPETokenizer(args.vocab_file, args.merge_file)
  File "/home/m/polycoder/gpt-neox/megatron/tokenizer/tokenizer.py", line 154, in __init__
    self.tokenizer = GPT2Tokenizer(
  File "/home/m/polycoder/gpt-neox/megatron/tokenizer/gpt2_tokenization.py", line 188, in __init__
    self.encoder = json.load(open(vocab_file))
FileNotFoundError: [Errno 2] No such file or directory: 'data/gpt2-vocab.json'
I'm not using Docker but have installed your fork of gpt-neox myself.
Where can I find the missing vocab file? -- Thanks in advance!
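
In case it helps with reproducing this, here is a minimal sketch of a check for the file named in the traceback. It assumes the relative path is resolved against the current working directory (which is what the `open(vocab_file)` call in the traceback suggests); `missing_files` is a hypothetical helper name of my own, not part of gpt-neox.

```python
import os


def missing_files(paths, base_dir="."):
    """Return the entries of `paths` that are not regular files under `base_dir`."""
    return [p for p in paths if not os.path.isfile(os.path.join(base_dir, p))]


if __name__ == "__main__":
    # 'data/gpt2-vocab.json' is copied from the FileNotFoundError above.
    for path in missing_files(["data/gpt2-vocab.json"]):
        # Print the absolute path so it is clear where the file was expected.
        print(f"missing: {os.path.abspath(path)}")
```

Running this from the same directory as `generate.py` shows the absolute location where the vocab file is expected.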