On running the samples I am getting the error below. I want to generate code context/documentation in simple language when given Java code. For that, is Code Llama or Llama the better choice?
(myenv) [10:52]:[mehparmar@py029:codellama-main]$ torchrun --nproc_per_node 1 example_infilling.py \
> --ckpt_dir CodeLlama-7b/ \
> --tokenizer_path CodeLlama-7b/tokenizer.model \
> --max_seq_len 192 --max_batch_size 4
[W socket.cpp:464] [c10d] The server socket cannot be initialized on [::]:29500 (errno: 97 - Address family not supported by protocol).
[W socket.cpp:697] [c10d] The client socket cannot be initialized to connect to [localhost]:29500 (errno: 97 - Address family not supported by protocol).
[W socket.cpp:697] [c10d] The client socket cannot be initialized to connect to [localhost]:29500 (errno: 97 - Address family not supported by protocol).
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Traceback (most recent call last):
File "example_infilling.py", line 79, in <module>
fire.Fire(main)
File "/home/mehparmar/.conda/envs/myenv/lib/python3.8/site-packages/fire/core.py", line 143, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/mehparmar/.conda/envs/myenv/lib/python3.8/site-packages/fire/core.py", line 477, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/home/mehparmar/.conda/envs/myenv/lib/python3.8/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "example_infilling.py", line 18, in main
generator = Llama.build(
File "/vol/etl_jupyterdata1/home/github/public/Sreeramm/codellama-main/llama/generation.py", line 97, in build
assert len(checkpoints) > 0, f"no checkpoint files found in {ckpt_dir}"
AssertionError: no checkpoint files found in CodeLlama-7b/
[2024-03-16 10:54:20,433] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 75378) of binary: /home/mehparmar/.conda/envs/myenv/bin/python
Traceback (most recent call last):
File "/home/mehparmar/.conda/envs/myenv/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/home/mehparmar/.conda/envs/myenv/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
return f(*args, **kwargs)
File "/home/mehparmar/.conda/envs/myenv/lib/python3.8/site-packages/torch/distributed/run.py", line 812, in main
run(args)
File "/home/mehparmar/.conda/envs/myenv/lib/python3.8/site-packages/torch/distributed/run.py", line 803, in run
elastic_launch(
File "/home/mehparmar/.conda/envs/myenv/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 135, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/mehparmar/.conda/envs/myenv/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 268, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
example_infilling.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-03-16_10:54:20
  host      : py029.lvs.abc.com
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 75378)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
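
From the traceback, the assertion in Llama.build fires because it finds no checkpoint files in CodeLlama-7b/. Here is a minimal check I can run before paying the torchrun startup cost (the file names are assumptions based on the standard Meta download layout of consolidated.00.pth, params.json, and tokenizer.model, not taken from the log):

```python
from pathlib import Path

# Pre-flight sketch: generation.py asserts that the checkpoint directory
# contains at least one checkpoint (collected with a *.pth glob in the
# version I have seen). Verify the expected files are actually present.
ckpt_dir = Path("CodeLlama-7b")

checkpoints = sorted(ckpt_dir.glob("*.pth"))
print(f"*.pth checkpoints found: {[p.name for p in checkpoints] or 'none'}")

# params.json and tokenizer.model are also loaded by Llama.build;
# their names here assume the standard download layout.
for required in ("params.json", "tokenizer.model"):
    status = "ok" if (ckpt_dir / required).is_file() else "MISSING"
    print(f"{required}: {status}")
```

If this reports no .pth files, the weights presumably still need to be fetched with the repo's download.sh (or --ckpt_dir pointed at the directory where the .pth files actually live).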
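To make the goal concrete, this is the kind of thing I want to do once the checkpoint loads: hand the model a Java method and get a plain-language explanation back. A sketch using the text_completion API from this repo's generation.py; the Java snippet, prompt wording, and generation parameters are just illustrative, and an -Instruct variant would presumably follow this kind of instruction better than the base model:

```python
from llama import Llama

# Illustrative sketch, not a tested recipe: build the generator the same
# way example_infilling.py does, then ask for a simple-language summary
# of a Java method. Paths and parameters are assumptions.
generator = Llama.build(
    ckpt_dir="CodeLlama-7b/",
    tokenizer_path="CodeLlama-7b/tokenizer.model",
    max_seq_len=512,
    max_batch_size=1,
)

java_code = """
public static int clamp(int value, int lo, int hi) {
    return Math.max(lo, Math.min(hi, value));
}
"""

prompt = (
    "Explain in simple language what the following Java method does:\n"
    f"{java_code}\nExplanation:"
)

results = generator.text_completion([prompt], temperature=0.2, max_gen_len=128)
print(results[0]["generation"])
```

This would still be launched under torchrun as above, since Llama.build initializes the distributed and model-parallel groups.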