Hi, I am trying to test the PR submitted for Odex and Conala task support. The repository is here. I am able to run the bigcode-evaluation-harness code for inference successfully, but the same setup throws an error when I run the PR code.
Here is the accelerate config used:
$ accelerate config
--------------------------------------------------------------------
In which compute environment are you running?
This machine
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Which type of machine are you using?
multi-GPU
How many different machines will you use (use more than 1 for multi-node training)? [1]: 1
Do you wish to optimize your script with torch dynamo?[yes/NO]:NO
Do you want to use DeepSpeed? [yes/NO]: NO
Do you want to use FullyShardedDataParallel? [yes/NO]: NO
Do you want to use Megatron-LM ? [yes/NO]: NO
How many GPU(s) should be used for distributed training? [1]:1
What GPU(s) (by id) should be used for training on this machine as a comma-seperated list? [all]:all
--------------------------------------------------------------------
Do you wish to use FP16 or BF16 (mixed precision)?
bf16
The following values were not passed to `accelerate launch` and had defaults used instead:
`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
Traceback (most recent call last):
File "main.py", line 10, in <module>
from lm_eval.evaluator import Evaluator
File "/home/rudra/bigcode-evaluation-harness/lm_eval/evaluator.py", line 5, in <module>
from lm_eval import tasks
File "/home/rudra/bigcode-evaluation-harness/lm_eval/tasks/__init__.py", line 3, in <module>
from . import apps, codexglue_code_to_text, conala, concode, humaneval, mbpp, codexglue_text_to_text, odex, mconala
File "/home/rudra/bigcode-evaluation-harness/lm_eval/tasks/codexglue_code_to_text.py", line 56, in <module>
def compute_codexglue_code_to_text_bleu(gold_and_predicted_items: list[tuple[str, str]]):
TypeError: 'type' object is not subscriptable
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 2237438) of binary: /home/rudra/.cache/A100/bin/python
Traceback (most recent call last):
File "/home/rudra/.cache/A100/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/home/rudra/.cache/A100/lib/python3.8/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
args.func(args)
File "/home/rudra/.cache/A100/lib/python3.8/site-packages/accelerate/commands/launch.py", line 906, in launch_command
multi_gpu_launcher(args)
File "/home/rudra/.cache/A100/lib/python3.8/site-packages/accelerate/commands/launch.py", line 599, in multi_gpu_launcher
distrib_run.run(args)
File "/home/rudra/.cache/A100/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/home/rudra/.cache/A100/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/rudra/.cache/A100/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
main.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2023-03-23_02:34:25
host : cccxc578.pok.ibm.com
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 2237438)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
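For context, I suspect the failure is unrelated to accelerate: `list[tuple[str, str]]` uses PEP 585 builtin generics, which are only subscriptable at runtime from Python 3.9 onward, while my environment (per the traceback paths) is Python 3.8, so the annotation raises `TypeError` at import time. A minimal sketch of two version-portable alternatives, keeping the function name from the traceback (the body here is just a placeholder, not the PR's actual BLEU implementation):

```python
# Option 1: defer annotation evaluation so builtin generics are never
# subscripted at runtime (works on Python 3.7+).
from __future__ import annotations

# Option 2: use typing generics, which are subscriptable on Python 3.8.
from typing import List, Tuple


def compute_codexglue_code_to_text_bleu(
    gold_and_predicted_items: List[Tuple[str, str]],
):
    # Placeholder body for illustration only; the real implementation
    # lives in lm_eval/tasks/codexglue_code_to_text.py in the PR.
    return len(gold_and_predicted_items)


print(compute_codexglue_code_to_text_bleu([("ref", "hyp")]))  # -> 1
```

Either change alone is enough to make the module importable on Python 3.8; the `typing.List`/`typing.Tuple` form is the safer choice if the harness still supports 3.7/3.8.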