UncompletedJobError: No output/error stream produced

I am running the CodeGen preprocessing in obfuscation mode on the test dataset (https://github.com/facebookresearch/CodeGen/tree/main/data/test_dataset) with the following command:

`run codegen_sources/preprocessing/preprocess.py data/python_test --mode obfuscation --local True --local_parallelism 4 --langs python --train_splits 1 --tokenization_timeout 400 --bpe_timeout 220 --train_bpe_timeout 400 --bpe_mode fast --fastbpe_use_vocab True --fastbpe_vocab_path data/bpe/cpp-java-python/vocab --fastbpe_code_path data/bpe/cpp-java-python/codes --keep_comments False --ncodes 4000 --percent_test_valid 2`

I am getting the following error:

`INFO - 05/04/22 15:56:33 - 0:00:00 - Dataset pipeline for /home/sushantk/anaconda3/codeGen/data/python_test

INFO - 05/04/22 15:56:33 - 0:00:00 - ========== Extract and Tokenize ===========
INFO - 05/04/22 15:56:33 - 0:00:00 - Using 4 processors.
INFO - 05/04/22 15:56:33 - 0:00:00 - python: tokenizing and extracting parallel functions in 1 json files ...
INFO - 05/04/22 15:56:33 - 0:00:00 - Number of lines to process: 50
WARNING - 05/04/22 15:56:33 - 0:00:01 - Error obfuscating content Missing parentheses in call to 'print'. Did you mean print('\nThe best BASE85 based alphabet for your setup is: %s' \)? (<unknown>, line 1673) 

WARNING - 05/04/22 15:56:33 - 0:00:01 - Error obfuscating content local variable 'mangledName' referenced before assignment 

WARNING - 05/04/22 15:56:33 - 0:00:01 - Error obfuscating content local variable 'mangledName' referenced before assignment 

WARNING - 05/04/22 15:56:33 - 0:00:01 - Error obfuscating content Missing parentheses in call to 'print'. Did you mean print("Press control+C to stop and show the summary")? (<unknown>, line 43) 

WARNING - 05/04/22 15:56:33 - 0:00:01 - Error obfuscating content local variable 'mangledName' referenced before assignment 

WARNING - 05/04/22 15:56:33 - 0:00:01 - Error obfuscating content local variable 'mangledName' referenced before assignment 

WARNING - 05/04/22 15:56:34 - 0:00:01 - Error obfuscating content Missing parentheses in call to 'print'. Did you mean print("permantly remove file ", file)? (<unknown>, line 374) 

WARNING - 05/04/22 15:56:34 - 0:00:01 - Error obfuscating content local variable 'mangledName' referenced before assignment 

WARNING - 05/04/22 15:56:34 - 0:00:01 - Error obfuscating content invalid syntax (<unknown>, line 426) 

WARNING - 05/04/22 15:56:34 - 0:00:01 - Error obfuscating content Missing parentheses in call to 'print'. Did you mean print("\nBEGIN - expecting GEOS_ERROR)? (<unknown>, line 135) 

WARNING - 05/04/22 15:56:34 - 0:00:01 - Error obfuscating content invalid syntax (<unknown>, line 92) 

WARNING - 05/04/22 15:56:34 - 0:00:01 - Error obfuscating content invalid syntax (<unknown>, line 62) 

100%|██████████| 50/50 [00:00<00:00, 3385.62it/s]
INFO - 05/04/22 15:56:34 - 0:00:01 - Time elapsed: 0.95
WARNING - 05/04/22 15:56:34 - 0:00:01 - Tokenization of /home/sushantk/anaconda3/codeGen/data/python_test/python.001 (1).json.gz:12 errors out of 50 lines(24.00%)
WARNING - 05/04/22 15:56:34 - 0:00:01 - Tokenization of /home/sushantk/anaconda3/codeGen/data/python_test/python.001 (1).json.gz:3 filtered examples in 50 lines(6.00%)

INFO - 05/04/22 15:56:34 - 0:00:01 - ========== Deduplicate and Split ===========
INFO - 05/04/22 15:56:34 - 0:00:02 - all files python.*[0-9].obfuscated.tok regrouped in /home/sushantk/anaconda3/codeGen/data/python_test/python.all.obfuscated.tok .
INFO - 05/04/22 15:56:34 - 0:00:02 - all files python.*[0-9].dictionary.tok regrouped in /home/sushantk/anaconda3/codeGen/data/python_test/python.all.dictionary.tok .
INFO - 05/04/22 15:56:34 - 0:00:02 - shuffling 2 files parallely: python.all.obfuscated.tok, python.all.dictionary.tok
INFO - 05/04/22 15:56:34 - 0:00:02 - python: Deduplication on 'obfuscated' and propagated on other suffixes.
INFO - 05/04/22 15:56:34 - 0:00:02 - python: Duplicated lines for obfuscated: 0 / 35
INFO - 05/04/22 15:56:34 - 0:00:02 - python: valid.obfuscated -> 0 lines
INFO - 05/04/22 15:56:35 - 0:00:02 - python: test.obfuscated -> 0 lines
INFO - 05/04/22 15:56:35 - 0:00:02 - python: train.obfuscated.0 -> 35 lines
INFO - 05/04/22 15:56:35 - 0:00:02 - python: Duplicated lines for dictionary: 0 / 35
INFO - 05/04/22 15:56:35 - 0:00:02 - python: valid.dictionary -> 0 lines
INFO - 05/04/22 15:56:35 - 0:00:02 - python: test.dictionary -> 0 lines
INFO - 05/04/22 15:56:35 - 0:00:02 - python: train.dictionary.0 -> 35 lines
INFO - 05/04/22 15:56:35 - 0:00:02 - Sucessfully regroup, deduplicate and split tokenized data into a train/valid/test sets.

INFO - 05/04/22 15:56:35 - 0:00:02 - ========== Learn BPE ===========
INFO - 05/04/22 15:56:35 - 0:00:02 - No need to train bpe codes, already trained. Codes: data/bpe/cpp-java-python/codes

INFO - 05/04/22 15:56:35 - 0:00:02 - ========== Apply BPE ===========
INFO - 05/04/22 15:56:35 - 0:00:02 - Applying BPE on /home/sushantk/anaconda3/codeGen/data/python_test/python.train.dictionary.0.tok ...
INFO - 05/04/22 15:56:35 - 0:00:02 - Applying BPE on /home/sushantk/anaconda3/codeGen/data/python_test/python.train.obfuscated.0.tok ...
WARNING - 05/04/22 15:56:35 - 0:00:02 - /home/sushantk/anaconda3/codeGen/data/python_test/python.valid.dictionary.tok is not a valid file, cannot to apply BPE on it.
WARNING - 05/04/22 15:56:35 - 0:00:02 - /home/sushantk/anaconda3/codeGen/data/python_test/python.valid.obfuscated.tok is not a valid file, cannot to apply BPE on it.
WARNING - 05/04/22 15:56:35 - 0:00:02 - /home/sushantk/anaconda3/codeGen/data/python_test/python.test.dictionary.tok is not a valid file, cannot to apply BPE on it.
WARNING - 05/04/22 15:56:35 - 0:00:02 - /home/sushantk/anaconda3/codeGen/data/python_test/python.test.obfuscated.tok is not a valid file, cannot to apply BPE on it.
---------------------------------------------------------------------------
UncompletedJobError                       Traceback (most recent call last)
~/anaconda3/codeGen/codegen_sources/preprocessing/preprocess.py in <module>()
    212     args.input_path = os.path.abspath(args.input_path)
    213     multiprocessing.set_start_method("fork")
--> 214     preprocess(args)

~/anaconda3/codeGen/codegen_sources/preprocessing/preprocess.py in preprocess(args)
    103 
    104     dataset.apply_bpe(
--> 105         executor=cluster_apply_bpe, local_parallelism=args.local_parallelism
    106     )
    107     dataset.get_vocab(executor=cluster_train_bpe)

~/anaconda3/codeGen/codegen_sources/preprocessing/dataset_modes/obfuscation_mode.py in apply_bpe(self, executor, local_parallelism)
    127         _bpe_ext = self.bpe.ext
    128         self.bpe.ext += TMP_EXT
--> 129         super().apply_bpe(executor)
    130         self.bpe.ext = _bpe_ext
    131         # restore BPE on obfuscation special tokens

~/anaconda3/codeGen/codegen_sources/preprocessing/dataset_modes/dataset_mode.py in apply_bpe(self, executor, local_parallelism)
    615                 jobs.append(job)
    616         for job in jobs:
--> 617             job.result()
    618         logger.info("BPE done.")
    619         # logger.info("Regrouping BPE")

~/anaconda3/envs/codeGen_env/lib/python3.6/site-packages/submitit/core/core.py in result(self)
    264 
    265     def result(self) -> R:
--> 266         r = self.results()
    267         assert not self._sub_jobs, "You should use `results()` if your job has subtasks."
    268         return r[0]

~/anaconda3/envs/codeGen_env/lib/python3.6/site-packages/submitit/core/core.py in results(self)
    287             return [tp.cast(R, sub_job.result()) for sub_job in self._sub_jobs]
    288 
--> 289         outcome, result = self._get_outcome_and_result()
    290         if outcome == "error":
    291             job_exception = self.exception()

~/anaconda3/envs/codeGen_env/lib/python3.6/site-packages/submitit/core/core.py in _get_outcome_and_result(self)
    382             else:
    383                 message.append(f"No output/error stream produced ! Check: {self.paths.stdout}")
--> 384             raise utils.UncompletedJobError("\n".join(message))
    385         try:
    386             output: tp.Tuple[str, tp.Any] = utils.pickle_load(self.paths.result_pickle)

UncompletedJobError: Job 18686 (task: 0) with path /home/sushantk/anaconda3/codeGen/data/python_test/log/18686_0_result.pkl
has not produced any output (state: FINISHED)
No output/error stream produced ! Check: /home/sushantk/anaconda3/codeGen/data/python_test/log/18686_0_log.out`
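
The last line of the traceback points at the job's log folder. A minimal sketch for dumping whatever job 18686 left there, assuming the log directory and the file naming shown in the traceback above (`dump_job_logs` is a name made up for illustration and is not part of CodeGen or submitit):

```python
from pathlib import Path


def dump_job_logs(log_dir, job_id):
    """Return {file name: text} for every file in log_dir that belongs to job_id.

    Hypothetical helper, not part of CodeGen or submitit. It relies only on
    the file naming visible in the traceback (e.g. 18686_0_log.out).
    """
    logs = {}
    for path in sorted(Path(log_dir).glob(f"{job_id}_*")):
        if path.suffix in {".out", ".err"}:
            # Text logs: read them, replacing any undecodable bytes.
            logs[path.name] = path.read_text(errors="replace")
        else:
            # Pickles and other binary files: report the size only.
            logs[path.name] = f"<{path.stat().st_size} bytes>"
    return logs


if __name__ == "__main__":
    # Path and job id copied from the traceback above.
    log_dir = "/home/sushantk/anaconda3/codeGen/data/python_test/log"
    for name, text in dump_job_logs(log_dir, 18686).items():
        print(f"===== {name} =====")
        print(text if text.strip() else "(empty)")
```

If the `.out` and `.err` files are missing or empty, that would match the "No output/error stream produced" message.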

When I open "python.test.dictionary.tok", "python.test.obfuscated.tok", "python.valid.dictionary.tok" and "python.valid.obfuscated.tok", they are all blank; nothing has been written to them.
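
A short check along these lines can show which split files are missing, empty, or populated. It is a sketch that assumes the data folder and the file names from the log above; `split_file_status` is a name made up for illustration.

```python
from pathlib import Path


def split_file_status(data_dir, names):
    """Classify each expected file as 'missing', 'empty' or '<N> lines'."""
    status = {}
    for name in names:
        path = Path(data_dir) / name
        if not path.is_file():
            status[name] = "missing"
        elif path.stat().st_size == 0:
            status[name] = "empty"
        else:
            # Count lines without loading the whole file into memory.
            with path.open(errors="replace") as handle:
                status[name] = f"{sum(1 for _ in handle)} lines"
    return status


if __name__ == "__main__":
    # Folder and file names taken from the log above.
    data_dir = "/home/sushantk/anaconda3/codeGen/data/python_test"
    names = [
        "python.train.dictionary.0.tok",
        "python.train.obfuscated.0.tok",
        "python.valid.dictionary.tok",
        "python.valid.obfuscated.tok",
        "python.test.dictionary.tok",
        "python.test.obfuscated.tok",
    ]
    for name, state in split_file_status(data_dir, names).items():
        print(f"{name}: {state}")
```

The "Deduplicate and Split" step in the log already reports 0 lines for the valid and test splits and 35 lines for `train.*.0`, so the check should agree with that.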

Can you tell me why this is happening?
