Closed dmivilensky closed 2 years ago
Hi Dmitry,
thank you for your interest in the Code Transformer and for reporting this issue.
As far as I can tell, this is caused by the multiprocessing employed in the script. At some point, the batch sent to a subprocess seems to be too large (Stackoverflow).
Essentially, the multiprocessing call in https://github.com/danielzuegner/code-transformer/blob/539742288747b3fe541575d0ee266e3c3587bfe8/code_transformer/experiments/preprocessing/preprocess-2.py#L298 forwards a batch object to self.preprocess(..) that is too large to pickle.
We never observed this error in our experiments, so I hypothesize that java-medium may contain some very large methods causing the generated ASTs in stage 1 to become huge.
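The bounds in the error message are those of a signed 32-bit integer. In the traceback from the original report, the failing call is struct.pack("!i", n) in Python 3.7's multiprocessing/connection.py, where n is the byte length of the message being sent. A minimal sketch of that limit follows (the helper name pack_length_header is hypothetical and only illustrates the packing step; it is not the library's code):

```python
import struct

INT32_MAX = 2**31 - 1  # 2147483647, the upper bound quoted in the error


def pack_length_header(n: int) -> bytes:
    """Pack a payload length as a big-endian signed 32-bit integer."""
    return struct.pack("!i", n)


# A length just under 2 GiB still fits into the 4-byte header.
header = pack_length_header(INT32_MAX)

# One byte more no longer fits, so struct.error is raised.
try:
    pack_length_header(INT32_MAX + 1)
    overflowed = False
except struct.error:
    overflowed = True
```

In other words, any single message of 2 GiB or more cannot be described by this header, which is consistent with the hypothesis that one batch (or its result) grew too large.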
I see several possible solutions here:
- Reduce batch_size for preprocessing, although this will probably slow down the execution.
- execute_parallel(..) in https://github.com/danielzuegner/code-transformer/blob/539742288747b3fe541575d0ee266e3c3587bfe8/code_transformer/experiments/preprocessing/preprocess-2.py#L298

Please let me know if you could resolve this issue.
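As a rough illustration of the smaller-batch idea, the sketch below caps each batch both by item count and by approximate serialized size. It is not the repository's code: iter_bounded_batches, pickled_size, and the byte budget are hypothetical names and values, and the size is measured with the standard pickle module, whereas joblib's loky backend may serialize objects differently. Note also that the reported traceback fails in _sendback_result, so the size of what a worker returns matters as well as the size of its input.

```python
import pickle
from typing import Any, Iterable, Iterator, List

# Largest length that fits the signed 32-bit header seen in the traceback.
MAX_MESSAGE_BYTES = 2**31 - 1


def pickled_size(obj: Any) -> int:
    """Number of bytes obj occupies when serialized with the stdlib pickle."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def iter_bounded_batches(
    items: Iterable[Any],
    max_items: int,
    max_bytes: int = MAX_MESSAGE_BYTES // 2,  # assumed safety margin
) -> Iterator[List[Any]]:
    """Yield batches capped by item count and by approximate pickled size."""
    batch: List[Any] = []
    batch_bytes = 0
    for item in items:
        item_bytes = pickled_size(item)
        # Close the current batch if adding this item would exceed either cap.
        too_many = len(batch) >= max_items
        too_big = batch_bytes + item_bytes > max_bytes
        if batch and (too_many or too_big):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(item)
        batch_bytes += item_bytes
    if batch:
        yield batch
```

A single item that alone exceeds max_bytes still ends up in a batch of its own, so very large samples would need separate handling.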
Hi!
Thank you for your advice; it helped us process the java-medium dataset successfully. Interestingly, I had tried smaller batch sizes and different Python versions separately, but neither helped on its own.
The working combination for me was a batch size of 10 with Python 3.8. I also used only 15 processes, as in the original configuration. In my first run I used a higher number, but for the final attempt I decided to use a verified number.
Thanks!
Hi Egor,
> The working combination for me was a batch size of 10 with Python 3.8. I also used only 15 processes, as in the original configuration. In my first run I used a higher number, but for the final attempt I decided to use a verified number.

thanks for reporting back what worked for you! I'm sure this will help others who have a similar problem.
Dear Authors,
Thank you very much for your work! I used the scripts for preprocessing the Java code2seq-type datasets so that I could later test the method name prediction model on the java-medium dataset, but I ran into a strange error. Everything was fine with the stage 1 scripts, but after I ran the stage 2 script, in particular with a command like
python -m scripts.run-preprocessing code_transformer/experiments/preprocessing/preprocess-2.yaml java-medium train
the process failed after a couple of hours (with the settings batch_size: 1000, num_processes: 8) with the following error:
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
The detailed traceback follows:
```
Traceback (most recent calls WITHOUT Sacred internals):
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 344, in _sendback_result
    exception=exception))
  File "/home/ubuntu/.local/lib/python3.7/site-packages/joblib/externals/loky/backend/queues.py", line 240, in put
    self._writer.send_bytes(obj)
  File "/usr/lib/python3.7/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/usr/lib/python3.7/multiprocessing/connection.py", line 393, in _send_bytes
    header = struct.pack("!i", n)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
"""

The above exception was the direct cause of the following exception:

Traceback (most recent calls WITHOUT Sacred internals):
  File "code_transformer/experiments/preprocessing/preprocess-2.py", line 340, in main
    Preprocess2Container().run()
  File "code_transformer/experiments/preprocessing/preprocess-2.py", line 312, in run
    for batch in dataset_slice)
  File "/home/ubuntu/.local/lib/python3.7/site-packages/joblib/parallel.py", line 1017, in __call__
    self.retrieve()
  File "/home/ubuntu/.local/lib/python3.7/site-packages/joblib/parallel.py", line 909, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/ubuntu/.local/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 562, in wrap_future_result
    return future.result(timeout=timeout)
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 435, in result
    return self.__get_result()
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
```
Unfortunately, I could not pin down which procedure causes this error. So my questions are: why does this error appear on java-medium but not on smaller datasets, and how can I resolve it? At the very least, which particular line may produce it? Maybe I can catch an exception somewhere on some bad data.
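One way to look for unusually large samples is to rank them by serialized size before they are batched. The following is only a sketch under assumptions: largest_samples is a hypothetical helper, the samples are assumed to be picklable Python objects already loaded in memory or streamed from an iterable, and the standard pickle module is used as an approximation of what the worker processes would send.

```python
import heapq
import pickle
from typing import Any, Iterable, List, Tuple


def largest_samples(samples: Iterable[Any], top_k: int = 5) -> List[Tuple[int, int]]:
    """Return (index, pickled_bytes) for the top_k largest samples, biggest first."""
    sizes = (
        (index, len(pickle.dumps(sample, protocol=pickle.HIGHEST_PROTOCOL)))
        for index, sample in enumerate(samples)
    )
    # nlargest keeps only top_k entries in memory while scanning the stream.
    return heapq.nlargest(top_k, sizes, key=lambda pair: pair[1])
```

Calling largest_samples(dataset_slice, top_k=10) on a slice of the stage 1 output (the variable name is taken from the traceback; how the slice is loaded is left open here) would list the indices worth inspecting first.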