When I use multiprocessing, SGT raises an OverflowError.
This is just a report.
As a workaround I'll consider 1) pyspark instead of pandarallel, or 2) splitting the datasets.
RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/dannyanexp/miniconda3/envs/tf/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/dannyanexp/miniconda3/envs/tf/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/dannyanexp/miniconda3/envs/tf/lib/python3.7/site-packages/pandarallel/pandarallel.py", line 64, in global_worker
    return _func(x)
  File "/home/dannyanexp/miniconda3/envs/tf/lib/python3.7/site-packages/pandarallel/pandarallel.py", line 120, in wrapper
    pickle.dump(result, file)
OverflowError: cannot serialize a bytes object larger than 4 GiB
"""
The above exception was the direct cause of the following exception:
OverflowError Traceback (most recent call last)
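A minimal sketch of workaround 2), splitting the dataset: process the DataFrame in chunks so that no single worker has to pickle a result anywhere near the 4 GiB limit. The `embed` function below is a hypothetical stand-in for the real SGT transform, and the chunk count is an assumption to be tuned to the data size.

```python
import numpy as np
import pandas as pd


def embed(chunk: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical placeholder for the SGT embedding step
    # (e.g. something like sgt.fit_transform(chunk)); here it
    # just copies the input so the sketch is runnable.
    return chunk.copy()


def apply_in_chunks(df: pd.DataFrame, n_chunks: int) -> pd.DataFrame:
    # np.array_split splits along axis 0 and tolerates lengths
    # that do not divide evenly into n_chunks.
    parts = [embed(part) for part in np.array_split(df, n_chunks)]
    return pd.concat(parts)


df = pd.DataFrame({"id": range(10), "sequence": ["abc"] * 10})
result = apply_in_chunks(df, 3)
print(len(result))  # 10
```

Each chunk can then be handed to pandarallel (or a plain multiprocessing pool) independently, keeping every per-worker pickled payload small.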