Closed Anery closed 1 year ago
Thanks! But it still fails, and I've hit another problem. Setting mine_num_processes
greater than 1 causes trouble; it seems that a lambda function cannot be pickled:
Traceback (most recent call last):
  File "/Users/work/temp/RedPajama-Data/data_prep/cc/cc_net/cc_net/execution.py", line 142, in debug_executor
    message = function(*x)
  File "/Users/work/temp/RedPajama-Data/data_prep/cc/cc_net/cc_net/mine.py", line 439, in _mine_shard
    jsonql.run_pipes(
  File "/Users/work/temp/RedPajama-Data/data_prep/cc/cc_net/cc_net/jsonql.py", line 439, in run_pipes
    multiprocessing.Pool(
  File "/Users/miniconda3/envs/py38/lib/python3.8/multiprocessing/context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
  File "/Users/miniconda3/envs/py38/lib/python3.8/multiprocessing/pool.py", line 212, in __init__
    self._repopulate_pool()
  File "/Users/miniconda3/envs/py38/lib/python3.8/multiprocessing/pool.py", line 303, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
  File "/Users/miniconda3/envs/py38/lib/python3.8/multiprocessing/pool.py", line 326, in _repopulate_pool_static
    w.start()
  File "/Users/miniconda3/envs/py38/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/miniconda3/envs/py38/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/Users/miniconda3/envs/py38/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/miniconda3/envs/py38/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/miniconda3/envs/py38/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Users/miniconda3/envs/py38/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object '_mine_shard.<locals>.<lambda>'
Any suggestions would be helpful.
Hi @Anery, this might be due to a failed installation. Did the following steps run successfully for you (run from the cc directory)?
# Installation
cd cc_net
mkdir data
sudo apt-get update
sudo apt install build-essential cmake libboost-system-dev libboost-thread-dev libboost-program-options-dev libboost-test-dev libeigen3-dev zlib1g-dev libbz2-dev liblzma-dev
make install
make lang=en dl_lm
Thanks for your reply. I'm running on macOS, and some of the packages are not installed. I'll try on Linux later.
It works well on Linux, so I’ll close this issue. Thanks.
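A note for anyone hitting the same pickling error: the traceback goes through popen_spawn_posix.py, so the pool is using the "spawn" start method, which is the default on macOS since Python 3.8. With "spawn", the worker's process object is pickled before launch, and a lambda defined inside a function cannot be pickled. Linux defaults to "fork", which skips that pickling step, and that is probably why the same code runs there. Below is a minimal sketch of the failure and one common workaround. The names scale, make_local_lambda, make_partial and can_pickle are made up for illustration and are not taken from cc_net.

```python
import functools
import pickle


def scale(factor, x):
    """Module-level function: pickle can reference it by its qualified name."""
    return factor * x


def make_local_lambda(factor):
    # Mirrors the failing pattern: a lambda created inside another function.
    return lambda x: factor * x


def make_partial(factor):
    # Picklable alternative: bind the argument to a module-level function.
    return functools.partial(scale, factor)


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


if __name__ == "__main__":
    print("local lambda picklable:", can_pickle(make_local_lambda(3)))
    print("partial picklable:", can_pickle(make_partial(3)))
```

Another possible workaround is to call multiprocessing.set_start_method("fork") before the pool is created, but the Python documentation warns that "fork" can be unsafe on macOS, so treat that as an experiment rather than a fix.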
Hi, I'm trying to run this test case:
python3 -m cc_net --config config/test_segment.json
but encountered the following error. Are there any possible reasons? I'm using Python 3.9.6 on macOS.