Open Rinstein opened 3 years ago
It seems to be a bug in how the pyarrow version is checked. Are you running xparl on multiple machines or on a single machine?
A single machine. I have no idea how to install the version `pyarrow=True`.
I guess that parl fails to get the exact pyarrow version in your environment. Please provide your environment information so that we can reproduce the problem: OS, parl version, and paddle version.
If you would like to bypass the issue and leave the problem to us, just remove pyarrow from the environment:
pip uninstall pyarrow
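To collect the environment information requested above in one go, something like the following sketch may help. It is not part of PARL; the function names `installed_version` and `environment_report` are made up for illustration, and it assumes the packages were installed with pip or conda so that their metadata is visible to `importlib.metadata` (or to the `importlib_metadata` backport on older Pythons).

```python
import platform
import sys

try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport for older Pythons


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("parl", "paddlepaddle", "pyarrow")):
    """Gather OS, interpreter, and package versions into a dict (hypothetical helper)."""
    report = {
        "os": platform.platform(),
        "python": platform.python_version(),
        "executable": sys.executable,
    }
    for package in packages:
        report[package] = installed_version(package)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print("{}: {}".format(key, value))
```

A value of `None` for `pyarrow` means the package is not visible to the interpreter that ran the script, which is worth comparing with what the master node reports.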
My environment is Ubuntu 18.04, parl 1.4.3, paddlepaddle 1.8.5. After I removed pyarrow, I got this error: Exception: "pyarrow" is provided in "master"'s enviroment, however, it is not found in your current environment. To use "pyarrow" for serialization, please install "pyarrow=False" in your current environment!
Have you restarted the cluster? The cluster has to restart after the environment is updated.
xparl stop
xparl start ...
Are you using Anaconda, or did you install all the packages into the original Python provided by the operating system?
I use Anaconda to manage my environment, and I restarted xparl before running my program.
Could you provide the output of the following commands?
which xparl
which pip
(parl) lrw@mars-2080tix2:~/pythonProjects/PARL$ which xparl
/home/lrw/anaconda3/envs/parl/bin/xparl
(parl) lrw@mars-2080tix2:~/pythonProjects/PARL$ which pip
/home/lrw/anaconda3/envs/parl/bin/pip
(parl) lrw@mars-2080tix2:~/pythonProjects/PARL$
Thanks a lot. I'm afraid that a different python is used to launch the master node. Please provide the log of the following command:
import sys
print(sys.executable)
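A slightly fuller version of this check, as a rough sketch: it compares the directory of the running interpreter with the directories where `xparl` and `pip` are found on `PATH`. The helper names `same_bin_dir` and `check_tools` are hypothetical and not part of PARL; the assumption is that in a conda or virtualenv setup all three normally live in the same `bin` directory.

```python
import os
import shutil
import sys


def same_bin_dir(path_a, path_b):
    """Return True if both paths sit in the same directory."""
    dir_a = os.path.dirname(os.path.abspath(path_a))
    dir_b = os.path.dirname(os.path.abspath(path_b))
    return dir_a == dir_b


def check_tools(tools=("xparl", "pip")):
    """Map each tool to (location on PATH, whether it shares a directory with sys.executable)."""
    results = {}
    for tool in tools:
        location = shutil.which(tool)  # None if the tool is not on PATH
        matches = location is not None and same_bin_dir(sys.executable, location)
        results[tool] = (location, matches)
    return results


if __name__ == "__main__":
    print("sys.executable:", sys.executable)
    for tool, (location, matches) in check_tools().items():
        print("{}: {} (same directory as python: {})".format(tool, location, matches))
```

If a tool is reported as living in a different directory, the cluster may have been started by another Python installation than the one running the training script.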
Hi, could you run the following steps and paste the whole log?
1. create `test.py`

    import parl
    import sys

    # Show which interpreter runs this client script.
    print("sys.executable: ", sys.executable)


    @parl.remote_class
    class Agent(object):
        def say_hello(self):
            print("Hello World!")


    # Connect to the local cluster started by test.sh and call a remote method.
    parl.connect('localhost:8010')
    agent = Agent()
    agent.say_hello()
    print("done")

2. create `test.sh`
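```bash
# Show which python, xparl, and pip are on PATH before reinstalling.
echo `which python`
echo `which xparl`
echo `which pip`
# Reinstall parl and make sure pyarrow is removed.
python -m pip uninstall -y parl
python -m pip uninstall -y pyarrow
python -m pip install parl
# Show the tools again after the reinstall.
echo `which python`
echo `which xparl`
echo `which pip`
# Restart the cluster so that it picks up the updated environment, then run the test.
xparl stop
xparl start --port 8010 --cpu_num 1
python test.py
```
3. run `sh test.sh`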
and paste the whole log.

(parl) lrw@mars-2080tix2:~/pythonProjects/redesign_macsrl/macsrl_code_only_grpc/VividTestAlgorithm$ sh test.sh
/home/lrw/anaconda3/envs/parl/bin/python
/home/lrw/anaconda3/envs/parl/bin/xparl
/home/lrw/anaconda3/envs/parl/bin/pip
Found existing installation: parl 1.4.3
Uninstalling parl-1.4.3:
Successfully uninstalled parl-1.4.3
WARNING: Skipping pyarrow as it is not installed.
Collecting parl
Using cached parl-1.4.3-py2.py3-none-any.whl (574 kB)
Requirement already satisfied: termcolor>=1.1.0 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from parl) (1.1.0)
Requirement already satisfied: click in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from parl) (7.1.2)
Requirement already satisfied: psutil>=5.6.2 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from parl) (5.8.0)
Requirement already satisfied: flask>=1.0.4 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from parl) (1.1.2)
Requirement already satisfied: pyzmq==18.1.1 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from parl) (18.1.1)
Requirement already satisfied: flask-cors in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from parl) (3.0.10)
Requirement already satisfied: tb-nightly==1.15.0a20190801 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from parl) (1.15.0a20190801)
Requirement already satisfied: requests in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from parl) (2.25.1)
Requirement already satisfied: tensorboardX==1.8 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from parl) (1.8)
Requirement already satisfied: scipy>=1.0.0 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from parl) (1.5.2)
Requirement already satisfied: grpcio>=1.27.2 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from parl) (1.35.0)
Requirement already satisfied: protobuf>=3.14.0 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from parl) (3.14.0)
Requirement already satisfied: cloudpickle==1.6.0 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from parl) (1.6.0)
Requirement already satisfied: numpy>=1.12.0 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from tb-nightly==1.15.0a20190801->parl) (1.19.2)
Requirement already satisfied: werkzeug>=0.11.15 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from tb-nightly==1.15.0a20190801->parl) (1.0.1)
Requirement already satisfied: setuptools>=41.0.0 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from tb-nightly==1.15.0a20190801->parl) (52.0.0.post20210125)
Requirement already satisfied: six>=1.10.0 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from tb-nightly==1.15.0a20190801->parl) (1.15.0)
Requirement already satisfied: markdown>=2.6.8 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from tb-nightly==1.15.0a20190801->parl) (3.3.4)
Requirement already satisfied: absl-py>=0.4 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from tb-nightly==1.15.0a20190801->parl) (0.12.0)
Requirement already satisfied: wheel>=0.26 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from tb-nightly==1.15.0a20190801->parl) (0.36.2)
Requirement already satisfied: Jinja2>=2.10.1 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from flask>=1.0.4->parl) (2.11.3)
Requirement already satisfied: itsdangerous>=0.24 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from flask>=1.0.4->parl) (1.1.0)
Requirement already satisfied: MarkupSafe>=0.23 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from Jinja2>=2.10.1->flask>=1.0.4->parl) (1.1.1)
Requirement already satisfied: importlib-metadata in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from markdown>=2.6.8->tb-nightly==1.15.0a20190801->parl) (4.0.1)
Requirement already satisfied: zipp>=0.5 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from importlib-metadata->markdown>=2.6.8->tb-nightly==1.15.0a20190801->parl) (3.4.1)
Requirement already satisfied: typing-extensions>=3.6.4 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from importlib-metadata->markdown>=2.6.8->tb-nightly==1.15.0a20190801->parl) (3.7.4.3)
Requirement already satisfied: chardet<5,>=3.0.2 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from requests->parl) (4.0.0)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from requests->parl) (1.26.4)
Requirement already satisfied: certifi>=2017.4.17 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from requests->parl) (2021.5.30)
Requirement already satisfied: idna<3,>=2.5 in /home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages (from requests->parl) (2.10)
Installing collected packages: parl
Successfully installed parl-1.4.3
/home/lrw/anaconda3/envs/parl/bin/python
/home/lrw/anaconda3/envs/parl/bin/xparl
/home/lrw/anaconda3/envs/parl/bin/pip
[06-03 16:02:01 MainThread @logger.py:242] Argv: /home/lrw/anaconda3/envs/parl/bin/xparl stop
[06-03 16:02:02 MainThread @utils.py:79] WRN paddlepaddle version: 2.1.0. The dynamic graph version of PARL is under development, not fully tested and supported
kill: (22935): No such process
kill: (22941): No such process
kill: (22947): No such process
kill: (22953): No such process
[06-03 16:02:02 MainThread @logger.py:242] Argv: /home/lrw/anaconda3/envs/parl/bin/xparl start --port 8010 --cpu_num 1
[06-03 16:02:03 MainThread @utils.py:79] WRN paddlepaddle version: 2.1.0. The dynamic graph version of PARL is under development, not fully tested and supported
# The Parl cluster is started at localhost:8010.
# A local worker with 1 CPUs is connected to the cluster.
# Starting the cluster monitor...
## If you want to check cluster status, please view:
http://192.xxx..xxx.xxx:55325
or call:
xparl status
## If you want to add more CPU resources, please call:
xparl connect --address 192.xxx..xxx.xxx:8010
## If you want to shutdown the cluster, please call:
xparl stop
E0603 16:02:07.813417680 23105 socket_utils_common_posix.cc:223] check for SO_REUSEPORT: {"created":"@1622707327.813408175","description":"SO_REUSEPORT unavailable on compiling system","file":"src/core/lib/iomgr/socket_utils_common_posix.cc","file_line":192}
Checking status of log_server...
[06-03 16:02:08 MainThread @logger.py:242] Argv: test.py
[06-03 16:02:09 MainThread @utils.py:79] WRN paddlepaddle version: 2.1.0. The dynamic graph version of PARL is under development, not fully tested and supported
E0603 16:02:09.160348951 23162 socket_utils_common_posix.cc:223] check for SO_REUSEPORT: {"created":"@1622707329.160340365","description":"SO_REUSEPORT unavailable on compiling system","file":"src/core/lib/iomgr/socket_utils_common_posix.cc","file_line":192}
E0603 16:02:09.164652929 22982 socket_utils_common_posix.cc:223] check for SO_REUSEPORT: {"created":"@1622707329.164628521","description":"SO_REUSEPORT unavailable on compiling system","file":"src/core/lib/iomgr/socket_utils_common_posix.cc","file_line":192}
sys.executable: /home/lrw/anaconda3/envs/parl/bin/python
E0603 16:02:09.653611595 23147 socket_utils_common_posix.cc:223] check for SO_REUSEPORT: {"created":"@1622707329.653597274","description":"SO_REUSEPORT unavailable on compiling system","file":"src/core/lib/iomgr/socket_utils_common_posix.cc","file_line":192}
[06-03 16:02:09 MainThread @client.py:435] Remote actors log url: http://192.xxx..xxx.xxx:55325/logs?client_id=192.xxx..xxx.xxx_44497_1622707329
done
(parl) lrw@mars-2080tix2:~/pythonProjects/redesign_macsrl/macsrl_code_only_grpc/VividTestAlgorithm$ E0603 16:02:10.525752941 23312 socket_utils_common_posix.cc:223] check for SO_REUSEPORT: {"created":"@1622707330.525740926","description":"SO_REUSEPORT unavailable on compiling system","file":"src/core/lib/iomgr/socket_utils_common_posix.cc","file_line":192}
E0603 16:02:11.869578383 23437 socket_utils_common_posix.cc:223] check for SO_REUSEPORT: {"created":"@1622707331.869570318","description":"SO_REUSEPORT unavailable on compiling system","file":"src/core/lib/iomgr/socket_utils_common_posix.cc","file_line":192}
(parl) lrw@mars-2080tix2:~/pythonProjects/redesign_macsrl/macsrl_code_only_grpc/VividTestAlgorithm$
It seems that you can run distributed computation with xparl. Could you try running the A2C example again? It should work as expected now.
Sorry, but it does not work. After I start xparl on port 8010 and run examples/A2C/train.py, I get the following error (same as before):
/home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages/paddle/fluid/clip.py:779: UserWarning: Caution! 'set_gradient_clip' is not recommended and may be deprecated in future! We recommend a new strategy: set 'grad_clip' when initializing the 'optimizer'. This method can reduce the mistakes, please refer to documention of 'optimizer'.
warnings.warn("Caution! 'set_gradient_clip' is not recommended "
[06-03 16:20:42 MainThread @machine_info.py:88] nvidia-smi -L found gpu count: 2
[06-03 16:20:42 MainThread @machine_info.py:109] WRN Found non-empty CUDA_VISIBLE_DEVICES. But PARL found that Paddle was not complied with CUDA, which may cause issues. Thus PARL will not use GPU.
[06-03 16:20:42 MainThread @machine_info.py:88] nvidia-smi -L found gpu count: 2
[06-03 16:20:42 MainThread @machine_info.py:109] WRN Found non-empty CUDA_VISIBLE_DEVICES. But PARL found that Paddle was not complied with CUDA, which may cause issues. Thus PARL will not use GPU.
E0603 16:20:42.459132368 2635 socket_utils_common_posix.cc:223] check for SO_REUSEPORT: {"created":"@1622708442.459123176","description":"SO_REUSEPORT unavailable on compiling system","file":"src/core/lib/iomgr/socket_utils_common_posix.cc","file_line":192}
Traceback (most recent call last):
File "/home/lrw/pythonProjects/PARL/examples/A2C/train.py", line 222, in
Thanks for your kind and patient reply. I have discussed this with @zenghsh3, and we guess it might result from an incorrect environment configuration. May I add your WeChat account for further discussion? (I guess you are a developer from China?)
/home/lrw/anaconda3/envs/parl/bin/python /home/lrw/Downloads/pycharm-community-2021.1.1/plugins/python-ce/helpers/pydev/pydevd.py --multiproc --qt-support=auto --client 127.0.0.1 --port 36145 --file /home/lrw/pythonProjects/PARL/examples/A2C/train.py
Connected to pydev debugger (build 211.7142.13)
[06-03 11:00:57 MainThread @logger.py:242] Argv: /home/lrw/pythonProjects/PARL/examples/A2C/train.py
/home/lrw/pythonProjects/PARL/parl/remote/communication.py:38: FutureWarning: 'pyarrow.default_serialization_context' is deprecated as of 2.0.0 and will be removed in a future version. Use pickle or the pyarrow IPC functionality instead.
context = pyarrow.default_serialization_context()
[06-03 11:02:02 MainThread @machine_info.py:88] nvidia-smi -L found gpu count: 2
[06-03 11:02:02 MainThread @machine_info.py:109] WRN Found non-empty CUDA_VISIBLE_DEVICES. But PARL found that Paddle was not complied with CUDA, which may cause issues. Thus PARL will not use GPU.
/home/lrw/anaconda3/envs/parl/lib/python3.6/site-packages/paddle/fluid/clip.py:779: UserWarning: Caution! 'set_gradient_clip' is not recommended and may be deprecated in future! We recommend a new strategy: set 'grad_clip' when initializing the 'optimizer'. This method can reduce the mistakes, please refer to documention of 'optimizer'.
warnings.warn("Caution! 'set_gradient_clip' is not recommended "
[06-03 11:02:03 MainThread @machine_info.py:88] nvidia-smi -L found gpu count: 2
[06-03 11:02:03 MainThread @machine_info.py:109] WRN Found non-empty CUDA_VISIBLE_DEVICES. But PARL found that Paddle was not complied with CUDA, which may cause issues. Thus PARL will not use GPU.
[06-03 11:02:04 MainThread @machine_info.py:88] nvidia-smi -L found gpu count: 2
[06-03 11:02:04 MainThread @machine_info.py:109] WRN Found non-empty CUDA_VISIBLE_DEVICES. But PARL found that Paddle was not complied with CUDA, which may cause issues. Thus PARL will not use GPU.
E0603 11:02:14.516618275 25706 socket_utils_common_posix.cc:223] check for SO_REUSEPORT: {"created":"@1622689334.516609247","description":"SO_REUSEPORT unavailable on compiling system","file":"src/core/lib/iomgr/socket_utils_common_posix.cc","file_line":192}
Traceback (most recent call last):
File "/home/lrw/Downloads/pycharm-community-2021.1.1/plugins/python-ce/helpers/pydev/pydevd.py", line 1483, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/lrw/Downloads/pycharm-community-2021.1.1/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/lrw/pythonProjects/PARL/examples/A2C/train.py", line 222, in
learner = Learner(config)
File "/home/lrw/pythonProjects/PARL/examples/A2C/train.py", line 81, in __init__
self.create_actors()
File "/home/lrw/pythonProjects/PARL/examples/A2C/train.py", line 86, in create_actors
parl.connect(self.config['master_address'])
File "/home/lrw/pythonProjects/PARL/parl/remote/client.py", line 434, in connect
distributed_files)
File "/home/lrw/pythonProjects/PARL/parl/remote/client.py", line 74, in __init__
self.check_env_consistency()
File "/home/lrw/pythonProjects/PARL/parl/remote/client.py", line 243, in check_env_consistency
raise Exception(error_message)
Exception: Version mismatch: the 'master' is of version 'pyarrow=True'. However, 'pyarrow=4.0.1'is provided in your current environment.
python-BaseException
Process finished with exit code 1