PaddlePaddle / PARL

A high-performance distributed training framework for Reinforcement Learning
https://parl.readthedocs.io/
Apache License 2.0
3.22k stars, 816 forks

Running benchmark/torch/a2c/train.py fails with "device value error, must be str, paddle.CPUPlace(), paddle.CUDAPlace(), paddle.CUDAPinnedPlace() or paddle.XPUPlace()" #963

Closed CCNUdhj closed 1 year ago

CCNUdhj commented 1 year ago

The full error output is as follows:

[10-04 22:17:16 MainThread @logger.py:242] Argv: train.py
[10-04 22:17:16 MainThread @utils.py:73] paddlepaddle version: 2.3.2.
[10-04 22:17:16 MainThread @__init__.py:27] Have found environment variable PARL_BACKEND=='torch', switching backend framework to [torch]
[10-04 22:17:18 MainThread @client.py:439] Remote actors log url: http://192.168.0.103:55469/logs?client_id=192.168.0.103_44197_1664893038
[10-04 22:17:18 MainThread @train.py:78] Creating 5 remote actors to connect.
Exception in thread Thread-6:
Traceback (most recent call last):
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/future_mode/proxy_wrapper_nowait.py", line 92, in _run_object_in_backend
    raise e
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/future_mode/proxy_wrapper_nowait.py", line 82, in _run_object_in_backend
    self._xparl_remote_wrapper_obj = remote_wrapper(
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/remote_wrapper.py", line 107, in __init__
    raise RemoteError('__init__', traceback_str)
parl.remote.exceptions.RemoteError: [PARL remote error when calling function __init__]:
device value error, must be str, paddle.CPUPlace(), paddle.CUDAPlace(), paddle.CUDAPinnedPlace() or paddle.XPUPlace(), but the type of device is device
traceback:
Traceback (most recent call last):
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/job.py", line 301, in wait_for_connection
    obj = cls(*args, **kwargs)
  File "/tmp/tmpppd_x2cb/xparl_282298actor.py", line 54, in __init__
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1572, in to
    return self._to_impl(
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1634, in _to_impl
    raise ValueError(
ValueError: device value error, must be str, paddle.CPUPlace(), paddle.CUDAPlace(), paddle.CUDAPinnedPlace() or paddle.XPUPlace(), but the type of device is device

Exception in thread Thread-7:
Traceback (most recent call last):
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/threading.py", line 932, in _bootstrap_inner
Exception in thread Thread-8:
Traceback (most recent call last):
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/future_mode/proxy_wrapper_nowait.py", line 92, in _run_object_in_backend
    self.run()
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/threading.py", line 870, in run
    raise e
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/future_mode/proxy_wrapper_nowait.py", line 82, in _run_object_in_backend
    self._target(*self._args, **self._kwargs)
    self._xparl_remote_wrapper_obj = remote_wrapper(
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/future_mode/proxy_wrapper_nowait.py", line 92, in _run_object_in_backend
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/remote_wrapper.py", line 107, in __init__
    raise e
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/future_mode/proxy_wrapper_nowait.py", line 82, in _run_object_in_backend
    raise RemoteError('__init__', traceback_str)
parl.remote.exceptions.RemoteError: [PARL remote error when calling function __init__]:
device value error, must be str, paddle.CPUPlace(), paddle.CUDAPlace(), paddle.CUDAPinnedPlace() or paddle.XPUPlace(), but the type of device is device
traceback:
Traceback (most recent call last):
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/job.py", line 301, in wait_for_connection
    obj = cls(*args, **kwargs)
  File "/tmp/tmprptbj1og/xparl_282297actor.py", line 54, in __init__
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1572, in to
    return self._to_impl(
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1634, in _to_impl
    raise ValueError(
ValueError: device value error, must be str, paddle.CPUPlace(), paddle.CUDAPlace(), paddle.CUDAPinnedPlace() or paddle.XPUPlace(), but the type of device is device

    self._xparl_remote_wrapper_obj = remote_wrapper(
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/remote_wrapper.py", line 107, in __init__
    raise RemoteError('__init__', traceback_str)
parl.remote.exceptions.RemoteError: [PARL remote error when calling function __init__]:
device value error, must be str, paddle.CPUPlace(), paddle.CUDAPlace(), paddle.CUDAPinnedPlace() or paddle.XPUPlace(), but the type of device is device
traceback:
Traceback (most recent call last):
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/job.py", line 301, in wait_for_connection
    obj = cls(*args, **kwargs)
  File "/tmp/tmp6qk6l38l/xparl_282262actor.py", line 54, in __init__
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1572, in to
    return self._to_impl(
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1634, in _to_impl
    raise ValueError(
ValueError: device value error, must be str, paddle.CPUPlace(), paddle.CUDAPlace(), paddle.CUDAPinnedPlace() or paddle.XPUPlace(), but the type of device is device

Exception in thread Thread-4:
Traceback (most recent call last):
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/future_mode/proxy_wrapper_nowait.py", line 92, in _run_object_in_backend
    raise e
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/future_mode/proxy_wrapper_nowait.py", line 82, in _run_object_in_backend
    self._xparl_remote_wrapper_obj = remote_wrapper(
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/remote_wrapper.py", line 107, in __init__
    raise RemoteError('__init__', traceback_str)
parl.remote.exceptions.RemoteError: [PARL remote error when calling function __init__]:
device value error, must be str, paddle.CPUPlace(), paddle.CUDAPlace(), paddle.CUDAPinnedPlace() or paddle.XPUPlace(), but the type of device is device
traceback:
Traceback (most recent call last):
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/job.py", line 301, in wait_for_connection
    obj = cls(*args, **kwargs)
  File "/tmp/tmpo1u9dq7n/xparl_282314actor.py", line 54, in __init__
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1572, in to
    return self._to_impl(
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1634, in _to_impl
    raise ValueError(
ValueError: device value error, must be str, paddle.CPUPlace(), paddle.CUDAPlace(), paddle.CUDAPinnedPlace() or paddle.XPUPlace(), but the type of device is device

Exception in thread Thread-5:
Traceback (most recent call last):
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/future_mode/proxy_wrapper_nowait.py", line 92, in _run_object_in_backend
    raise e
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/future_mode/proxy_wrapper_nowait.py", line 82, in _run_object_in_backend
    self._xparl_remote_wrapper_obj = remote_wrapper(
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/remote_wrapper.py", line 107, in __init__
    raise RemoteError('__init__', traceback_str)
parl.remote.exceptions.RemoteError: [PARL remote error when calling function __init__]:
device value error, must be str, paddle.CPUPlace(), paddle.CUDAPlace(), paddle.CUDAPinnedPlace() or paddle.XPUPlace(), but the type of device is device
traceback:
Traceback (most recent call last):
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/job.py", line 301, in wait_for_connection
    obj = cls(*args, **kwargs)
  File "/tmp/tmpeq8pjl7t/xparl_282299actor.py", line 54, in __init__
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1572, in to
    return self._to_impl(
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/paddle/fluid/dygraph/layers.py", line 1634, in _to_impl
    raise ValueError(
ValueError: device value error, must be str, paddle.CPUPlace(), paddle.CUDAPlace(), paddle.CUDAPinnedPlace() or paddle.XPUPlace(), but the type of device is device

Traceback (most recent call last):
  File "train.py", line 194, in <module>
    learner.step()
  File "train.py", line 92, in step
    remote_actor.set_weights(latest_params)
  File "/home/dhj/anaconda3/envs/raylib/lib/python3.8/site-packages/parl/remote/future_mode/proxy_wrapper_nowait.py", line 144, in __getattr__
    raise self._xparl_remote_object_exception
parl.remote.exceptions.FutureFunctionError: There is an error raised when calling the future function __init__. You can see the detailed error message above, which is printed by another thread.

CCNUdhj commented 1 year ago

Version information: Ubuntu 20.04, Python 3.8, parl 2.0.5, torch 1.7.1

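It looks like the model.to() call is not reaching torch's to() function, even though I have already set the environment variable PARL_BACKEND=torch. This has puzzled me for a long time. Any help would be much appreciated, thank you.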
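One way to check this suspicion is to look at which class actually supplies the model's to() method inside the process that raises the error. The snippet below is only a minimal sketch: describe_to_method, StandInLayer and ActorModel are hypothetical names that are not part of PARL, and the stand-in classes exist only so the snippet runs without paddle or torch installed. With a real model, the reported owner would be expected to name either a paddle or a torch class.

```python
import inspect
import os


def describe_to_method(model):
    """Return (PARL_BACKEND as seen by this process, class that defines `to`)."""
    # os.environ is per process, so call this in the process that fails.
    backend = os.environ.get("PARL_BACKEND", "<unset>")
    # Walk the method resolution order and stop at the first class
    # that defines `to` itself rather than inheriting it.
    for klass in inspect.getmro(type(model)):
        if "to" in vars(klass):
            return backend, f"{klass.__module__}.{klass.__qualname__}"
    return backend, None


# Hypothetical stand-ins (assumptions, not PARL, paddle or torch classes).
class StandInLayer:
    def to(self, device):
        return self


class ActorModel(StandInLayer):
    pass


if __name__ == "__main__":
    backend, owner = describe_to_method(ActorModel())
    print(f"PARL_BACKEND={backend}, to() defined by {owner}")
```

Since the traceback shows the failing to() call at line 54 of the actor's __init__, calling such a helper just before that line, inside the remote actor rather than in train.py, would show both the backend variable and the base class as that process sees them.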