mahaitongdae / Feasible-Actor-Critic

Code for the paper "Feasible Actor-Critic: Constrained Reinforcement Learning for Ensuring Statewise Safety".

How to create fewer actors or increase the resources available to this Ray cluster? #2

Open tingtingLiuLiu opened 2 years ago

tingtingLiuLiu commented 2 years ago

I am a newbie using Ray and I have run into a problem:

/home/ltt/anaconda3/envs/FAC/bin/python /home/ltt/Downloads/Feasible-Actor-Critic/train_script4fsac.py test_dir test_iter_list
2021-12-28 06:22:08.727682: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/home/ltt/.mujoco/mujoco200/bin
2021-12-28 06:22:08.727697: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-12-28 06:22:09.591541: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-12-28 06:22:09.591685: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/home/ltt/.mujoco/mujoco200/bin
2021-12-28 06:22:09.591694: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2021-12-28 06:22:09.591714: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ubuntu): /proc/driver/nvidia/version does not exist
/home/ltt/anaconda3/envs/FAC/lib/python3.6/site-packages/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
  warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
INFO:__main__:begin training agents with parameter Namespace(act_dim=2, action_range=1.0, alg_name='FAC', alpha='auto', alpha_lr_schedule=[8e-05, 1000000, 3e-06], batch_size=1024, buffer_log_interval=40000, buffer_type='cost', constrained=True, cost_bias=0.0, cost_gamma=0.99, cost_lim=10.0, cost_value_lr_schedule=[8e-05, 4000000, 1e-06], delayed_update=4, demo=False, deterministic_policy=False, double_Q=True, double_QC=False, dual_ascent_interval=12, env_id='Safexp-PointButton1-v0', eval_interval=10000, eval_log_interval=1, eval_render=False, evaluator_type='EvaluatorWithCost', explore_sigma=None, fixed_steps=1000, gamma=0.99, gradient_clip_norm=10.0, grads_max_reuse=2, grads_queue_size=25, lam_gradient_clip_norm=3.0, lam_lr_schedule=[5e-05, 333333, 3e-06], log_dir='./results/FAC/PointButton/PointButton1-2021-12-28-06-22-09/logs', log_interval=100, max_buffer_size=500000, max_iter=4000000, max_sampled_steps=0, max_weight_sync_delay=300, mlp_lam=True, mode='training', model_dir='./results/FAC/PointButton/PointButton1-2021-12-28-06-22-09/models', model_load_dir=None, model_load_ite=None, mu_bias=0.0, num_agent=1, num_batch_reuse=1, num_buffers=4, num_eval_agent=1, num_eval_episode=5, num_future_data=0, num_learners=4, num_workers=4, obs_dim=76, obs_ptype='scale', obs_scale=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], off_policy=True, optimizer_type='OffPolicyAsyncWithCost', policy_hidden_activation='elu', policy_lr_schedule=[3e-05, 1000000, 1e-06], policy_model_cls='MLP', policy_num_hidden_layers=2, policy_num_hidden_units=256, policy_only=False, policy_out_activation='linear', policy_type='PolicyWithMu', ppc_load_dir=None, random_seed=0, replay_alpha=0.6, replay_batch_size=256, replay_beta=0.4, replay_starts=3000, result_dir='./results/FAC/PointButton/PointButton1-2021-12-28-06-22-09', rew_ptype='scale', rew_scale=1.0, rew_shift=0.0, save_interval=200000, target=True, target_entropy=-2, tau=0.005, test_dir='test_dir', test_iter_list='test_iter_list', value_hidden_activation='elu', value_lr_schedule=[8e-05, 4000000, 1e-06], value_model_cls='MLP', value_num_hidden_layers=2, value_num_hidden_units=256, worker_log_interval=5, worker_type='OffPolicyWorkerWithCost')
2021-12-28 06:22:11,003 INFO services.py:1174 -- View the Ray dashboard at http://127.0.0.1:8265
2021-12-28 06:22:14.252747: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX512F
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-28 06:22:14.252936: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
INFO:worker:Worker initialized
INFO:optimizer:start filling the replay
(pid=6552) /home/ltt/anaconda3/envs/FAC/lib/python3.6/site-packages/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
(pid=6552)   warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
(pid=6553) /home/ltt/anaconda3/envs/FAC/lib/python3.6/site-packages/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
(pid=6553)   warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
(pid=6553) INFO:worker:Worker initialized
(pid=6553) INFO:worker:Worker_info: {'worker_id': 1, 'num_sample': 0, 'num_costs': 0, 'cost_rate': 0}
2021-12-28 06:22:32,308 WARNING worker.py:1108 -- The actor or task with ID ffffffffffffffff69a6825d641b461327313d1c01000000 cannot be scheduled right now. It requires {CPU: 1.000000} for placement, but this node only has remaining {0.000000/2.000000 CPU, 0.927734 GiB/0.927734 GiB memory, 1.000000/1.000000 node:192.168.21.140, 0.292969 GiB/0.292969 GiB object_store_memory}. In total there are 0 pending tasks and 11 pending actors on this node. This is likely due to all cluster resources being claimed by actors. To resolve the issue, consider creating fewer actors or increase the resources available to this Ray cluster. You can ignore this message if this Ray cluster is expected to auto-scale.
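The last warning appears to be the key part: Ray only detects 2 CPUs on this node, each worker/learner/buffer actor requests 1 CPU, and 11 actors remain pending, so training stalls. As a minimal, generic diagnostic sketch (not code from this repo; `ray.init` below is just a stand-in for however the training script starts Ray), one can check what Ray actually sees:

```python
# Generic Ray diagnostic (not from this repo): print the resources Ray detected
# on this node and how much is still free. In the log above, both CPUs are
# already claimed by actors, so the 11 pending actors can never be scheduled.
import ray

ray.init(ignore_reinit_error=True)
print(ray.cluster_resources())    # total resources, e.g. {'CPU': 2.0, ...}
print(ray.available_resources())  # what is left over for additional actors
```

`ray.init(num_cpus=...)` can raise the logical CPU count Ray schedules against, but on a 2-core machine that mostly oversubscribes the cores rather than fixing the shortage; the other direction is to create fewer actors.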

mahaitongdae commented 2 years ago

Sorry for not being able to get back to you sooner! Try reducing

NUM_WORKER = 4
NUM_LEARNER = 4
NUM_BUFFER = 4

in train_scripts4fsac.py to create fewer actors. It is best to keep the ratio worker:learner:buffer at 1:1:1.
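For illustration, a sketch of what that edit might look like on a 2-CPU machine like the one in the log; the exact constant names, defaults, and surrounding code in train_scripts4fsac.py may differ, and the values below are only an assumption:

```python
# Hypothetical excerpt of train_scripts4fsac.py: each worker, learner, and
# buffer becomes a Ray actor that requests 1 CPU, so their total (together
# with the evaluator) has to fit within the CPUs Ray detects on the node.
NUM_WORKER = 1   # was 4
NUM_LEARNER = 1  # was 4
NUM_BUFFER = 1   # was 4; keeps the suggested worker:learner:buffer = 1:1:1 ratio
```

Alternatively, running on a machine with more CPU cores (or attaching to a larger Ray cluster) addresses the same warning from the other direction, as the Ray message itself suggests.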