DLR-RM / stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
https://stable-baselines3.readthedocs.io
MIT License
9.18k stars 1.71k forks source link

I am stuck in the "model.learn", could any expert help? And, what is model.learn excatly? Where is the source code? Thanks. #1359

Closed ZZWang21 closed 1 year ago

ZZWang21 commented 1 year ago

❓ Question

Hi, experts,

I am trying to use stable_baselines to implement A2C on a ur5e robotic arm in ROS.

after the code comes to "model.learn", it stopped.

I have no idea what is going on. and I am not sure where is the "model.learn" function is.

Could any one help? Thanks.

The env is shown as : <TimeLimit<UR5eMoveEnv>>

Checklist

qgallouedec commented 1 year ago

Hi, can you provide the logs? To have a better help, I also advise you to fill the custom env issue template

ZZWang21 commented 1 year ago

Hi, can you provide the logs? To have a better help, I also advise you to fill the custom env issue template

Hi, i am new to this. How can I get the log? Thanks.

qgallouedec commented 1 year ago

The logs are what is printed when you run your code

ZZWang21 commented 1 year ago

The logs are what is printed when you run your code

roslaunch ur5e_rl_training start_training_ur5e_train_all.launch ... logging to /home/walter/.ros/log/8b5cf694-bb3c-11ed-92a2-89c37fc15cc7/roslaunch-walter-virtual-machine-4897.log Checking log directory for disk usage. This may take a while. Press Ctrl-C to interrupt Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://walter-virtual-machine:42929/

SUMMARY

PARAMETERS

NODES / ur5e_train_all (ur5e_rl_training/ur5e_train_all.py)

auto-starting new master process[master]: started with pid [4905] ROS_MASTER_URI=http://localhost:11311

setting /run_id to 8b5cf694-bb3c-11ed-92a2-89c37fc15cc7 process[rosout-1]: started with pid [4915] started core service [/rosout] process[ur5e_train_all-2]: started with pid [4922] TTTTTHE 1st ENV IS::::::::: ur5e-v0 /bin/sh: 1: source: not found /bin/sh: 1: source: not found ... logging to /home/walter/.ros/log/8b5cf694-bb3c-11ed-92a2-89c37fc15cc7/roslaunch-walter-virtual-machine-4938.log Checking log directory for disk usage. This may take a while. Press Ctrl-C to interrupt Done checking log file disk usage. Usage is <1GB.

... logging to /home/walter/.ros/log/8b5cf694-bb3c-11ed-92a2-89c37fc15cc7/roslaunch-walter-virtual-machine-4954.log Checking log directory for disk usage. This may take a while. Press Ctrl-C to interrupt Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://walter-virtual-machine:33165/

SUMMARY

PARAMETERS

NODES / gazebo (gazebo_ros/gzserver) gazebo_gui (gazebo_ros/gzclient)

ROS_MASTER_URI=http://localhost:11311

process[gazebo-1]: started with pid [4973] process[gazebo_gui-2]: started with pid [4978] xacro: in-order processing became default in ROS Melodic. You can drop the option. [ INFO] [1678010432.613854084]: Finished loading Gazebo ROS API Plugin. [ INFO] [1678010432.627179981]: waitForService: Service [/gazebo/set_physics_properties] has not been advertised, waiting... [ INFO] [1678010432.662533806]: waitForService: Service [/gazebo_gui/set_physics_properties] has not been advertised, waiting... [ INFO] [1678010432.666584499]: Finished loading Gazebo ROS API Plugin. [ INFO] [1678010433.595421233]: waitForService: Service [/gazebo/set_physics_properties] is now available. started roslaunch server http://walter-virtual-machine:45657/

SUMMARY

PARAMETERS

NODES / controller_spawner (controller_manager/spawner) controller_spawner_hand (controller_manager/spawner) joint_position_launcher (controller_manager/spawner) move_group (moveit_ros_move_group/move_group) robot_control_node (ur5_robotiq_85_simulation/robot_control_node) robot_state_publisher (robot_state_publisher/robot_state_publisher) spawn_urdf (gazebo_ros/spawn_model)

ROS_MASTER_URI=http://localhost:11311

[ INFO] [1678010433.769437206, 0.082000000]: Physics dynamic reconfigure ready. process[spawn_urdf-1]: started with pid [5148] process[controller_spawner_hand-2]: started with pid [5149] process[controller_spawner-3]: started with pid [5150] process[robot_state_publisher-4]: started with pid [5151] process[move_group-5]: started with pid [5152] [ WARN] [1678010434.241061877]: link 'camera_link' material 'red' undefined. [ WARN] [1678010434.242762662]: link 'camera_link' material 'red' undefined. [ WARN] [1678010434.243021720]: link 'shoot_link' material 'red' undefined. [ WARN] [1678010434.243119950]: link 'shoot_link' material 'red' undefined. process[robot_control_node-6]: started with pid [5158] process[joint_position_launcher-7]: started with pid [5160] [ INFO] [1678010434.554015540]: RUNNING robot_control_node [ WARN] [1678010434.616366739]: link 'camera_link' material 'red' undefined. [ WARN] [1678010434.634223548]: link 'camera_link' material 'red' undefined. [ WARN] [1678010434.634488974]: link 'shoot_link' material 'red' undefined. [ WARN] [1678010434.634572173]: link 'shoot_link' material 'red' undefined. [ INFO] [1678010434.639235647]: Loading robot model 'ur5e_robotiq'... [ WARN] [1678010434.690194915]: link 'camera_link' material 'red' undefined. [ WARN] [1678010434.694695486]: link 'camera_link' material 'red' undefined. [ WARN] [1678010434.694813787]: link 'shoot_link' material 'red' undefined. [ WARN] [1678010434.694896638]: link 'shoot_link' material 'red' undefined. [ INFO] [1678010434.709890550]: Loading robot model 'ur5e_robotiq'... [INFO] [1678010437.294230, 0.000000]: Waiting for /clock to be available... [INFO] [1678010437.398859, 0.000000]: Waiting for /clock to be available... [INFO] [1678010437.525993, 3.663000]: /clock is published. Proceeding to load the controller(s). [INFO] [1678010437.532659, 3.669000]: Controller Spawner: Waiting for service controller_manager/load_controller [INFO] [1678010437.579511, 3.722000]: /clock is published. Proceeding to load the controller(s). [INFO] [1678010437.582498, 3.722000]: Controller Spawner: Waiting for service controller_manager/load_controller [ INFO] [1678010437.711132296, 3.834000000]: Publishing maintained planning scene on 'monitored_planning_scene' [ INFO] [1678010437.736079771, 3.850000000]: Listening to 'joint_states' for joint states [ INFO] [1678010437.780779366, 3.881000000]: Listening to '/attached_collision_object' for attached collision objects [ INFO] [1678010437.781029353, 3.881000000]: Starting planning scene monitor [ INFO] [1678010437.792430856, 3.898000000]: Listening to '/planning_scene' [ INFO] [1678010437.792618318, 3.898000000]: Starting world geometry update monitor for collision objects, attached objects, octomap updates. [ INFO] [1678010437.825183022, 3.913000000]: Listening to '/collision_object' [ INFO] [1678010437.896457620, 3.980000000]: Listening to '/planning_scene_world' for planning scene world geometry Warning [parser.cc:833] XML Attribute[frame] in element[pose] not defined in SDF, ignoring. [ INFO] [1678010440.302459422, 4.876000000]: Listening to '/camera/depth_registered/points' using message filter with target frame 'world ' [ INFO] [1678010440.302647619, 4.876000000]: Loading planning pipeline 'chomp' [spawn_urdf-1] process has finished cleanly log file: /home/walter/.ros/log/8b5cf694-bb3c-11ed-92a2-89c37fc15cc7/spawn_urdf-1*.log [ INFO] [1678010440.397276537, 4.876000000]: Using planning interface 'CHOMP' [ INFO] [1678010440.404427497, 4.876000000]: Param 'default_workspace_bounds' was not set. Using default value: 10 [ INFO] [1678010440.405245899, 4.876000000]: Param 'start_state_max_bounds_error' was set to 0.1 [ INFO] [1678010440.405937154, 4.876000000]: Param 'start_state_max_dt' was not set. Using default value: 0.5 [ INFO] [1678010440.406686335, 4.876000000]: Param 'start_state_max_dt' was not set. Using default value: 0.5 [ INFO] [1678010440.407344441, 4.876000000]: Param 'jiggle_fraction' was set to 0.05 [ INFO] [1678010440.408039587, 4.876000000]: Param 'max_sampling_attempts' was not set. Using default value: 100 [ INFO] [1678010440.408185623, 4.876000000]: Using planning request adapter 'Limiting Max Cartesian Speed' [ INFO] [1678010440.408282915, 4.876000000]: Using planning request adapter 'Add Time Parameterization' [ INFO] [1678010440.408365231, 4.876000000]: Using planning request adapter 'Resolve constraint frames to robot links' [ INFO] [1678010440.408423228, 4.876000000]: Using planning request adapter 'Fix Workspace Bounds' [ INFO] [1678010440.408523153, 4.876000000]: Using planning request adapter 'Fix Start State Bounds' [ INFO] [1678010440.408578030, 4.876000000]: Using planning request adapter 'Fix Start State In Collision' [ INFO] [1678010440.408634190, 4.876000000]: Using planning request adapter 'Fix Start State Path Constraints' [ INFO] [1678010440.411807551, 4.876000000]: Loading planning pipeline 'ompl' [ INFO] [1678010440.601365339, 4.876000000]: Using planning interface 'OMPL' [ INFO] [1678010440.607247452, 4.876000000]: Param 'default_workspace_bounds' was not set. Using default value: 10 [ INFO] [1678010440.608693022, 4.876000000]: Param 'start_state_max_bounds_error' was set to 0.1 [ INFO] [1678010440.610733767, 4.876000000]: Param 'start_state_max_dt' was not set. Using default value: 0.5 [ INFO] [1678010440.617637997, 4.876000000]: Param 'start_state_max_dt' was not set. Using default value: 0.5 [ INFO] [1678010440.619139568, 4.876000000]: Param 'jiggle_fraction' was set to 0.05 [ INFO] [1678010440.620529698, 4.876000000]: Param 'max_sampling_attempts' was not set. Using default value: 100 [ INFO] [1678010440.620830891, 4.876000000]: Using planning request adapter 'Limiting Max Cartesian Speed' [ INFO] [1678010440.620924805, 4.876000000]: Using planning request adapter 'Add Time Parameterization' [ INFO] [1678010440.621270970, 4.876000000]: Using planning request adapter 'Resolve constraint frames to robot links' [ INFO] [1678010440.621796680, 4.876000000]: Using planning request adapter 'Fix Workspace Bounds' [ INFO] [1678010440.622577362, 4.876000000]: Using planning request adapter 'Fix Start State Bounds' [ INFO] [1678010440.622791465, 4.876000000]: Using planning request adapter 'Fix Start State In Collision' [ INFO] [1678010440.623017413, 4.876000000]: Using planning request adapter 'Fix Start State Path Constraints' [ INFO] [1678010440.623683515, 4.876000000]: Loading planning pipeline 'pilz_industrial_motion_planner' [ INFO] [1678010440.637631839, 4.876000000]: Reading limits from namespace /robot_description_planning [ INFO] [1678010440.707331099, 4.876000000]: Available plugins: pilz_industrial_motion_planner::PlanningContextLoaderCIRC pilz_industrial_motion_planner::PlanningContextLoaderLIN pilz_industrial_motion_planner::PlanningContextLoaderPTP [ INFO] [1678010440.707544005, 4.876000000]: About to load: pilz_industrial_motion_planner::PlanningContextLoaderCIRC [ INFO] [1678010440.714777012, 4.876000000]: Registered Algorithm [CIRC] [ INFO] [1678010440.715061467, 4.876000000]: About to load: pilz_industrial_motion_planner::PlanningContextLoaderLIN [ INFO] [1678010440.721065901, 4.876000000]: Registered Algorithm [LIN] [ INFO] [1678010440.721265228, 4.876000000]: About to load: pilz_industrial_motion_planner::PlanningContextLoaderPTP [ INFO] [1678010440.726826242, 4.876000000]: Registered Algorithm [PTP] [ INFO] [1678010440.726964704, 4.876000000]: Using planning interface 'Pilz Industrial Motion Planner' [ INFO] [1678010441.223054685, 4.876000000]: Camera Plugin: Using the 'robotNamespace' param: '/' [ INFO] [1678010441.236819641, 4.876000000]: Camera Plugin (ns = /) , set to "" [ INFO] [1678010441.561458231, 4.876000000]: Loading gazebo_ros_control plugin [ INFO] [1678010441.563265582, 4.876000000]: Starting gazebo_ros_control plugin in namespace: / [ INFO] [1678010441.569835909, 4.876000000]: gazebo_ros_control plugin is waiting for model URDF in parameter [robot_description] on the ROS param server. [ERROR] [1678010441.754699245, 4.876000000]: No p gain specified for pid. Namespace: /gazebo_ros_control/pid_gains/shoulder_pan_joint [ERROR] [1678010441.756973261, 4.876000000]: No p gain specified for pid. Namespace: /gazebo_ros_control/pid_gains/shoulder_lift_joint [ERROR] [1678010441.759291136, 4.876000000]: No p gain specified for pid. Namespace: /gazebo_ros_control/pid_gains/elbow_joint [ERROR] [1678010441.766169784, 4.876000000]: No p gain specified for pid. Namespace: /gazebo_ros_control/pid_gains/wrist_1_joint [ERROR] [1678010441.768608052, 4.876000000]: No p gain specified for pid. Namespace: /gazebo_ros_control/pid_gains/wrist_2_joint [ERROR] [1678010441.770926806, 4.876000000]: No p gain specified for pid. Namespace: /gazebo_ros_control/pid_gains/wrist_3_joint [ WARN] [1678010441.771516572, 4.876000000]: Deprecated syntax, please prepend 'hardware_interface/' to 'PositionJointInterface' within the tag in joint 'simple_gripper_gripper_finger1_joint'. [ERROR] [1678010441.776238771, 4.876000000]: No p gain specified for pid. Namespace: /gazebo_ros_control/pid_gains/simple_gripper_gripper_finger1_joint [INFO] [1678010441.814486, 4.876000]: Controller Spawner: Waiting for service controller_manager/switch_controller [INFO] [1678010441.831326, 4.876000]: Controller Spawner: Waiting for service controller_manager/unload_controller [INFO] [1678010441.840915, 4.876000]: Controller Spawner: Waiting for service controller_manager/switch_controller [ INFO] [1678010441.836743258, 4.876000000]: Loaded gazebo_ros_control. [INFO] [1678010441.861013, 4.876000]: Loading controller: arm_controller [INFO] [1678010441.868768, 4.876000]: Controller Spawner: Waiting for service controller_manager/unload_controller [INFO] [1678010441.891904, 4.876000]: Loading controller: gripper_controller [ WARN] [1678010442.194881347]: link 'camera_link' material 'red' undefined. [ WARN] [1678010442.204909679]: link 'camera_link' material 'red' undefined. [ WARN] [1678010442.205092499]: link 'shoot_link' material 'red' undefined. [ WARN] [1678010442.205218843]: link 'shoot_link' material 'red' undefined. [ INFO] [1678010442.212775290]: Loading robot model 'ur5e_robotiq'... [INFO] [1678010442.267820, 4.935000]: Loading controller: joint_state_controller [INFO] [1678010442.520855, 4.975000]: Controller Spawner: Loaded controllers: gripper_controller [ INFO] [1678010442.549758336, 4.977000000]: Added FollowJointTrajectory controller for arm_controller [INFO] [1678010442.578512, 4.979000]: Controller Spawner: Loaded controllers: arm_controller, joint_state_controller [INFO] [1678010442.593643, 4.980000]: Started controllers: gripper_controller [INFO] [1678010442.619565, 4.982000]: Started controllers: arm_controller, joint_state_controller [ INFO] [1678010442.926821504, 5.041000000]: Added FollowJointTrajectory controller for gripper_controller [ WARN] [1678010459.830579483, 10.049000000]: Waiting for joint_position_controller/follow_joint_trajectory to come up [ WARN] [1678010474.401624596, 16.051000000]: Waiting for joint_position_controller/follow_joint_trajectory to come up [ERROR] [1678010485.504695914, 22.051000000]: Action client not connected: joint_position_controller/follow_joint_trajectory [ INFO] [1678010485.581638919, 22.087000000]: Returned 2 controllers in list [ INFO] [1678010485.703160106, 22.130000000]: Trajectory execution is managing controllers [ INFO] [1678010485.703327531, 22.130000000]: MoveGroup debug mode is OFF Loading 'move_group/ApplyPlanningSceneService'... Loading 'move_group/ClearOctomapService'... Loading 'move_group/MoveGroupCartesianPathService'... Loading 'move_group/MoveGroupExecuteTrajectoryAction'... Loading 'move_group/MoveGroupGetPlanningSceneService'... Loading 'move_group/MoveGroupKinematicsService'... Loading 'move_group/MoveGroupMoveAction'... Loading 'move_group/MoveGroupPickPlaceAction'... Loading 'move_group/MoveGroupPlanService'... Loading 'move_group/MoveGroupQueryPlannersService'... Loading 'move_group/MoveGroupStateValidationService'... Loading 'pilz_industrial_motion_planner/MoveGroupSequenceAction'... [ INFO] [1678010486.118362291, 22.195000000]: initialize move group sequence action [ INFO] [1678010486.186011266, 22.207000000]: Reading limits from namespace /robot_description_planning Loading 'pilz_industrial_motion_planner/MoveGroupSequenceService'... [ INFO] [1678010486.327808304, 22.227000000]: Reading limits from namespace /robot_description_planning [ INFO] [1678010486.479296011, 22.253000000]:


[ INFO] [1678010486.483545751, 22.253000000]: MoveGroup context using planning plugin ompl_interface/OMPLPlanner [ INFO] [1678010486.483746426, 22.253000000]: MoveGroup context initialization complete

You can start planning now!

[ INFO] [1678010486.884112298, 22.373000000]: Ready to take commands for planning group arm. [ INFO] [1678010486.989152895, 22.409000000]: Ready to take commands for planning group arm. ZZW:20221215 ROBOT_ENV INIT FINISHED!!!!! ZZW 20221207 INIT FINISHEDDDDDD. @MOVE.PY ZZW 20221211: ALREADY GOT N_ACTIONS: MOVE.PY ZZW 20221211: N_ACTIONS IN MOVE.PY: 2 /home/walter/miniconda3/envs/Robot_ur5e_ws/lib/python3.7/site-packages/gym/spaces/box.py:74: UserWarning: WARN: Box bound precision lowered by casting to float32 "Box bound precision lowered by casting to {}".format(self.dtype) ZZW 20221207 INIT THE ENV? @MOVE.PY YES,GYM.MAKE, ZZW 20230305 TTTTTHE 2nd ENV IS::::::::: <TimeLimit<UR5eMoveEnv>> ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ TTTTTHE ENV IS::::::::: <TimeLimit<UR5eMoveEnv>> ZZW 20221211: THIS IS THE OUTPUT OF N_ACTIONS IN ur5e_train_all.py: 2 Using cpu device Wrapping the env with a Monitor wrapper Wrapping the env in a DummyVecEnv. 20230304 ZZZZW: START TRAINING [ INFO] [1678010487.492430160, 22.571000000]: "/home/walter/catkin_ws/src/.ur5_robotiq_85_simulation" is a directory containing: [ INFO] [1678010487.492759440, 22.571000000]: # collision objects 0 [ INFO] [1678010487.493115973, 22.571000000]: robot_control_node is ready

qgallouedec commented 1 year ago

It seems there isn’t any error message. What do you mean by « stopped »? What make you think that there is something wrong?

Please make sure to format your code with codeblocks next time

ZZWang21 commented 1 year ago

It seems there isn’t any error message. What do you mean by « stopped »? What make you think that there is something wrong?

Please make sure to format your code with codeblocks next time

Since the simulation model is not moving. And no information in the RL is shown.

I have tried the classic Cartpole example, after the training started, the information of the episodes is shown.

Thanks for your quick reply. Looking forward to your reply.

qgallouedec commented 1 year ago

Can you try to provide a minimal code to reproduce this? Have you set verbose to 1 in the model instanciantion? A2C(..., verbose=1)

ZZWang21 commented 1 year ago

Yes, the verbose is 1.

The main code is:

!/usr/bin/env python3

import rospy import gym import os import numpy as np import pandas as pd import time

from stable_baselines3 import DQN, A2C#, ACKTR, DDPG, PPO1, PPO2, SAC, TRPO, TD3 from stable_baselines3.common.vec_env import VecFrameStack

from openai_ros.task_envs.ur5e_move import move from openai_ros.openai_ros_common import StartOpenAI_ROS_Environment

from gazebo_msgs.srv import GetModelState

os.chdir('/home/walter/catkin_ws/src/ur5e_rl_training/scripts')

rospy.init_node('ur5e_train_all', anonymous=True, log_level=rospy.FATAL)

environment_name = rospy.get_param('/ur5e_v0/task_and_robot_environment_name') print("TTTTTHE 1st ENV IS:::::::::", environment_name) env = StartOpenAI_ROS_Environment(environment_name) print("TTTTTHE 2nd ENV IS:::::::::", env)

n_actions = env.action_space.n print("ZZW 20221211: THIS IS THE OUTPUT OF N_ACTIONS IN ur5e_train_all.py:", n_actions)

model = DQN("MlpPolicy", env, verbose=1, tensorboard_log="/home/walter/catkin_ws/src/cartpole3d_ros/results/tensorboard_logs/DQN/") print("20230304 ZZZZW: START TRAINING")

model.learn(total_timesteps=10)#,progress_bar=True)

done = False

while not done: action, = model.predict(obs) print("ZZW20230305 DURING LEARN:", action, ) obs, reward, done, info = env.step(action) print("goal: ", GOAL, "obs: ", obs, "reward: ", reward, "done: ", done)

print("=--------------------------------------:")

env.close()


For the env.py: the purpose is to make the robotic arm move to reach the ball

!/usr/bin/env python3

from gym import utils from openai_ros.robot_envs import ur5e_env from gym.envs.registration import register from gym import error, spaces

import rospy

import rospy, sys import math import numpy as np from openai_ros.task_envs.task_commons import LoadYamlFileParamsTest from openai_ros.openai_ros_common import ROSLauncher import os

import rospkg from geometry_msgs.msg import Vector3 from trajectory_msgs.msg import JointTrajectory from trajectory_msgs.msg import JointTrajectoryPoint

import moveit_commander

from geometry_msgs.msg import Pose from copy import deepcopy from std_msgs.msg import Header from std_msgs.msg import String from std_msgs.msg import Float64

class UR5eMoveEnv(ur5e_env.UR5eEnv): def init(self):

    ros_ws_abspath = rospy.get_param("/ur5e_v0/ros_ws_abspath", None)
    assert ros_ws_abspath is not None, "You forgot to set ros_ws_abspath in your yaml file of your main RL script. Set ros_ws_abspath: \'YOUR/SIM_WS/PATH\'"
    assert os.path.exists(ros_ws_abspath), "The Simulation ROS Workspace path " + ros_ws_abspath + \
                                           " DOESNT exist, execute: mkdir -p " + ros_ws_abspath + \
                                           "/src;cd " + ros_ws_abspath + ";catkin_make"

    ROSLauncher(rospackage_name="ur5_robotiq_85_simulation",  #ZZW 20221205 launch the world file...
                launch_file_name="start_world.launch",#ORI: start_world.launch
                ros_ws_abspath=ros_ws_abspath)

    # Load Params from the desired Yaml file
    LoadYamlFileParamsTest(rospackage_name="openai_ros",
                           rel_path_from_package_to_file="src/openai_ros/task_envs/ur5e_move/config",
                           yaml_file_name="move.yaml")

    super(UR5eMoveEnv, self).__init__(ros_ws_abspath)
    #super(UR5eMoveEnv, self).__init__(ros_ws_abspath)
    print("ZZW 20221207 INIT FINISHEDDDDDD. @MOVE.PY")

    self.get_params()#zzw 20221214 key

    rospy.loginfo("Start Ur5ev0GetBallEnv INIT...")

    number_actions = rospy.get_param('/ur5e_v0/n_actions')
    self.action_space = spaces.Discrete(number_actions)

    # We set the reward range, which is not compulsory but here we do it.
    self.reward_range = (-np.inf, np.inf)

    self.movement_delta =rospy.get_param("/ur5e_v0/movement_delta")

    self.work_space_x_max = rospy.get_param("/ur5e_v0/work_space/x_max")
    self.work_space_x_min = rospy.get_param("/ur5e_v0/work_space/x_min")
    self.work_space_y_max = rospy.get_param("/ur5e_v0/work_space/y_max")
    self.work_space_y_min = rospy.get_param("/ur5e_v0/work_space/y_min")
    self.work_space_z_max = rospy.get_param("/ur5e_v0/work_space/z_max")
    self.work_space_z_min = rospy.get_param("/ur5e_v0/work_space/z_min")

    self.dec_obs = rospy.get_param("/ur5e_v0/number_decimals_precision_obs")

    self.acceptable_distance_to_ball = rospy.get_param("/ur5e_v0/acceptable_distance_to_ball")

    # We place the Maximum and minimum values of observations
    # TODO: Fill when get_observations is done.

    self.action_space = spaces.Discrete(self.n_actions)
    print("ZZW 20221211: N_ACTIONS IN MOVE.PY:", self.n_actions)#ZZW 20221211
    high = np.array([
        500,#x max
        500,#y max
        self.work_space_z_max,
        1,1,1])
    low = np.array([
        -500,
        -500,
        self.work_space_z_min,
        0,0,0])
    self.observation_space = spaces.Box(low, high)
    print("ZZW 20221207 INIT THE ENV? @MOVE.PY")

    rospy.logdebug("ACTION SPACES TYPE===>"+str(self.action_space))
    rospy.logdebug("OBSERVATION SPACES TYPE===>"+str(self.observation_space))

    # Rewards

    self.done_reward =rospy.get_param("/ur5e_v0/done_reward")
    self.closer_to_block_reward = rospy.get_param("/ur5e_v0/closer_to_block_reward")

    self.cumulated_steps = 0.0

    rospy.loginfo("END ur5e_v0GetBallEnv INIT...")
    # TODO: Remove when working
    """
    ur5e_env.UR5eEnv.__init__(
        self, control_type=self.control_type
    )
    """

    # Here we will add any init functions prior to starting the MyRobotEnv

    #ZZW 20221207: THIS SENTENSE IS KEY

def get_params(self):
    # get configuration parameters
    self.n_actions = rospy.get_param('/ur5e_v0/n_actions')
    print("ZZW 20221211: ALREADY GOT N_ACTIONS: MOVE.PY")
    self.movement_delta =rospy.get_param("/ur5e_v0/movement_delta")
    self.work_space_z_max = rospy.get_param("/ur5e_v0/work_space/z_max")
    self.work_space_z_min = rospy.get_param("/ur5e_v0/work_space/z_min")
    self.dec_obs = rospy.get_param("/ur5e_v0/number_decimals_precision_obs")
    self.acceptable_distance_to_ball = rospy.get_param("/ur5e_v0/acceptable_distance_to_ball")
    # Rewards:        
    self.done_reward =rospy.get_param("/ur5e_v0/done_reward")
    self.closer_to_block_reward = rospy.get_param("/ur5e_v0/closer_to_block_reward")

    self.cumulated_steps = 0.0

def _set_action(self, action):

    # Take action
    #ZZW 20221215 TWO actions, move up and down
    increment_vector = Vector3()
    if action == 0:  # increase Z
        rospy.loginfo("GO UP...")
        increment_vector.z = self.movement_delta
    elif action == 1:  # decrease Z
        rospy.loginfo("GO DOWN...")
        increment_vector.z = -1* self.movement_delta

    self.move_tip(  0,
                    0,
                    z=increment_vector.z) #ZZW 20221215 currently only move along Z-axis.

def _get_obs(self):

    #data = self.joints
    #       base_postion                base_velocity              pole angle                 pole velocity
    #obs = [round(data.position[1],1), round(data.velocity[1],1), round(data.position[0],1), round(data.velocity[0],1)]
     # Initialize the move_group API

    #ZZW 20221217 COPIED FROM SHADOW_TC temporary: 
    rospy.loginfo("Start Get Observation ==>")

    tcp_pose = self.get_tip_pose()

    # We dont add it to the observations because is not part of the robot
    self.ball_pose = self.get_ball_pose()

    # We activate the Finguer collision detection
    self.finger_collided_dict = self.get_fingers_colision(object_collision_name="cricket_ball__link")
    f1_collided = self.finger_collided_dict["f1"]
    f2_collided = self.finger_collided_dict["f2"]
    f3_collided = self.finger_collided_dict["f3"]

    observation = [ round(tcp_pose.position.x,self.dec_obs),
                    round(tcp_pose.position.y,self.dec_obs),
                    round(tcp_pose.position.z,self.dec_obs),
                    int(f1_collided),
                    int(f2_collided),
                    int(f3_collided)
                    ]

    rospy.loginfo("Observations ==>"+str(observation))
    rospy.loginfo("END Get Observation ==>")

    return observation

def _is_done(self, observations):
    tcp_pos = Vector3()
    tcp_pos.x = observations[0]
    tcp_pos.y = observations[1]
    tcp_pos.z = observations[2]

    # We check if all three finguers have collided with the ball
    #finguers_collided = observations[3] and observations[4] and observations[5]

    bool_is_inside_workspace = self.is_inside_workspace(tcp_pos)

    has_reached_the_ball = self.reached_ball(  tcp_pos,
                                                self.ball_pose.position,
                                                self.acceptable_distance_to_ball,
                                                )#finguers_collided

    done = has_reached_the_ball or not(bool_is_inside_workspace)

    rospy.loginfo("#### IS DONE ? ####")
    rospy.loginfo("Not bool_is_inside_workspace ?="+str(not(bool_is_inside_workspace)))
    rospy.loginfo("has_reached_the_ball ?="+str(has_reached_the_ball))
    rospy.loginfo("done ?="+str(done))
    rospy.loginfo("#### #### ####")

    return done

def _compute_reward(self, observations, done):
    """
    Gives more points for staying upright, gets data from given observations to avoid
    having different data than other previous functions
    :return:reward
    """
    rospy.loginfo("START _compute_reward")

    tcp_pos = Vector3()
    tcp_pos.x = observations[0]
    tcp_pos.y = observations[1]
    tcp_pos.z = observations[2]
    # We check if all three finguers have collided with the ball
    finguers_collided = observations[3] and observations[4] and observations[5]

    distance_from_ball = self.get_distance_from_point(self.ball_pose.position, tcp_pos)

    distance_difference =  distance_from_ball - self.previous_distance_from_ball

    if not done:

        # If there has been a decrease in the distance to the desired point, we reward it
        if distance_difference < 0.0:
            rospy.logerr("NOT ERROR: DECREASE IN DISTANCE GOOD")
            reward = self.closer_to_block_reward
        else:
            rospy.logerr("NOT ERROR: ENCREASE IN DISTANCE BAD")
            #reward = -1*self.closer_to_block_reward
            reward = 0.0

    else:

        has_reached_the_ball = self.reached_ball(   tcp_pos,
                                                    self.ball_pose.position,
                                                    self.acceptable_distance_to_ball,
                                                    finguers_collided)

        if has_reached_the_ball:
            reward = self.done_reward
        else:
            reward = -1*self.done_reward

    self.previous_distance_from_ball = distance_from_ball

    rospy.loginfo("reward=" + str(reward))
    self.cumulated_reward += reward
    rospy.loginfo("Cumulated_reward=" + str(self.cumulated_reward))
    self.cumulated_steps += 1
    rospy.loginfo("Cumulated_steps=" + str(self.cumulated_steps))

    return reward

def _init_env_variables(self):
    """
    Inits variables needed to be initialised each time we reset at the start
    of an episode.
    :return:
    """
    rospy.loginfo("START TaskEnv _init_env_variables")
    # For Info Purposes
    self.cumulated_reward = 0.0

    self.ball_pose = self.get_ball_pose()
    tcp_pose = self.get_tip_pose()
    rospy.loginfo("TCP POSE ===>"+str(tcp_pose))
    print("BALL and Arm Position: 20231031",self.ball_pose.position, tcp_pose.position)
    self.previous_distance_from_ball = self.get_distance_from_point(self.ball_pose.position, tcp_pose.position)

    rospy.loginfo("END TaskEnv _init_env_variables")

def _set_init_pose(self):
    """
    Sets joints to initial position [0,0,0]
    :return:
    """

    rospy.loginfo("START _set_init_pose...")
    # We set the angles to zero of the limb
    self.reset_scene()

    rospy.loginfo("END _set_init_pose...")
    return True

# Internal TaskEnv Methods

def reached_ball(self,tcp_position, ball_position, minimum_distance, finguers_collided):
    """
    Return true if the distance from TCP position to the ball position is 
    lower than the minimum_distance and all three finguers are touching the ball.
    """

    distance_from_ball = self.get_distance_from_point(tcp_position, ball_position)

    distance_to_ball_ok = distance_from_ball < minimum_distance

    reached_ball_b = distance_to_ball_ok and finguers_collided

    rospy.loginfo("###### REACHED BLOCK ? ######")
    rospy.loginfo("distance_from_ball==>"+str(distance_from_ball))
    rospy.loginfo("distance_to_ball_ok==>"+str(distance_to_ball_ok))
    rospy.loginfo("reached_ball_b==>"+str(reached_ball_b))
    rospy.loginfo("finguers_collided==>"+str(finguers_collided))
    rospy.loginfo("############")

    return reached_ball_b

def get_distance_from_point(self, pstart, p_end):
    """
    Given a Vector3 Object, get distance from current position
    :param p_end:
    :return:
    """
    a = numpy.array((pstart.x, pstart.y, pstart.z))
    b = numpy.array((p_end.x, p_end.y, p_end.z))

    distance = numpy.linalg.norm(a - b)

    return distance

def is_inside_workspace(self,current_position):
    """
    Check if the shadow_tc is inside the Workspace defined
    """
    is_inside = False

    rospy.loginfo("##### INSIDE WORK SPACE? #######")
    rospy.loginfo("XYZ current_position"+str(current_position))
    rospy.loginfo("work_space_x_max"+str(self.work_space_x_max)+",work_space_x_min="+str(self.work_space_x_min))
    rospy.loginfo("work_space_y_max"+str(self.work_space_y_max)+",work_space_y_min="+str(self.work_space_y_min))
    rospy.loginfo("work_space_z_max"+str(self.work_space_z_max)+",work_space_z_min="+str(self.work_space_z_min))
    rospy.loginfo("############")

    if current_position.x > self.work_space_x_min and current_position.x <= self.work_space_x_max:
        if current_position.y > self.work_space_y_min and current_position.y <= self.work_space_y_max:
            if current_position.z > self.work_space_z_min and current_position.z <= self.work_space_z_max:
                is_inside = True

    return is_inside
qgallouedec commented 1 year ago

Please make sure to format your code with codeblocks next time

what if you remove model.learn?

ZZWang21 commented 1 year ago

Thanks.

I have removed model.learn. but it seems the same. see the attached pic:

image

qgallouedec commented 1 year ago

Can you check on which line the execution stops? While in env.reset or model.predict?

As a general advice, don't use image while asking question. As your are a beginner, I highly recommend you to carefully read https://stackoverflow.com/help/minimal-reproducible-example

ZZWang21 commented 1 year ago

Can you check on which line the execution stops? While in env.reset or model.predict?

As a general advice, don't use image while asking question. As your are a beginner, I highly recommend you to carefully read https://stackoverflow.com/help/minimal-reproducible-example

Hi, thanks for the advice.

The execution stops in env.reset .

ZZWang21 commented 1 year ago

I checked the model, printed on my terminal, compared with the classic cartpole example, it seems similar. I am not sure where is the problem. the model is just not moving. and there is nothing shown on the terminal.

qgallouedec commented 1 year ago

So the problem is not with stable-baselines3. I advise you to ask for help on the RL Discord server. You will probably find people who are familiar with gym integration with ROS.

qgallouedec commented 1 year ago

Have checked the env with sb3 env checker? See https://stable-baselines3.readthedocs.io/en/master/guide/custom_env.html

ZZWang21 commented 1 year ago

Have checked the env with sb3 env checker? See https://stable-baselines3.readthedocs.io/en/master/guide/custom_env.html

I got information like this:

""/home/walter/miniconda3/envs/Robot_ur5e_ws/lib/python3.7/site-packages/stable_baselines3/common/env_checker.py", line 247, in check_env ), "Your environment must inherit from the gym.Env class cf https://github.com/openai/gym/blob/master/gym/core.py" AssertionError: Your environment must inherit from the gym.Env class cf https://github.com/openai/gym/blob/master/gym/core.py "

ZZWang21 commented 1 year ago

I moved the "check_env" just after the env was defined. it shows nothing now, but it stuck there. The following codes are not running.

"env = StartOpenAI_ROS_Environment(environment_name) print("TTTTTHE 2nd ENV IS:::::::::", env) #this line was shown check_env(env) outdir = os.path.dirname(os.path.abspath(file)) env = Monitor(env, outdir) #ZZW 20230306 env = DummyVecEnv([lambda: env]) print("THE ENV_CHECK is over")" #this line was not shown.

qgallouedec commented 1 year ago

Next time, please make sure to use the markdown code blocks for both code and stack traces.

Can you just make your env and reset it without using DummyVecEnv?

env = StartOpenAI_ROS_Environment(environment_name)
check_env(env)
env.reset()
ZZWang21 commented 1 year ago
env.reset()

It cannot make it through the "check_env(env)", stuck again.

My code: "env = StartOpenAI_ROS_Environment(environment_name) print("TTTTTHE 2nd ENV IS:::::::::", env) check_env(env) print("AFTER THE CHECK_ENV", env) env.reset()"

it showed:" TTTTTHE 2nd ENV IS::::::::: <TimeLimit<UR5eMoveEnv>> "

qgallouedec commented 1 year ago

:warning: Once again, please make sure to use the markdown code blocks for both code and stack traces.

Ok, then it's definitely something wrong with your environment. Use debug tools to understand what's going on, and open an issue in the open_ros repository.