RIVeR-Lab / tentabot

Tentabot: Navigation Framework for Mobile Robots by Evaluating Motion Primitives (Tentacles)

Errors in creating a training environment #2

Closed by AbelSyx 1 year ago

AbelSyx commented 1 year ago

Hi @akmandor, I admire your work very much. When I use the tentabot_framework.launch file to train with the Turtlebot3 configuration files, I get the following errors:

ROS_MASTER_URI=http://localhost:11311
process[tentabot_server-9]: started with pid [12818]
tentabot_framework_launch:: __main__ -> Launched Tentabot Server!
tentabot_server::main -> world_frame_name: world
[ WARN] [1677553132.884454270]: couldn't register subscriber on topic [/]
tentabot_server::main -> n_goal: 1
Welcome to Tentabot Navigation Simulation! I hope you'll enjoy the experience...
Sorting tentacles by y-axis...
Tentabot::initialize -> Completed!
tentabot_framework_launch:: __main__ -> drl_service_flag: True
tentabot_framework_launch:: __main__ -> mode: training
started roslaunch server http://server-System-Product-Name:40839/

SUMMARY
========

PARAMETERS
 * /rosdistro: noetic
 * /rosversion: 1.15.15

NODES
  /
    tentabot_drl_training (tentabot/tentabot_drl_training.py)

ROS_MASTER_URI=http://localhost:11311
process[tentabot_drl_training-10]: started with pid [12837]
tentabot_framework_launch:: __main__ -> Launched Tentabot-DRL: training!
tentabot_drl_training::__main__ -> mode: training
tentabot_drl_training::__main__ -> deep_learning_algorithm: PPO
tentabot_drl_training::__main__ -> motion_planning_algorithm: tentabot_drl
tentabot_drl_training::__main__ -> observation_space_type: Tentabot_1DCNN_FC
tentabot_drl_training::__main__ -> world_name: training_garden_static_1
tentabot_drl_training::__main__ -> task_and_robot_environment_name: TurtleBot3tentabot_drl-v001
tentabot_drl_training::__main__ -> n_robot: 1
tentabot_drl_training::__main__ -> data_path: dataset/drl/testing/turtlebot3/
tentabot_drl_training::__main__ -> learning_rate: 0.0002
tentabot_drl_training::__main__ -> n_steps: 1000
tentabot_drl_training::__main__ -> batch_size: 50
tentabot_drl_training::__main__ -> ent_coef: 0.001
tentabot_drl_training::__main__ -> training_timesteps: 2000
tentabot_drl_training::__main__ -> max_episode_steps: 2000
tentabot_drl_training::__main__ -> initial_training_path: 
tentabot_drl_training::__main__ -> training_checkpoint_freq: 1000
tentabot_drl_training::__main__ -> plot_title: Learning Curve
tentabot_drl_training::__main__ -> plot_moving_average_window_size_timesteps: 20
tentabot_drl_training::__main__ -> plot_moving_average_window_size_episodes: 5
tentabot_drl_training::write_data -> Data is written in /home/server/workspace/tentabot_ws/src/tentabot/dataset/drl/testing/turtlebot3/20230228_105856_PPO_tentabot/training_log.csv
[WARN] [1677553136.405351, 6933.280000]: Env: TurtleBot3tentabot_drl-v001 will be imported
[WARN] [1677553136.406210, 6933.281000]: Something Went wrong in the register
Traceback (most recent call last):
  File "/home/server/workspace/tentabot_ws/src/tentabot/scripts/tentabot_drl/tentabot_drl_training.py", line 194, in <module>
    env = Monitor(env, data_folder_path)
  File "/home/server/python_env/tentabot/lib/python3.8/site-packages/stable_baselines3/common/monitor.py", line 47, in __init__
    header={"t_start": self.t_start, "env_id": env.spec and env.spec.id},
AttributeError: 'NoneType' object has no attribute 'spec'
[tentabot_drl_training-10] process has died [pid 12837, exit code 1, cmd /home/server/workspace/tentabot_ws/src/tentabot/scripts/tentabot_drl/tentabot_drl_training.py __name:=tentabot_drl_training __log:=/home/server/.ros/log/d18d61da-b713-11ed-9a76-d573bd68246a/tentabot_drl_training-10.log].
log file: /home/server/.ros/log/d18d61da-b713-11ed-9a76-d573bd68246a/tentabot_drl_training-10*.log
all processes on machine have died, roslaunch will exit

Is this caused by the gym version, or is there a problem with my configuration file? Looking forward to your reply, thank you!

akmandor commented 1 year ago

Hi @AbelSyx! First of all, thank you for your interest in our work.

To check whether something is broken in the current implementation, I created a new catkin workspace on my computer (Ubuntu 20.04, ROS Noetic) and installed all required packages by following the MANUAL_INSTALLATION. I tested both heuristic navigation (the default) and DRL training. Both are working on my side!

To run the drl training from the default configuration, you need to update the configuration file in config_tentabot_server_turtlebot3.yaml and set these values:

Since you asked about versions, let me also provide the version information of some important Python packages that I tested with:

I think you also asked about the trajectory data. It is under the dataset/trajectory_sampling folder. You can set its location as trajectory_data_path: "dataset/trajectory_sampling/turtlebot3/20220323_142544/" in the config_tentabot_server_turtlebot3 configuration file. If it is not set, then the program creates it automatically, based on the user-defined "## Trajectory Sampling Parameters".
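The fallback described above (use the configured path if present, otherwise generate one) can be sketched as follows. This is a minimal illustration, not the tentabot implementation: the function name `resolve_trajectory_data_path`, its arguments, and the timestamped folder naming are assumptions inferred from the example path in this comment.

```python
from datetime import datetime


def resolve_trajectory_data_path(config, robot_name="turtlebot3",
                                 root="dataset/trajectory_sampling"):
    """Return (path, generated) for the trajectory data folder.

    Hypothetical helper: `config` is a plain dict standing in for the
    parsed YAML configuration file.
    """
    path = config.get("trajectory_data_path", "")
    if path:
        # The user pointed at existing trajectory data: reuse it.
        return path, False
    # Nothing configured: build a new timestamped folder name (assumed
    # convention, modelled on "20220323_142544" from the example path).
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"{root}/{robot_name}/{stamp}/", True


if __name__ == "__main__":
    cfg = {"trajectory_data_path":
           "dataset/trajectory_sampling/turtlebot3/20220323_142544/"}
    print(resolve_trajectory_data_path(cfg))
    print(resolve_trajectory_data_path({}))
```

Returning the `generated` flag alongside the path lets the caller decide whether trajectories still need to be sampled before training starts.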

AbelSyx commented 1 year ago

Thank you for your reply. I have solved this problem: it was caused by an incorrect value for task_and_robot_environment_name. Now I'm adding lidar and realsense models to our robot model in Gazebo to generate trajectories and train. Thank you again!
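For readers who hit the same traceback: the AttributeError appears because the environment object is None by the time `Monitor` reads `env.spec`, which is consistent with the "Something Went wrong in the register" warning in the log. The sketch below reproduces that failure mode without ROS or Gym and shows an early check that reports the bad name directly. All names in it (`REGISTERED_ENVS`, `make_env`, `make_env_checked`, `ExampleRobot_task-v0`) are hypothetical and are not part of tentabot, openai_ros, or stable_baselines3.

```python
from types import SimpleNamespace

# Hypothetical set of registered environment names (not the real tentabot list).
REGISTERED_ENVS = {"ExampleRobot_task-v0"}


def make_env(name):
    """Mimic a factory that returns None when the name is not registered."""
    if name not in REGISTERED_ENVS:
        return None
    # Stand-in for a real environment: only the `spec.id` attribute is modelled.
    return SimpleNamespace(spec=SimpleNamespace(id=name))


def make_env_checked(name):
    """Fail early with a readable message instead of a later AttributeError."""
    env = make_env(name)
    if env is None:
        known = ", ".join(sorted(REGISTERED_ENVS))
        raise ValueError(
            f"Unknown environment name {name!r}; known names: {known}"
        )
    return env


if __name__ == "__main__":
    env = make_env("ExampleRobot_task-v001")  # misspelled name
    try:
        env.spec  # same access pattern that fails inside Monitor
    except AttributeError as err:
        print("unchecked:", err)
    try:
        make_env_checked("ExampleRobot_task-v001")
    except ValueError as err:
        print("checked:", err)
```

A check of this kind, placed before the environment is wrapped, would point at the configuration value rather than at the monitoring code.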