huawei-noah / SMARTS

Scalable Multi-Agent RL Training School for Autonomous Driving
MIT License

Segmentation fault #2097

Closed knightcalvert closed 10 months ago

knightcalvert commented 10 months ago

High Level Description

I tried to apply the PyMARL algorithm to SMARTS, but a segmentation fault occurred. I don't know where the bug is or what I can do to fix it.

Version

1.4.0 (the latest main branch)

Operating System

Windows 10, WSL 2, Ubuntu 20

Problems

(.mycode) root@DESKTOP-M65JMJ7:/mnt/d/xudu/mycode# python pymarl_smarts/src/main.py --config=qmix --env-config=smarts
[DEBUG 16:16:55] git.cmd Popen(['git', 'version'], cwd=/mnt/d/xudu/mycode, universal_newlines=False, shell=None, istream=None)
[DEBUG 16:16:55] git.cmd Popen(['git', 'version'], cwd=/mnt/d/xudu/mycode, universal_newlines=False, shell=None, istream=None)
[DEBUG 16:16:55] git.cmd Popen(['git', 'diff', '--cached', '--abbrev=40', '--full-index', '--raw'], cwd=/mnt/d/xudu/mycode/pymarl_smarts, universal_newlines=False, shell=None, istream=None)
[DEBUG 16:16:55] git.cmd Popen(['git', 'diff', '--abbrev=40', '--full-index', '--raw'], cwd=/mnt/d/xudu/mycode/pymarl_smarts, universal_newlines=False, shell=None, istream=None)
[DEBUG 16:16:56] git.cmd Popen(['git', 'cat-file', '--batch-check'], cwd=/mnt/d/xudu/mycode/pymarl_smarts, universal_newlines=False, shell=None, istream=)
[DEBUG 16:16:56] git.cmd Popen(['git', 'diff', '--cached', '--abbrev=40', '--full-index', '--raw'], cwd=/mnt/d/xudu/mycode/pymarl_smarts, universal_newlines=False, shell=None, istream=None)
[DEBUG 16:16:56] git.cmd Popen(['git', 'diff', '--abbrev=40', '--full-index', '--raw'], cwd=/mnt/d/xudu/mycode/pymarl_smarts, universal_newlines=False, shell=None, istream=None)
[DEBUG 16:16:56] git.cmd Popen(['git', 'cat-file', '--batch-check'], cwd=/mnt/d/xudu/mycode/pymarl_smarts, universal_newlines=False, shell=None, istream=)
[DEBUG 16:16:56] git.cmd Popen(['git', 'diff', '--cached', '--abbrev=40', '--full-index', '--raw'], cwd=/mnt/d/xudu/mycode/pymarl_smarts, universal_newlines=False, shell=None, istream=None)
[DEBUG 16:16:56] git.cmd Popen(['git', 'diff', '--abbrev=40', '--full-index', '--raw'], cwd=/mnt/d/xudu/mycode/pymarl_smarts, universal_newlines=False, shell=None, istream=None)
[DEBUG 16:16:56] git.cmd Popen(['git', 'cat-file', '--batch-check'], cwd=/mnt/d/xudu/mycode/pymarl_smarts, universal_newlines=False, shell=None, istream=)
pymarl_smarts/src/main.py:80: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  config_dict = yaml.load(f)
pymarl_smarts/src/main.py:58: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
  if isinstance(v, collections.Mapping):
[INFO 16:16:57] root Saving to FileStorageObserver in results/sacred.
/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/sacred/observers/file_storage.py:29: DeprecationWarning: FileStorageObserver.create(...) is deprecated. Please use FileStorageObserver(...) instead.
  warnings.warn(
[DEBUG 16:16:57] pymarl Using capture mode "no"
[INFO 16:16:57] pymarl Running command 'my_main'
[INFO 16:16:58] pymarl Started run with ID "94"
[DEBUG 16:16:58] pymarl Starting Heartbeat
[DEBUG 16:16:58] my_main Started
[INFO 16:16:58] my_main Experiment Parameters:
[INFO 16:16:58] my_main

{   'action_selector': 'epsilon_greedy',
    'agent': 'rnn',
    'agent_output_type': 'q',
    'batch_size': 32,
    'batch_size_run': 1,
    'buffer_cpu_only': True,
    'buffer_size': 5000,
    'checkpoint_path': '',
    'critic_lr': 0.0005,
    'double_q': True,
    'env': 'smarts',
    'env_args': {   'agent_num': 2,
                    'continuing_episode': False,
                    'episode_limit': 10,
                    'headless': True,
                    'scenarios': '/mnt/d/xudu/mycode/SMARTS/scenarios/sumo/intersections/4lane',
                    'seed': 850398191},
    'epsilon_anneal_time': 50000,
    'epsilon_finish': 0.05,
    'epsilon_start': 1.0,
    'evaluate': False,
    'gamma': 0.99,
    'grad_norm_clip': 10,
    'hypernet_embed': 64,
    'hypernet_layers': 2,
    'label': 'default_label',
    'learner': 'q_learner',
    'learner_log_interval': 100,
    'load_step': 0,
    'local_results_path': 'results',
    'log_interval': 100,
    'lr': 0.0005,
    'mac': 'basic_mac',
    'mixer': 'qmix',
    'mixing_embed_dim': 32,
    'name': 'qmix',
    'obs_agent_id': True,
    'obs_last_action': True,
    'optim_alpha': 0.99,
    'optim_eps': 1e-05,
    'repeat_id': 1,
    'rnn_hidden_dim': 64,
    'runner': 'episode',
    'runner_log_interval': 100,
    'save_model': False,
    'save_model_interval': 2000000,
    'save_replay': False,
    'seed': 850398191,
    't_max': 20500,
    'target_update_interval': 200,
    'test_greedy': True,
    'test_interval': 100,
    'test_nepisode': 32,
    'use_cuda': True,
    'use_tensorboard': False}

{'agent_num': 2, 'continuing_episode': False, 'scenarios': '/mnt/d/xudu/mycode/SMARTS/scenarios/sumo/intersections/4lane', 'episode_limit': 10, 'seed': 850398191, 'headless': True}
[INFO 16:17:04] smarts.core.configuration Using engine configuration from: /mnt/d/xudu/mycode/SMARTS/smarts/engine.ini
[DEBUG 16:17:04] AgentManager Tearing down AgentManager
[DEBUG 16:17:04] SumoTrafficSimulation Tearing down SUMO traffic sim SumoTrafficSim(
  _scenario=None,
  _time_resolution=0.1,
  _headless=True,
  _cumulative_sim_seconds=0,
  _non_sumo_vehicle_ids=set(),
  _sumo_vehicle_ids=set(),
  _is_setup=False,
  _last_trigger_time=-1000000,
  _num_dynamic_ids_used=0,
  _traci_conn=None
)
[DEBUG 16:17:04] SumoTrafficSimulation Nothing to teardown
[DEBUG 16:17:05] SumoTrafficSimulation Setting up SumoTrafficSim SumoTrafficSim(
  _scenario=None,
  _time_resolution=0.1,
  _headless=True,
  _cumulative_sim_seconds=0,
  _non_sumo_vehicle_ids=set(),
  _sumo_vehicle_ids=set(),
  _is_setup=False,
  _last_trigger_time=-1000000,
  _num_dynamic_ids_used=0,
  _traci_conn=None
)
[DEBUG 16:17:05] root Starting sumo process: ['/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/sumo/bin/sumo', '--remote-port=41219', '--num-clients=1', '--net-file=/mnt/d/xudu/mycode/SMARTS/scenarios/sumo/intersections/4lane/map.net.xml', '--quit-on-end', '--log=/root/.smarts/_sumo_run_logs/sumo-a7db2483', '--error-log=/root/.smarts/_sumo_run_logs/sumo-a7db2483', '--no-step-log', '--no-warnings=1', '--seed=856418768', '--time-to-teleport=-1', '--collision.check-junctions=true', '--collision.action=none', '--lanechange.duration=3.0', '--step-length=0.100000', '--default.action-step-length=0.100000', '--begin=0', '--end=31536000', '--start']
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
Could not connect to TraCI server at localhost:41219 [Errno 111] Connection refused
Retrying in 0.05 seconds
[DEBUG 16:17:05] SumoTrafficSimulation Finished starting sumo process
[DEBUG 16:17:05] VehicleIndex Tearing down vehicle ids: set()
[DEBUG 16:17:06] smarts.core.sensors.local_sensor_resolver "serial run" took: 0.007391ms
[DEBUG 16:17:06] smarts.core.sensors.local_sensor_resolver "rendering" took: 0.000954ms
[DEBUG 16:17:06] smarts.core.sensors.local_sensor_resolver "merging observations" took: 0.000954ms
[DEBUG 16:17:06] VehicleIndex Tearing down vehicle ids: set()
[DEBUG 16:17:06] smarts.core.sensors.local_sensor_resolver "serial run" took: 0.008106ms
[DEBUG 16:17:06] smarts.core.sensors.local_sensor_resolver "rendering" took: 0.002623ms
[DEBUG 16:17:06] smarts.core.sensors.local_sensor_resolver "merging observations" took: 0.003338ms
[DEBUG 16:17:06] VehicleIndex Tearing down vehicle ids: set()
[DEBUG 16:17:06] VehicleIndex Tearing down vehicle ids: set()
[DEBUG 16:17:06] Vehicle replacing existing trip_meter_sensor on vehicle Agent 1
[DEBUG 16:17:06] Vehicle replacing existing driven_path_sensor on vehicle Agent 1
[DEBUG 16:17:06] Vehicle replacing existing accelerometer_sensor on vehicle Agent 1
[DEBUG 16:17:06] Vehicle replacing existing lane_position_sensor on vehicle Agent 1
[DEBUG 16:17:06] Vehicle replacing existing road_waypoints_sensor on vehicle Agent 1
:display:x11display(error): Could not open display "172.20.224.1:0".
Attempt to register type x11GraphicsPipe more than once!
Attempt to register type x11GraphicsWindow more than once!
:display:x11display(error): Could not open display "172.20.224.1:0".
Fatal Python error: Segmentation fault

Thread 0x00007fe3f4fb9700 (most recent call first):
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/sacred/observers/file_storage.py", line 219 in save_json
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/sacred/observers/file_storage.py", line 328 in log_metrics
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/sacred/run.py", line 420 in _safe_call
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/sacred/run.py", line 351 in _emit_heartbeat
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/sacred/utils.py", line 736 in run
  File "/usr/lib/python3.8/threading.py", line 932 in _bootstrap_inner
  File "/usr/lib/python3.8/threading.py", line 890 in _bootstrap

Current thread 0x00007fe430a6f740 (most recent call first):
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/direct/showbase/ShowBase.py", line 700 in makeAllPipes
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/direct/showbase/ShowBase.py", line 782 in openWindow
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/direct/showbase/ShowBase.py", line 1061 in openMainWindow
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/direct/showbase/ShowBase.py", line 1026 in openDefaultWindow
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/direct/showbase/ShowBase.py", line 341 in __init__
  File "/mnt/d/xudu/mycode/SMARTS/smarts/p3d/renderer.py", line 132 in __init__
  File "/mnt/d/xudu/mycode/SMARTS/smarts/p3d/renderer.py", line 121 in __new__
  File "/mnt/d/xudu/mycode/SMARTS/smarts/p3d/renderer.py", line 294 in __init__
  File "/mnt/d/xudu/mycode/SMARTS/smarts/core/smarts.py", line 997 in renderer
  File "/mnt/d/xudu/mycode/SMARTS/smarts/core/vehicle.py", line 412 in attach_sensors_to_vehicle
  File "/mnt/d/xudu/mycode/SMARTS/smarts/core/vehicle_index.py", line 719 in _enfranchise_agent
  File "/mnt/d/xudu/mycode/SMARTS/smarts/core/utils/cache.py", line 136 in wrapper
  File "/mnt/d/xudu/mycode/SMARTS/smarts/core/vehicle_index.py", line 691 in build_agent_vehicle
  File "/mnt/d/xudu/mycode/SMARTS/smarts/core/actor_capture_manager.py", line 66 in _make_new_vehicle
  File "/mnt/d/xudu/mycode/SMARTS/smarts/core/trap_manager.py", line 322 in step
  File "/mnt/d/xudu/mycode/SMARTS/smarts/core/smarts.py", line 338 in _step
  File "/mnt/d/xudu/mycode/SMARTS/smarts/core/smarts.py", line 270 in step
  File "/mnt/d/xudu/mycode/SMARTS/smarts/core/smarts.py", line 527 in _reset
  File "/mnt/d/xudu/mycode/SMARTS/smarts/core/smarts.py", line 464 in reset
  File "/mnt/d/xudu/mycode/SMARTS/smarts/env/gymnasium/hiway_env_v1.py", line 348 in reset
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/gymnasium/wrappers/order_enforcing.py", line 42 in reset
  File "/mnt/d/xudu/mycode/pymarl_smarts/src/envs/smarts_env.py", line 338 in __init__
  File "/mnt/d/xudu/mycode/pymarl_smarts/src/envs/__init__.py", line 8 in env_fn
  File "/mnt/d/xudu/mycode/pymarl_smarts/src/runners/episode_runner.py", line 15 in __init__
  File "/mnt/d/xudu/mycode/pymarl_smarts/src/run.py", line 79 in run_sequential
  File "/mnt/d/xudu/mycode/pymarl_smarts/src/run.py", line 48 in run
  File "pymarl_smarts/src/main.py", line 36 in my_main
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/sacred/config/captured_function.py", line 42 in captured_function
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/sacred/run.py", line 238 in __call__
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/sacred/experiment.py", line 276 in run
  File "/mnt/d/xudu/mycode/.mycode/lib/python3.8/site-packages/sacred/experiment.py", line 312 in run_commandline
  File "pymarl_smarts/src/main.py", line 99 in <module>
Segmentation fault

Gamenot commented 10 months ago

Hello @knightcalvert, apologies for the late reply. It looks like you are using threading.

Unfortunately, we do not currently support threading (same-process memory access). If you want parallelised code, you will need to use Python's multiprocessing or another multi-process system such as Ray.
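As a rough illustration of the multi-process approach, here is a minimal sketch using only the standard library. `DummyEnv`, `env_worker`, and `run_episode` are hypothetical names made up for this example, and `DummyEnv` is a stand-in rather than the SMARTS API. The idea is simply that the environment is created and stepped inside its own process, and the parent talks to it over a pipe.

```python
import multiprocessing as mp


class DummyEnv:
    """Hypothetical stand-in for a simulator environment (not the SMARTS API)."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += action
        return self.t


def env_worker(conn):
    """Own the environment inside a separate process and serve commands."""
    env = DummyEnv()  # created in the child, so the parent never touches it
    while True:
        cmd, data = conn.recv()
        if cmd == "reset":
            conn.send(env.reset())
        elif cmd == "step":
            conn.send(env.step(data))
        elif cmd == "close":
            conn.close()
            break


def run_episode(actions):
    """Drive the worker from the parent process and collect observations."""
    parent_conn, child_conn = mp.Pipe()
    proc = mp.Process(target=env_worker, args=(child_conn,))
    proc.start()
    parent_conn.send(("reset", None))
    observations = [parent_conn.recv()]
    for action in actions:
        parent_conn.send(("step", action))
        observations.append(parent_conn.recv())
    parent_conn.send(("close", None))
    proc.join()
    return observations


if __name__ == "__main__":
    print(run_episode([1, 2, 3]))
```

In a real setup the worker would construct the actual environment instead of `DummyEnv`, and one worker would be started per parallel environment.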

knightcalvert commented 10 months ago

Thank you for your help. I changed the runner from episode to parallel, and the problem seems to have disappeared. Could the threading come from the code import faulthandler; faulthandler.enable(), which I used to check which line caused the segmentation fault?
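For what it is worth, `faulthandler.enable()` installs signal handlers and, by default, dumps the traceback of every thread that already exists. It does not start threads of its own. Judging by its frames, the extra thread in the trace above appears to be Sacred's heartbeat thread. A minimal sketch of this reporting behaviour, where the function name `dump_with_background_thread` is made up for the example and an idle thread stands in for the heartbeat:

```python
import faulthandler
import tempfile
import threading


def dump_with_background_thread():
    """Return the text faulthandler writes while one extra thread is alive."""
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait)  # idle thread, like a heartbeat
    worker.start()
    try:
        with tempfile.TemporaryFile(mode="w+") as f:
            # faulthandler writes to a real file descriptor, hence the temp file
            faulthandler.dump_traceback(file=f, all_threads=True)
            f.seek(0)
            return f.read()
    finally:
        stop.set()
        worker.join()


if __name__ == "__main__":
    print(dump_with_background_thread())
```

The dump lists the idle thread under a "Thread 0x..." header and the caller under "Current thread 0x...", the same layout as in the crash report.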