MingfeiCheng / BehAVExplor

BehAVExplor: Behavior Diversity Guided Testing for Autonomous Driving Systems (ISSTA 2023)

Freezed after around 2 hours #9

Closed YuqiHuai closed 6 months ago

YuqiHuai commented 6 months ago

Hi Mingfei,

After about 2 hours of running scenario 1, it froze. I have attached the terminal log below. Since SVL has been discontinued, I was using the SVL client you provided, connected to a local cloud.

2024-03-16 05:13:48.237 | INFO     | common.simulator:load_map:64 - [Simulator] Loaded map: SanFrancisco_correct
2024-03-16 05:13:51.918 | INFO     | common.simulator:init_environment:246 - Load environment - Finish
2024-03-16 05:13:52.096 | INFO     | lgsvl.dreamview.dreamview:setup_apollo:314 - {'Camera': False, 'Canbus': False, 'Control': True, 'GPS': False, 'Guardian': False, 'Localization': True, 'Perception': False, 'Planning': True, 'Prediction': True, 'Radar': False, 'Recorder': False, 'Routing': True, 'Storytelling': True, 'Third Party Perception': False, 'Traffic Light': False, 'Transform': True, 'Velodyne': False}
2024-03-16 05:13:52.099 | INFO     | lgsvl.dreamview.dreamview:enable_apollo:283 - Starting Localization module...
2024-03-16 05:13:52.099 | INFO     | lgsvl.dreamview.dreamview:enable_apollo:283 - Starting Transform module...
2024-03-16 05:13:52.099 | INFO     | lgsvl.dreamview.dreamview:enable_apollo:283 - Starting Routing module...
2024-03-16 05:13:52.099 | INFO     | lgsvl.dreamview.dreamview:enable_apollo:283 - Starting Prediction module...
2024-03-16 05:13:52.100 | INFO     | lgsvl.dreamview.dreamview:enable_apollo:283 - Starting Planning module...
2024-03-16 05:13:52.100 | INFO     | lgsvl.dreamview.dreamview:enable_apollo:283 - Starting Control module...
2024-03-16 05:13:52.100 | INFO     | lgsvl.dreamview.dreamview:enable_apollo:283 - Starting Storytelling module...
2024-03-16 05:13:52.152 | INFO     | lgsvl.dreamview.dreamview:on_control_received:332 - Control message received
2024-03-16 05:13:54.193 | INFO     | common.simulator:run:316 - [Simulator] Set Apollo (EGO) destination: -435.52519353421843,410.3606234656769
nohup: appending output to 'nohup.out'
2024-03-16 05:13:57.365 | INFO     | lgsvl.simulator:run_custom:114 - [PythonAPI] simulator.run_custom
2024-03-16 05:13:57.365 | INFO     | lgsvl.remote:command_run:118 - [PythonAPI] Start Running
2024-03-16 05:14:28.533 | INFO     | common.simulator:run:457 - simulation finished, total frames: 301
2024-03-16 05:14:33.210 | INFO     | common.simulator:run:477 - [Simulator] Restart all simulator modules in case high delays.

Have you seen this issue before, or is this likely an SVL issue caused by not using the official cloud?
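A freeze like this one shows up in the log as a final line with no successor. Below is a minimal sketch of a watchdog check, assuming the log format shown above (timestamp, level, and location separated by `|`, then ` - ` and the message). `parse_line` and `is_stalled` are hypothetical helper names, not part of BehAVExplor, and the silence threshold is arbitrary.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# 2024-03-16 05:14:33.210 | INFO     | common.simulator:run:477 - message
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \| (\w+)\s*\| (\S+) - (.*)$"
)


def parse_line(line):
    """Return (timestamp, level, location, message), or None for other lines."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
    return timestamp, match.group(2), match.group(3), match.group(4)


def is_stalled(last_line, now, max_silence):
    """True if the last parsed log line is older than max_silence."""
    parsed = parse_line(last_line)
    return parsed is not None and now - parsed[0] > max_silence
```

A supervising script could call `is_stalled(last_line, datetime.now(), timedelta(minutes=10))` periodically and alert or restart the run when it returns `True`.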

YuqiHuai commented 6 months ago

I just terminated the process after submitting this issue and noticed the following traceback:

  File "/apollo/./bazel-bin/BehAVExplor/main.runfiles/apollo/BehAVExplor/main.py", line 150, in <module>
    fuzzer.loop(int(params['total_test_time'])) # seconds
  File "/apollo/./bazel-bin/BehAVExplor/main.runfiles/apollo/BehAVExplor/main.py", line 119, in loop
    scenario_recorder, scenario_id = runner.run(scenario_obj)
  File "/apollo/BehAVExplor/common/runner.py", line 52, in run
    sim_recorder = self.sim.run(scenario_obj, scenario_id, self.record_apollo_path)
  File "/apollo/BehAVExplor/common/simulator.py", line 478, in run
    utils.close_modules(dv, self.modules)
  File "/apollo/BehAVExplor/common/utils.py", line 13, in close_modules
    module_status = dv.get_module_status()
  File "/home/yuqi/.local/lib/python3.6/site-packages/lgsvl/dreamview/dreamview.py", line 221, in get_module_status
    self.ws.recv()

So the actual problem is that Dreamview's websocket is broken: the traceback ends in self.ws.recv(). I have seen this before when communicating frequently with Dreamview over the socket. Does BehAVExplor expect me to restart Dreamview manually when this happens?
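A hang like the one in this traceback can be bounded by putting a timeout around the blocking call. The sketch below is a generic helper that uses only the standard library. `call_with_timeout` and `RecvTimeout` are hypothetical names, not part of BehAVExplor or lgsvl, and the sketch assumes it is acceptable to abandon the stuck call in a daemon thread.

```python
import queue
import threading


class RecvTimeout(Exception):
    """Raised when the wrapped blocking call does not return in time."""


def call_with_timeout(blocking_call, timeout_s):
    """Run blocking_call() in a daemon thread; give up after timeout_s seconds."""
    result = queue.Queue(maxsize=1)

    def worker():
        try:
            result.put(("ok", blocking_call()))
        except Exception as exc:  # forward errors to the caller
            result.put(("err", exc))

    threading.Thread(target=worker, daemon=True).start()
    try:
        status, value = result.get(timeout=timeout_s)
    except queue.Empty:
        raise RecvTimeout(f"no reply within {timeout_s} s") from None
    if status == "err":
        raise value
    return value
```

For instance, `call_with_timeout(dv.get_module_status, 10.0)` would raise `RecvTimeout` instead of blocking indefinitely, at which point the caller could recreate the Dreamview connection or restart Dreamview. The abandoned thread stays blocked, so this is a workaround rather than a fix for the broken socket.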

MingfeiCheng commented 6 months ago

Hi Yuqi,

Sorry, I don't recall facing a similar situation before, so I am not sure how to solve this issue effectively. Restarting may be a good solution. Thanks.

YuqiHuai commented 6 months ago

Hi Mingfei,

I ran BehAVExplor again, and this time it seemed to run smoothly. However, after a few hours, I could no longer access Dreamview on localhost:8888, while BehAVExplor was still running.

See screenshot: [Screenshot from 2024-03-18 10-06-24]

I've reported this to Apollo before, but I could not provide enough information for them to debug the issue: https://github.com/ApolloAuto/apollo/issues/13134#issuecomment-1195865636

Dreamview's log suggests its backend is still working:

I0318 10:03:33.893414 4047347 simulation_world_updater.cc:656] Constructed RoutingRequest to be sent:
waypoint {
  id: "lane_477"
  s: 18.651878762971553
  pose {
    x: 593241.49687995552
    y: 4135030.957659022
  }
}
waypoint {
  id: "lane_570"
  s: 59.999857800182347
  pose {
    x: 593130.360748291
    y: 4134914.525177
  }
}
W0318 10:03:36.611380 4047413 rate.cc:96] Detect forward jumps in time
I0318 10:03:38.099026 4047342 simulation_world_service.h:240] Has not received any data from /apollo/audio_detection
W0318 10:03:40.798460 4047413 rate.cc:96] Detect forward jumps in time

So this is likely an issue with Apollo's Dreamview, and I'll close this issue.
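One way to notice an unreachable Dreamview like the one described above is to probe its port between scenarios. This is a minimal sketch using only the standard library. `port_is_open` is a hypothetical helper, and the host and port are taken from the localhost:8888 address mentioned in this comment.

```python
import socket


def port_is_open(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `port_is_open("localhost", 8888)` before each run would at least flag the case where nothing accepts connections any more. A successful connect only shows that the port is listening; it does not prove that Dreamview's websocket still answers requests.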

looles commented 1 week ago

@YuqiHuai Hi, I'm having a similar problem. I get the following error when I first run it. Can you help me out? [image]

YuqiHuai commented 1 week ago

When you first run it? Did you compile Apollo first? This looks like you have not compiled it, or you compiled it as root rather than as a regular user (inside Docker).

looles commented 1 week ago

1. I am running it for the first time.
2. Apollo is already compiled.
3. I retried it with a normal user and it still doesn't work.

[four images]

Can you determine which step is causing the problem based on the pictures above?

YuqiHuai commented 1 week ago

Nope, cannot determine the issue yet. When you enter the container, can you run ‘cyber_recorder’?
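The check asked for here can also be scripted. This is a small sketch using only the standard library. `tool_status` is a hypothetical helper, and it only tests whether the executable is on `PATH` for the current user, not whether Apollo was built correctly.

```python
import shutil


def tool_status(name):
    """Return the full path of an executable on PATH, or None if it is absent."""
    path = shutil.which(name)
    if path is None:
        print(f"{name}: not found on PATH")
    else:
        print(f"{name}: {path}")
    return path
```

Running `tool_status("cyber_recorder")` inside the container, once as root and once as the regular user, would show whether the two environments differ.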

looles commented 1 week ago

Hi, I have installed "cyber_recorder" as per the tutorial, but it still doesn't work.
