Performance bottlenecks in SMARTS core

Adaickalavan commented 3 years ago

Profiling SMARTS while running “python3.7 ./examples/single_agent.py ./scenarios/straight --headless” returns the following:

In summary, the top 10 time-consuming operations include: (1) _pybullet_provider_step, (2) select.poll, (3) waypoints.py, (4) sklearn.neighbors._kd_tree.BinaryTree, (5) coordinates.py, (6) interpolate.py, (7) chassis.py, (8) _socket.socket.recv, (9) builtins.isinstance, and (10) math.py.
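
For anyone who wants to produce a comparable ranking, here is a minimal sketch using the standard-library cProfile and pstats modules. The thread does not say which profiler produced the numbers above, so this is an assumption, and simulated_step is a hypothetical stand-in for the real simulation loop.

```python
import cProfile
import io
import pstats


def simulated_step(n=20_000):
    """Hypothetical stand-in workload; swap in the code under investigation."""
    return sum(i * i for i in range(n))


def profile_top(func, limit=10):
    """Profile func() and return a text report of the `limit` costliest entries."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so callers that dominate the run appear first.
    stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(limit)
    return buffer.getvalue()


report = profile_top(simulated_step)
print(report)
```

The same ranking can be obtained for a whole script without code changes by running it under `python -m cProfile -s cumulative`.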

The profiler and the results are being analysed in the feature branch https://github.com/huawei-noah/SMARTS/tree/code-profiling .

Adaickalavan commented 3 years ago

Consider the biggest culprit, _pybullet_provider_step. We trace the slowdown through the call chain, starting from the _pybullet_provider_step function.

1) File: smarts/core/smarts.py . Function: _pybullet_provider_step

2) File: smarts/core/smarts.py . Function: _perform_agent_actions

3) File: smarts/core/controllers/__init__.py . Function: perform_action

4) File: smarts/core/controllers/lane_following_controller.py . Function: perform_lane_following

At this point, there are two key time-consuming calls:

sensor_state.mission_planner.waypoint_paths_at() - let’s call this branch A
LaneFollowingController._update_target_lane_if_reached_end_of_lane() - let’s call this branch B

We continue tracing branch A.

A-1) File: smarts/core/mission_planner.py . Function: waypoint_paths_at

We continue tracing branch B.

B-1) File: smarts/core/controllers/lane_following_controller.py . Function: _update_target_lane_if_reached_end_of_lane

B-2) File: smarts/core/mission_planner.py . Function: waypoint_paths_on_lane_at

Branch A and branch B both converge on the function smarts/core/waypoints.py::Waypoints.waypoint_paths_at
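
Since both branches end in the same function, the same lookup may be computed more than once per step. A minimal sketch of how memoization would collapse such repeated calls follows. Everything here is hypothetical: waypoint_lookup, branch_a, branch_b, and the argument tuple are illustrative stand-ins, not the SMARTS API, and the sketch assumes both branches pass identical arguments, which would need to be verified first.

```python
from functools import lru_cache

CALLS = {"count": 0}  # counts how often the expensive body actually runs


@lru_cache(maxsize=1024)
def waypoint_lookup(lane_id, x, y, lookahead):
    """Hypothetical stand-in for an expensive nearest-waypoint query."""
    CALLS["count"] += 1
    return tuple((lane_id, x + i, y) for i in range(lookahead))


def branch_a(pose):
    # Plays the role of branch A in the trace (illustrative only).
    return waypoint_lookup("lane_0", pose[0], pose[1], 3)


def branch_b(pose):
    # Plays the role of branch B in the trace (illustrative only).
    return waypoint_lookup("lane_0", pose[0], pose[1], 3)


pose = (10.0, 2.5)
paths_a = branch_a(pose)
paths_b = branch_b(pose)
print(CALLS["count"])  # 1: the second call is served from the cache
```

A cache like this only pays off if the two branches really query with the same inputs within a step, and it would need to be cleared whenever the underlying road network or waypoint data changes.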

Adaickalavan commented 3 years ago

https://github.com/huawei-noah/SMARTS/blob/180769e49d6cfda7bb9742fefef14b13fac10869/smarts/core/smarts.py#L818-L837

@iman512003 and @Gamenot, is the body of this for loop independent across iterations, so that it can be performed in parallel for all vehicles in agent_vehicles?

There is no significant speedup from replacing either the inner or the outer for loop with a map operation, because len(agent_actions) and len(agent_vehicles) are mostly equal to 1. Commit: https://github.com/huawei-noah/SMARTS/commit/7bef836220d5ad328ef3acf3b9af2ba97353f126

def _perform_agent_actions(self, agent_actions):
  for agent_id, action in agent_actions.items():
    agent_vehicles = self._vehicle_index.vehicles_by_actor_id(agent_id)
    if len(agent_vehicles) == 0:
      self._log.warning(
          f"{agent_id} doesn't have a vehicle, is the agent done? (dropping action)"
      )
    else:
      agent_interface = self._agent_manager.agent_interface_for_agent_id(
          agent_id
      )
      is_boid_agent = self._agent_manager.is_boid_agent(agent_id)

      def par_controller_action(vehicle):
        vehicle_action = action[vehicle.id] if is_boid_agent else action
        controller_state = (
            self._vehicle_index.controller_state_for_vehicle_id(vehicle.id)
        )
        sensor_state = self._vehicle_index.sensor_state_for_vehicle_id(
            vehicle.id
        )
        Controllers.perform_action(
            self,
            agent_id,
            vehicle,
            vehicle_action,
            controller_state,
            sensor_state,
            agent_interface.action_space,
            agent_interface.vehicle_type,
        )

      list(map(par_controller_action, agent_vehicles))
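
Note that the built-in map still runs each call one after another on the calling thread, so it cannot give a parallel speedup by itself even with many vehicles. The sketch below contrasts it with ThreadPoolExecutor.map from the standard library. Here controller_action is a hypothetical stand-in for the per-vehicle work, and the sketch assumes that work is independent across vehicles, which is exactly the open question above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def controller_action(vehicle_id):
    """Hypothetical per-vehicle work; returns the id of the thread that ran it."""
    return vehicle_id, threading.get_ident()


vehicle_ids = ["v0", "v1", "v2", "v3"]
main_thread = threading.get_ident()

# Built-in map: evaluated one item at a time on the calling thread.
sequential = list(map(controller_action, vehicle_ids))

# Executor.map: items are handed to worker threads; results keep input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    concurrent = list(pool.map(controller_action, vehicle_ids))

print(all(tid == main_thread for _, tid in sequential))  # True
print([vid for vid, _ in concurrent] == vehicle_ids)     # True
```

Even with an executor, CPU-bound Python code is limited by the GIL, and with a single vehicle per agent there is nothing to overlap, so a gain should not be expected without measuring.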

Gamenot commented 3 years ago

@Adaickalavan @sah-huawei Please split this off into a number of separate issues and add to 0.4.15.

sah-huawei commented 3 years ago

Some waypoint optimizations were done in pull request #670 (under review).

From the flame graphs, it appears that the act() method of remote_agent.py (sending observations to the remote agent via cloudpickle) is a prime candidate for optimization and should probably be added to our list too.
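
To put a number on the serialization cost, a rough measurement could look like the sketch below. It uses the standard-library pickle as a stand-in for cloudpickle, and make_observation builds a hypothetical payload; the real SMARTS observation structure is different and would need to be substituted.

```python
import pickle
import time


def make_observation(num_waypoints=500):
    """Hypothetical observation payload; not the real SMARTS structure."""
    return {
        "ego": {"position": (1.0, 2.0, 0.0), "speed": 12.5},
        "waypoints": [(float(i), 0.0, 0.0) for i in range(num_waypoints)],
    }


def measure_roundtrip(obs, repeats=100):
    """Return (payload size in bytes, mean seconds per dumps + loads)."""
    start = time.perf_counter()
    for _ in range(repeats):
        payload = pickle.dumps(obs, protocol=pickle.HIGHEST_PROTOCOL)
        restored = pickle.loads(payload)
    elapsed = time.perf_counter() - start
    assert restored == obs  # the round trip must be lossless
    return len(payload), elapsed / repeats


size, seconds = measure_roundtrip(make_observation())
print(f"{size} bytes, {seconds * 1e6:.1f} us per round trip")
```

Comparing the per-step round-trip time against the total step time would show whether trimming the observation or changing the transport is worth pursuing.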

sah-huawei commented 3 years ago

Quoting @Gamenot: “@Adaickalavan @sah-huawei Please split this off into a number of separate issues and add to 0.4.15.”

I just split out issues #679 and #680.

Performance bottlenecks in SMARTS core #641