NVIDIA-ISAAC-ROS / isaac_ros_compression

NVIDIA-accelerated data compression
https://developer.nvidia.com/isaac-ros-gems
Apache License 2.0

High CPU load while encoding #8

Open jtaveau opened 10 months ago

jtaveau commented 10 months ago

Hi, I tried running the encoder on a Jetson AGX Orin 64GB. It works, but the CPU load is very high. Am I doing something wrong in my config file? I compiled with -DCMAKE_BUILD_TYPE=Release.

Camera + debayering takes around 15% CPU. If I add the encoder to the component container, usage goes up to 160% in htop.
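
For context on those numbers, here is a minimal sketch that converts htop-style percentages (where 100% usually means one fully loaded core) into core counts and a share of the whole CPU. The core count of 12 is an assumption for the Jetson AGX Orin 64GB and should be checked on the device, for example with `os.cpu_count()`; the helper name is hypothetical.

```python
def htop_to_total_share(htop_percent: float, num_cores: int) -> float:
    """Convert a per-process CPU% (100% = one core) into a share of all cores."""
    if num_cores <= 0:
        raise ValueError("num_cores must be positive")
    return htop_percent / (100.0 * num_cores)


# Assumption: 12 CPU cores; verify on the target with os.cpu_count().
NUM_CORES = 12

baseline_percent = 15.0       # camera + debayering, as reported above
with_encoder_percent = 160.0  # after adding the encoder, as reported above

# Extra load attributable to adding the encoder, expressed in cores.
encoder_extra_cores = (with_encoder_percent - baseline_percent) / 100.0
total_share = htop_to_total_share(with_encoder_percent, NUM_CORES)

print(f"Extra load after adding the encoder: {encoder_extra_cores:.2f} cores")
print(f"Container share of the whole CPU: {total_share:.1%}")
```

Under these assumptions, adding the encoder costs roughly one and a half cores, which is the figure to compare against when trying to reproduce the problem.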

Launch file:

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
#

from launch_ros.substitutions import FindPackageShare
from launch_ros.actions import ComposableNodeContainer
from launch.substitutions import LaunchConfiguration
from launch.substitutions import PathJoinSubstitution
from launch.actions import DeclareLaunchArgument
from launch import LaunchDescription
from launch_ros.descriptions import ComposableNode

example_parameters = {
    'blackfly_s': {
        'debug': False,
        'compute_brightness': False,
        'adjust_timestamp': True,
        'dump_node_map': False,
        # set parameters defined in blackfly_s.yaml
        'gain_auto': 'Continuous',
        'pixel_format': 'BayerRG8',  # BGR8
        'exposure_auto': 'Continuous',
        # These are useful for GigE cameras
        # 'device_link_throughput_limit': 380000000,
        # 'gev_scps_packet_size': 9000,
        # ---- to reduce the sensor width and shift the crop
        'image_width': 1408,
        'image_height': 1024,
        # 'offset_x': 16,
        # 'offset_y': 0,
        'frame_rate_auto': 'Off',
        'frame_rate': 10.0,
        'frame_rate_enable': True,
        'buffer_queue_size': 1,
        'trigger_mode': 'Off',
        'chunk_mode_active': True,
        'chunk_selector_frame_id': 'FrameID',
        'chunk_enable_frame_id': True,
        'chunk_selector_exposure_time': 'ExposureTime',
        'chunk_enable_exposure_time': True,
        'chunk_selector_gain': 'Gain',
        'chunk_enable_gain': True,
        'chunk_selector_timestamp': 'Timestamp',
        'chunk_enable_timestamp': True}
}

def generate_launch_description():
    launch_args = [
        DeclareLaunchArgument(
            'camera_name', 
            default_value=['flir_camera'],
            description='camera name (ros node name)'),
        DeclareLaunchArgument(
            'serial', 
            default_value="'20435008'",
            description='FLIR serial number of camera (in quotes!!)'),
        DeclareLaunchArgument(
            'input_height',
            default_value='1024',
            description='Height of the original image'),
        DeclareLaunchArgument(
            'input_width',
            default_value='1408',
            description='Width of the original image'),
    ]

    camera_name = LaunchConfiguration('camera_name')
    serial = LaunchConfiguration('serial')
    parameter_file = ''
    camera_type = 'blackfly_s'

    if not parameter_file:
        parameter_file = PathJoinSubstitution(
            [FindPackageShare('spinnaker_camera_driver'), 'config',
             camera_type + '.yaml'])

    flir_camera_node = ComposableNode(
        name=[camera_name],
        package='spinnaker_camera_driver',
        plugin='spinnaker_camera_driver::CameraDriver',
        parameters=[example_parameters[camera_type],
                    {'parameter_file': parameter_file,
                     'serial_number': serial}],
        remappings=[('~/control', '/exposure_control/control'), ]
    )

    debayer_node = ComposableNode(
        name='debayer',
        package='image_proc',
        plugin='image_proc::DebayerNode',
        remappings=[
            ('image_raw', 'flir_camera/image_raw'),
            ('image_color', 'flir_camera/image_raw/debayered')]
    )

    input_height = LaunchConfiguration('input_height')
    input_width = LaunchConfiguration('input_width')

    encoder_node = ComposableNode(
        name='encoder',
        package='isaac_ros_h264_encoder',
        plugin='nvidia::isaac_ros::h264_encoder::EncoderNode',
        parameters=[{
                'input_height': input_height,
                'input_width': input_width,
        }],
        remappings=[
            ('image_raw', 'flir_camera/image_raw/debayered'),
            ('image_compressed', 'flir_camera/image_raw/compressed')
        ])

    container = ComposableNodeContainer(
        name='encoder_container',
        namespace='encoder',
        package='rclcpp_components',
        executable='component_container_mt',
        composable_node_descriptions=[flir_camera_node, debayer_node, encoder_node],
        output='screen'
    )

    return LaunchDescription(launch_args + [container])

Generated YAML graph file:

---
name: XGCUZZGLCC_color_converter
components:
  - name: data_receiver
    type: nvidia::gxf::DoubleBufferReceiver
    parameters:
      capacity: 12
      policy: 0
  - type: nvidia::gxf::MessageAvailableSchedulingTerm
    parameters:
      receiver: data_receiver
      min_size: 1
  - name: data_transmitter
    type: nvidia::gxf::DoubleBufferTransmitter
    parameters:
      capacity: 12
      policy: 0
  - type: nvidia::gxf::DownstreamReceptiveSchedulingTerm
    parameters:
      transmitter: data_transmitter
      min_size: 1
  - name: pool
    type: nvidia::gxf::BlockMemoryPool
    parameters:
      storage_type: 1
      block_size: 6278400
      num_blocks: 40
  - name: color_converter_operator
    type: nvidia::isaac::tensor_ops::StreamConvertColorFormat
    parameters:
      output_type: NV12
      receiver: data_receiver
      transmitter: data_transmitter
      pool: pool
      input_adapter: XGCUZZGLCC_global/adapter
      output_adapter: XGCUZZGLCC_global/adapter
      output_name: image
      stream: XGCUZZGLCC_global/stream
---
name: XGCUZZGLCC_global
components:
  - name: adapter
    type: nvidia::isaac::tensor_ops::ImageAdapter
    parameters:
      message_type: VideoBuffer
  - name: stream
    type: nvidia::isaac::tensor_ops::TensorStream
    parameters:
      backend_type: VPI
      engine_type: GPU
  - name: encoder_context
    type: nvidia::gxf::VideoEncoderContext
    parameters:
      scheduling_term: XGCUZZGLCC_encoder_response/async_st
---
name: XGCUZZGLCC_encoder
components:
  - name: data_receiver
    type: nvidia::gxf::DoubleBufferReceiver
    parameters:
      capacity: 1
      policy: 0
  - type: nvidia::gxf::MessageAvailableSchedulingTerm
    parameters:
      receiver: data_receiver
      min_size: 1
  - name: encoder_request
    type: nvidia::gxf::VideoEncoderRequest
    parameters:
      input_frame: data_receiver
      inbuf_storage_type: 1
      profile: 0
      qp: 20
      hw_preset_type: 0
      input_width: 1408
      input_height: 1024
      input_format: nv12
      config: pframe_cqp
      iframe_interval: 5
      videoencoder_context: XGCUZZGLCC_global/encoder_context
---
name: XGCUZZGLCC_encoder_response
components:
  - name: output
    type: nvidia::gxf::DoubleBufferTransmitter
    parameters:
      capacity: 1
      policy: 0
  - type: nvidia::gxf::DownstreamReceptiveSchedulingTerm
    parameters:
      transmitter: output
      min_size: 1
  - name: pool
    type: nvidia::gxf::BlockMemoryPool
    parameters:
      storage_type: 0
      block_size: 6912000
      num_blocks: 40
  - type: nvidia::gxf::VideoEncoderResponse
    parameters:
      videoencoder_context: XGCUZZGLCC_global/encoder_context
      output_transmitter: output
      outbuf_storage_type: 0
      pool: pool
  - name: async_st
    type: nvidia::gxf::AsynchronousSchedulingTerm
---
name: XGCUZZGLCC_sink
components:
  - name: signal
    type: nvidia::gxf::DoubleBufferReceiver
    parameters:
      capacity: 1
      policy: 0
  - type: nvidia::gxf::MessageAvailableSchedulingTerm
    parameters:
      receiver: signal
      min_size: 1
  - name: sink
    type: nvidia::isaac_ros::MessageRelay
    parameters:
      source: signal
      max_waiting_count: 1
      drop_waiting: false
---
name: XGCUZZGLCC_connections
components:
  - type: nvidia::gxf::Connection
    parameters:
      source: XGCUZZGLCC_color_converter/data_transmitter
      target: XGCUZZGLCC_encoder/data_receiver
  - type: nvidia::gxf::Connection
    parameters:
      source: XGCUZZGLCC_encoder_response/output
      target: XGCUZZGLCC_sink/signal
---
name: XGCUZZGLCC_IAPZHZREND
components:
  - name: clock
    type: nvidia::gxf::RealtimeClock
  - type: nvidia::gxf::MultiThreadScheduler
    parameters:
      clock: clock
      stop_on_deadlock: false
      check_recession_period_ms: 1
      worker_thread_number: 2
  - type: nvidia::gxf::JobStatistics
    parameters:
      clock: clock
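
As a sanity check on the pool sizes in this graph, the sketch below computes the size of one NV12 frame at 1408x1024 and compares it with the `block_size` of the color converter pool. It assumes tightly packed NV12 (no row padding or stride alignment) and that this pool holds the converter's NV12 output; neither is confirmed here, and the helper name is hypothetical.

```python
def nv12_frame_bytes(width: int, height: int) -> int:
    """Size of a tightly packed NV12 frame.

    NV12 stores a full-resolution Y plane (1 byte per pixel) followed by an
    interleaved UV plane subsampled 2x in both directions, so 1.5 bytes per pixel.
    """
    if width <= 0 or height <= 0 or width % 2 or height % 2:
        raise ValueError("width and height must be positive and even")
    return width * height * 3 // 2


# Values taken from the generated graph above.
INPUT_WIDTH = 1408
INPUT_HEIGHT = 1024
COLOR_CONVERTER_BLOCK_SIZE = 6278400

frame_bytes = nv12_frame_bytes(INPUT_WIDTH, INPUT_HEIGHT)
print(f"One NV12 frame: {frame_bytes} bytes")
print(f"Pool block size: {COLOR_CONVERTER_BLOCK_SIZE} bytes")
print(f"Block is {COLOR_CONVERTER_BLOCK_SIZE / frame_bytes:.2f}x one frame")
```

Under these assumptions each block is large enough for one frame, so the pool size alone does not look like the cause of the problem.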

I also saw that the encoder publishes at 20 Hz, whereas the input is at 10 Hz. Is this normal behaviour? I didn't find a way to decrease the publishing rate.
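
For anyone trying to reproduce the rate mismatch, here is a minimal sketch of estimating a publishing rate from message arrival times. The timestamps below are synthetic and only mimic the 10 Hz input and 20 Hz output described above; in practice they would come from a subscription callback, or the rates could be read directly with `ros2 topic hz`. The function name is hypothetical.

```python
from statistics import mean


def estimate_rate_hz(stamps: list[float]) -> float:
    """Estimate a publishing rate from arrival times given in seconds."""
    if len(stamps) < 2:
        raise ValueError("need at least two timestamps")
    intervals = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    return 1.0 / mean(intervals)


# Synthetic arrival times: one second of input at 10 Hz and output at 20 Hz.
input_stamps = [i * 0.1 for i in range(11)]
output_stamps = [i * 0.05 for i in range(21)]

input_rate = estimate_rate_hz(input_stamps)
output_rate = estimate_rate_hz(output_stamps)
print(f"input: {input_rate:.1f} Hz, output: {output_rate:.1f} Hz")
print(f"output messages per input frame: {output_rate / input_rate:.1f}")
```

Comparing the two rates over the same time window shows whether the encoder really emits two messages per input frame or whether the input rate itself differs from the configured `frame_rate`.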

Also, would you have any documentation on how to display the GXF logs?

One last remark: it would be nice if the input_height and input_width arguments were forwarded to the generated YAML graph. I had to modify the nitros_encoder_node.yaml file for them to be taken into account.
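
One way to check whether the launch arguments reached the generated graph is to read `input_width` and `input_height` back out of the YAML text and compare them with the expected values. The sketch below uses only the standard library and a simple regular expression rather than a full YAML parser; the sample text is copied from the encoder entity above, and the function name is hypothetical.

```python
import re


def encoder_dimensions(graph_yaml: str) -> dict[str, int]:
    """Return the first input_width / input_height values found in the graph text."""
    found = {}
    for key in ("input_width", "input_height"):
        match = re.search(rf"^\s*{key}:\s*(\d+)\s*$", graph_yaml, flags=re.MULTILINE)
        if match:
            found[key] = int(match.group(1))
    return found


# Excerpt of the generated graph shown earlier in this issue.
SAMPLE_GRAPH = """\
  - name: encoder_request
    type: nvidia::gxf::VideoEncoderRequest
    parameters:
      input_width: 1408
      input_height: 1024
      input_format: nv12
"""

# Values passed as launch arguments in the launch file above.
expected = {"input_width": 1408, "input_height": 1024}
actual = encoder_dimensions(SAMPLE_GRAPH)

if actual == expected:
    print("Generated graph matches the launch arguments:", actual)
else:
    print("Mismatch: expected", expected, "but the graph has", actual)
```

Running this against the graph generated from an unmodified nitros_encoder_node.yaml would show whether the parameters are forwarded or whether defaults are used instead.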

hemalshahNV commented 10 months ago

Very strange. You're getting two frames out of the encoder for every one you send in? The CPU usage spike is also very unexpected. Nothing seems amiss in the launch file. You shouldn't have to change the YAML files, as they get modified to match the ROS parameters as needed. We'll take a closer look and see if we can reproduce this issue. Do you have any other information worth relaying? To confirm: the setup is a Jetson AGX Orin 64GB with a FLIR camera, encoding Bayer into H.264?

jtaveau commented 9 months ago

Exactly, I don't have anything else to add to that.