High CPU load while encoding

Hi, I tried running the encoder on a Jetson AGX Orin 64GB. It works, but the CPU load is very high. Am I doing something wrong in my config file? I compiled with -DCMAKE_BUILD_TYPE=Release.

The camera plus debayering takes around 15% CPU. If I add the encoder to the component container, it goes up to 160% in htop.

Launch file:

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
#

from launch_ros.substitutions import FindPackageShare
from launch_ros.actions import ComposableNodeContainer
from launch.substitutions import LaunchConfiguration
from launch.substitutions import PathJoinSubstitution
from launch.actions import DeclareLaunchArgument
from launch import LaunchDescription
from launch_ros.descriptions import ComposableNode

example_parameters = {
    'blackfly_s': {
        'debug': False,
        'compute_brightness': False,
        'adjust_timestamp': True,
        'dump_node_map': False,
        # set parameters defined in blackfly_s.yaml
        'gain_auto': 'Continuous',
        'pixel_format': 'BayerRG8', #BGR8
        'exposure_auto': 'Continuous',
        # These are useful for GigE cameras
        # 'device_link_throughput_limit': 380000000,
        # 'gev_scps_packet_size': 9000,
        # ---- to reduce the sensor width and shift the crop
        'image_width': 1408,
        'image_height': 1024,
        # 'offset_x': 16,
        # 'offset_y': 0,
        'frame_rate_auto': 'Off',
        'frame_rate': 10.0,
        'frame_rate_enable': True,
        'buffer_queue_size': 1,
        'trigger_mode': 'Off',
        'chunk_mode_active': True,
        'chunk_selector_frame_id': 'FrameID',
        'chunk_enable_frame_id': True,
        'chunk_selector_exposure_time': 'ExposureTime',
        'chunk_enable_exposure_time': True,
        'chunk_selector_gain': 'Gain',
        'chunk_enable_gain': True,
        'chunk_selector_timestamp': 'Timestamp',
        'chunk_enable_timestamp': True}
}

def generate_launch_description():
    launch_args = [
        DeclareLaunchArgument(
            'camera_name', 
            default_value=['flir_camera'],
            description='camera name (ros node name)'),
        DeclareLaunchArgument(
            'serial', 
            default_value="'20435008'",
            description='FLIR serial number of camera (in quotes!!)'),
        DeclareLaunchArgument(
            'input_height',
            default_value='1024',
            description='Height of the original image'),
        DeclareLaunchArgument(
            'input_width',
            default_value='1408',
            description='Width of the original image'),
    ]

    camera_name = LaunchConfiguration('camera_name')
    serial = LaunchConfiguration('serial')
    parameter_file = ''
    camera_type = 'blackfly_s'

    if not parameter_file:
        parameter_file = PathJoinSubstitution(
            [FindPackageShare('spinnaker_camera_driver'), 'config',
             camera_type + '.yaml'])

    flir_camera_node = ComposableNode(
        name=[camera_name],
        package='spinnaker_camera_driver',
        plugin='spinnaker_camera_driver::CameraDriver',
        parameters=[example_parameters[camera_type],
                    {'parameter_file': parameter_file,
                     'serial_number': serial}],
        remappings=[('~/control', '/exposure_control/control'), ]
    )

    debayer_node = ComposableNode(
        name='debayer',
        package='image_proc',
        plugin='image_proc::DebayerNode',
        remappings=[
            ('image_raw', 'flir_camera/image_raw'),
            ('image_color', 'flir_camera/image_raw/debayered')]
    )

    input_height = LaunchConfiguration('input_height')
    input_width = LaunchConfiguration('input_width')

    encoder_node = ComposableNode(
        name='encoder',
        package='isaac_ros_h264_encoder',
        plugin='nvidia::isaac_ros::h264_encoder::EncoderNode',
        parameters=[{
                'input_height': input_height,
                'input_width': input_width,
        }],
        remappings=[
            ('image_raw', 'flir_camera/image_raw/debayered'),
            ('image_compressed', 'flir_camera/image_raw/compressed')
        ])

    container = ComposableNodeContainer(
        name='encoder_container',
        namespace='encoder',
        package='rclcpp_components',
        executable='component_container_mt',
        composable_node_descriptions=[flir_camera_node, debayer_node, encoder_node],
        output='screen'
    )

    return LaunchDescription(launch_args + [container])

Generated YAML graph file:

---
name: XGCUZZGLCC_color_converter
components:
  - name: data_receiver
    type: nvidia::gxf::DoubleBufferReceiver
    parameters:
      capacity: 12
      policy: 0
  - type: nvidia::gxf::MessageAvailableSchedulingTerm
    parameters:
      receiver: data_receiver
      min_size: 1
  - name: data_transmitter
    type: nvidia::gxf::DoubleBufferTransmitter
    parameters:
      capacity: 12
      policy: 0
  - type: nvidia::gxf::DownstreamReceptiveSchedulingTerm
    parameters:
      transmitter: data_transmitter
      min_size: 1
  - name: pool
    type: nvidia::gxf::BlockMemoryPool
    parameters:
      storage_type: 1
      block_size: 6278400
      num_blocks: 40
  - name: color_converter_operator
    type: nvidia::isaac::tensor_ops::StreamConvertColorFormat
    parameters:
      output_type: NV12
      receiver: data_receiver
      transmitter: data_transmitter
      pool: pool
      input_adapter: XGCUZZGLCC_global/adapter
      output_adapter: XGCUZZGLCC_global/adapter
      output_name: image
      stream: XGCUZZGLCC_global/stream
---
name: XGCUZZGLCC_global
components:
  - name: adapter
    type: nvidia::isaac::tensor_ops::ImageAdapter
    parameters:
      message_type: VideoBuffer
  - name: stream
    type: nvidia::isaac::tensor_ops::TensorStream
    parameters:
      backend_type: VPI
      engine_type: GPU
  - name: encoder_context
    type: nvidia::gxf::VideoEncoderContext
    parameters:
      scheduling_term: XGCUZZGLCC_encoder_response/async_st
---
name: XGCUZZGLCC_encoder
components:
  - name: data_receiver
    type: nvidia::gxf::DoubleBufferReceiver
    parameters:
      capacity: 1
      policy: 0
  - type: nvidia::gxf::MessageAvailableSchedulingTerm
    parameters:
      receiver: data_receiver
      min_size: 1
  - name: encoder_request
    type: nvidia::gxf::VideoEncoderRequest
    parameters:
      input_frame: data_receiver
      inbuf_storage_type: 1
      profile: 0
      qp: 20
      hw_preset_type: 0
      input_width: 1408
      input_height: 1024
      input_format: nv12
      config: pframe_cqp
      iframe_interval: 5
      videoencoder_context: XGCUZZGLCC_global/encoder_context
---
name: XGCUZZGLCC_encoder_response
components:
  - name: output
    type: nvidia::gxf::DoubleBufferTransmitter
    parameters:
      capacity: 1
      policy: 0
  - type: nvidia::gxf::DownstreamReceptiveSchedulingTerm
    parameters:
      transmitter: output
      min_size: 1
  - name: pool
    type: nvidia::gxf::BlockMemoryPool
    parameters:
      storage_type: 0
      block_size: 6912000
      num_blocks: 40
  - type: nvidia::gxf::VideoEncoderResponse
    parameters:
      videoencoder_context: XGCUZZGLCC_global/encoder_context
      output_transmitter: output
      outbuf_storage_type: 0
      pool: pool
  - name: async_st
    type: nvidia::gxf::AsynchronousSchedulingTerm
---
name: XGCUZZGLCC_sink
components:
  - name: signal
    type: nvidia::gxf::DoubleBufferReceiver
    parameters:
      capacity: 1
      policy: 0
  - type: nvidia::gxf::MessageAvailableSchedulingTerm
    parameters:
      receiver: signal
      min_size: 1
  - name: sink
    type: nvidia::isaac_ros::MessageRelay
    parameters:
      source: signal
      max_waiting_count: 1
      drop_waiting: false
---
name: XGCUZZGLCC_connections
components:
  - type: nvidia::gxf::Connection
    parameters:
      source: XGCUZZGLCC_color_converter/data_transmitter
      target: XGCUZZGLCC_encoder/data_receiver
  - type: nvidia::gxf::Connection
    parameters:
      source: XGCUZZGLCC_encoder_response/output
      target: XGCUZZGLCC_sink/signal
---
name: XGCUZZGLCC_IAPZHZREND
components:
  - name: clock
    type: nvidia::gxf::RealtimeClock
  - type: nvidia::gxf::MultiThreadScheduler
    parameters:
      clock: clock
      stop_on_deadlock: false
      check_recession_period_ms: 1
      worker_thread_number: 2
  - type: nvidia::gxf::JobStatistics
    parameters:
      clock: clock

I also saw that the encoder publishes at 20 Hz, whereas the input is at 10 Hz. Is this normal behaviour? I didn't find a way to decrease the publishing rate.
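For reference, here is a minimal sketch of how the two rates can be compared from message arrival times. The timestamps are synthetic, not measurements from my setup, and `estimate_rate_hz` is a hypothetical helper rather than anything provided by the encoder package.

```python
from statistics import mean


def estimate_rate_hz(stamps_s):
    """Return the mean message rate in Hz from arrival times given in seconds."""
    if len(stamps_s) < 2:
        raise ValueError("need at least two timestamps")
    # Intervals between consecutive messages.
    periods = [b - a for a, b in zip(stamps_s, stamps_s[1:])]
    return 1.0 / mean(periods)


# Synthetic arrival times (assumption): 0.1 s spacing for the camera input,
# 0.05 s spacing for the encoder output.
input_stamps = [i * 0.1 for i in range(11)]
output_stamps = [i * 0.05 for i in range(21)]

print(round(estimate_rate_hz(input_stamps), 1))
print(round(estimate_rate_hz(output_stamps), 1))
```

In a real setup the arrival times would come from a subscriber callback on each topic, which is essentially what `ros2 topic hz` does.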

Also, do you have any documentation on how to display the GXF logs?

One last remark: it would be nice if the input_height and input_width arguments were forwarded to the generated YAML graph. I had to modify the nitros_encoder_node.yaml file for them to be taken into account.
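Until the arguments are forwarded, the manual edit can be scripted. This is a minimal sketch, assuming the keys appear as `input_width:` and `input_height:` lines, as they do in the generated graph. `set_encoder_size` is a hypothetical helper, not part of isaac_ros_h264_encoder, and the 1920x1200 values in the sample text are placeholders.

```python
import re


def set_encoder_size(yaml_text, width, height):
    """Rewrite every input_width / input_height value in a graph YAML string."""
    # Group 1 keeps the original indentation and key; only the number is replaced.
    yaml_text = re.sub(
        r"(?m)^([ \t]*input_width:[ \t]*)\d+[ \t]*$", rf"\g<1>{width}", yaml_text)
    yaml_text = re.sub(
        r"(?m)^([ \t]*input_height:[ \t]*)\d+[ \t]*$", rf"\g<1>{height}", yaml_text)
    return yaml_text


# Placeholder fragment shaped like the encoder_request parameters.
sample = """\
    parameters:
      input_width: 1920
      input_height: 1200
      input_format: nv12
"""

patched = set_encoder_size(sample, 1408, 1024)
print(patched)
```

To apply it to the real file, read nitros_encoder_node.yaml into a string, pass it through the function, and write the result back. A plain text substitution is used instead of a YAML parser so that the multi-document layout and comments are left untouched.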

NVIDIA-ISAAC-ROS / isaac_ros_compression

High CPU load while encoding #8