IBM / CAST

CAST can enhance the system management of cluster-wide resources. It consists of the open source tools: cluster system management (CSM) and burst buffer.
Eclipse Public License 1.0

CSM RHEL8: fix GCC strncpy warnings [-Wstringop-truncation] in CSM GPU inventory #952

Closed besawn closed 4 years ago

besawn commented 4 years ago

When building CSM on RHEL 8, the following warnings are reported that are not reported when building on RHEL 7:

/u/besawn/github/CAST.besawn.RHEL8/csmd/src/inv/src/inv_gpu_inventory.cc:157:16: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 32 equals destination size [-Wstringop-truncation]
/u/besawn/github/CAST.besawn.RHEL8/csmd/src/inv/src/inv_gpu_inventory.cc:160:16: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 32 equals destination size [-Wstringop-truncation]
/u/besawn/github/CAST.besawn.RHEL8/csmd/src/inv/src/inv_gpu_inventory.cc:163:16: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 32 equals destination size [-Wstringop-truncation]
/u/besawn/github/CAST.besawn.RHEL8/csmd/src/inv/src/inv_gpu_inventory.cc:166:16: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 64 equals destination size [-Wstringop-truncation]
/u/besawn/github/CAST.besawn.RHEL8/csmd/src/inv/src/inv_gpu_inventory.cc:169:16: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 32 equals destination size [-Wstringop-truncation]
/u/besawn/github/CAST.besawn.RHEL8/csmd/src/inv/src/inv_gpu_inventory.cc:172:16: warning: ‘char* strncpy(char*, const char*, size_t)’ specified bound 32 equals destination size [-Wstringop-truncation]

After investigating, these warnings appear to be safe to ignore. The warning tells us that strncpy may leave the destination without a NUL terminator when the source string is at least as long as the specified bound. However, we explicitly add a NUL terminator immediately after each call to strncpy to handle this case, like this:

        strncpy(gpu_inventory[gpu_count].device_name, gpu_attributes[i].identifiers.deviceName, CSM_GPU_DEVICE_NAME_MAX);
        gpu_inventory[gpu_count].device_name[CSM_GPU_DEVICE_NAME_MAX-1] = '\0';
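
To illustrate the case the warning describes, here is a minimal sketch. The helpers `has_terminator` and `copy_full_bound` are hypothetical and not part of CSM; they only show that strncpy with a bound equal to the destination size leaves no terminator when the source is at least that long.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// True if the first `size` bytes of `buf` contain a '\0' terminator.
inline bool has_terminator(const char* buf, std::size_t size) {
    return std::memchr(buf, '\0', size) != nullptr;
}

// Hypothetical helper: copy with the bound equal to the full destination
// size, which is the pattern the warning flags. No terminator is added here.
inline void copy_full_bound(char* dest, const char* src, std::size_t dest_size) {
    assert(dest != nullptr && src != nullptr);
    std::strncpy(dest, src, dest_size);
}
```

With an 8-byte destination, an 8-character source fills every byte and leaves the buffer unterminated, while a shorter source is padded with '\0' bytes. The explicit assignment on the second line of the snippet above is what covers the first case.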

This same logic is used in other areas of the code without generating the warning above, but there seem to be some subtle differences related to whether the source is a char * or a char [], which may be a factor that influences whether the warning is triggered.

Reducing the strncpy bound by 1, while still explicitly setting the last char to NUL, stops the warning from being reported, so I have updated the code to use this logic. The two versions are functionally equivalent as long as the explicit NUL terminator is added, which is needed in any case to handle truncation.
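
A minimal sketch of the updated pattern, under stated assumptions: `kDeviceNameMax`, `GpuRecord`, and `copy_truncated` are hypothetical names used for illustration, standing in for the real CSM constant and inventory struct. The actual change edits the strncpy calls in place and does not necessarily introduce a helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical size constant, mirroring the 32-byte bound in the warnings.
constexpr std::size_t kDeviceNameMax = 32;

// Hypothetical stand-in for one entry of the GPU inventory array.
struct GpuRecord {
    char device_name[kDeviceNameMax];
};

// Copy at most dest_size - 1 bytes, then always write the terminator
// into the last byte of the destination.
inline void copy_truncated(char* dest, const char* src, std::size_t dest_size) {
    assert(dest != nullptr && src != nullptr && dest_size > 0);
    std::strncpy(dest, src, dest_size - 1);
    dest[dest_size - 1] = '\0';
}
```

A call such as `copy_truncated(rec.device_name, name, sizeof rec.device_name)` yields the same bytes as the original full-bound copy followed by the explicit terminator: the last byte is overwritten with '\0' in both versions, so only the bound passed to strncpy differs.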

I then tested to make sure that no truncation or other differences appeared in the collected inventory information. All of the data was the same before and after the change.

(Note: cat -A is used to make non-printable characters visible in the output, to make sure no hidden differences were being inserted accidentally. This is why a $ appears at the end of each line, for example.)

Before the change:

[root@c650f99p36 ~]# /opt/ibm/csm/bin/csm_node_attributes_query_details -n c650f99p36 | grep -A 80 gpus_count | grep -B 80 hcas_count | cat -A | tee /tmp/gpu_inventory.before
  gpus_count:                   6$
  gpus:$
    - gpu_id:                0$
      device_name:           Tesla V100-SXM2-16GB$
      hbm_memory:            16160$
      inforom_image_version: G503.0201.00.03$
      pci_bus_id:            00000004:04:00.0$
      serial_number:         0321918195569$
      uuid:                  GPU-e2c17ed0-cabd-05bf-97ba-1fb231cfc72a$
      vbios:                 88.00.13.00.02$
    - gpu_id:                1$
      device_name:           Tesla V100-SXM2-16GB$
      hbm_memory:            16160$
      inforom_image_version: G503.0201.00.03$
      pci_bus_id:            00000004:05:00.0$
      serial_number:         0321918195593$
      uuid:                  GPU-8eb9e0b0-68b4-af0a-1621-38a5cc824b63$
      vbios:                 88.00.13.00.02$
    - gpu_id:                2$
      device_name:           Tesla V100-SXM2-16GB$
      hbm_memory:            16160$
      inforom_image_version: G503.0201.00.03$
      pci_bus_id:            00000004:06:00.0$
      serial_number:         0321918195392$
      uuid:                  GPU-21b46764-8625-e9a9-7839-3ac4b05d267f$
      vbios:                 88.00.13.00.02$
    - gpu_id:                3$
      device_name:           Tesla V100-SXM2-16GB$
      hbm_memory:            16160$
      inforom_image_version: G503.0201.00.03$
      pci_bus_id:            00000035:03:00.0$
      serial_number:         0321918195373$
      uuid:                  GPU-7ad41573-9b0a-42b7-3a40-af1bbbc7788b$
      vbios:                 88.00.13.00.02$
    - gpu_id:                4$
      device_name:           Tesla V100-SXM2-16GB$
      hbm_memory:            16160$
      inforom_image_version: G503.0201.00.03$
      pci_bus_id:            00000035:04:00.0$
      serial_number:         0321918195862$
      uuid:                  GPU-7dd463e6-38b3-15cc-9f7f-d8fe7627330d$
      vbios:                 88.00.13.00.02$
    - gpu_id:                5$
      device_name:           Tesla V100-SXM2-16GB$
      hbm_memory:            16160$
      inforom_image_version: G503.0201.00.03$
      pci_bus_id:            00000035:05:00.0$
      serial_number:         0321918195734$
      uuid:                  GPU-1c19851d-e578-1b8e-0cdb-999d45004922$
      vbios:                 88.00.13.00.02$
  hcas_count:                   1$

After the change:

[root@c650f99p36 ~]# /opt/ibm/csm/bin/csm_node_attributes_query_details -n c650f99p36 | grep -A 80 gpus_count | grep -B 80 hcas_count | cat -A | tee /tmp/gpu_inventory.after
  gpus_count:                   6$
  gpus:$
    - gpu_id:                0$
      device_name:           Tesla V100-SXM2-16GB$
      hbm_memory:            16160$
      inforom_image_version: G503.0201.00.03$
      pci_bus_id:            00000004:04:00.0$
      serial_number:         0321918195569$
      uuid:                  GPU-e2c17ed0-cabd-05bf-97ba-1fb231cfc72a$
      vbios:                 88.00.13.00.02$
    - gpu_id:                1$
      device_name:           Tesla V100-SXM2-16GB$
      hbm_memory:            16160$
      inforom_image_version: G503.0201.00.03$
      pci_bus_id:            00000004:05:00.0$
      serial_number:         0321918195593$
      uuid:                  GPU-8eb9e0b0-68b4-af0a-1621-38a5cc824b63$
      vbios:                 88.00.13.00.02$
    - gpu_id:                2$
      device_name:           Tesla V100-SXM2-16GB$
      hbm_memory:            16160$
      inforom_image_version: G503.0201.00.03$
      pci_bus_id:            00000004:06:00.0$
      serial_number:         0321918195392$
      uuid:                  GPU-21b46764-8625-e9a9-7839-3ac4b05d267f$
      vbios:                 88.00.13.00.02$
    - gpu_id:                3$
      device_name:           Tesla V100-SXM2-16GB$
      hbm_memory:            16160$
      inforom_image_version: G503.0201.00.03$
      pci_bus_id:            00000035:03:00.0$
      serial_number:         0321918195373$
      uuid:                  GPU-7ad41573-9b0a-42b7-3a40-af1bbbc7788b$
      vbios:                 88.00.13.00.02$
    - gpu_id:                4$
      device_name:           Tesla V100-SXM2-16GB$
      hbm_memory:            16160$
      inforom_image_version: G503.0201.00.03$
      pci_bus_id:            00000035:04:00.0$
      serial_number:         0321918195862$
      uuid:                  GPU-7dd463e6-38b3-15cc-9f7f-d8fe7627330d$
      vbios:                 88.00.13.00.02$
    - gpu_id:                5$
      device_name:           Tesla V100-SXM2-16GB$
      hbm_memory:            16160$
      inforom_image_version: G503.0201.00.03$
      pci_bus_id:            00000035:05:00.0$
      serial_number:         0321918195734$
      uuid:                  GPU-1c19851d-e578-1b8e-0cdb-999d45004922$
      vbios:                 88.00.13.00.02$
  hcas_count:                   1$

[root@c650f99p36 ~]# diff /tmp/gpu_inventory.before /tmp/gpu_inventory.after