ofiwg / libfabric

Open Fabric Interfaces
http://libfabric.org/
Other
530 stars 371 forks source link

prov/psm2: covscan report from redhat #6818

Closed shefty closed 1 year ago

shefty commented 3 years ago
Error: DIVIDE_BY_ZERO (CWE-369): [#def54]
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:942: zero_return: Function call "ofi_datatype_size(datatype)" returns 0.
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:942: divide_by_zero: In expression "len / ofi_datatype_size(datatype)", division by expression "ofi_datatype_size(datatype)" which may be zero has undefined behavior.
#  940|         psmx2_ioc_read(iov, count, datatype, buf, len);
#  941|   
#  942|->       err = psmx2_atomic_self(PSMX2_AM_REQ_ATOMIC_WRITE, ep_priv,
#  943|                     buf, len / ofi_datatype_size(datatype),
#  944|                     NULL, NULL, NULL, NULL, NULL, addr,

Error: DIVIDE_BY_ZERO (CWE-369): [#def55]
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:986: zero_return: Function call "ofi_datatype_size(datatype)" returns 0.
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:986: divide_by_zero: In expression "len / ofi_datatype_size(datatype)", division by expression "ofi_datatype_size(datatype)" which may be zero has undefined behavior.
#  984|   
#  985|     args[0].u32w0 = PSMX2_AM_REQ_ATOMIC_WRITE;
#  986|->   args[0].u32w1 = len / ofi_datatype_size(datatype);
#  987|     args[1].u64 = (uint64_t)(uintptr_t)req;
#  988|     args[2].u64 = addr;

Error: FORWARD_NULL (CWE-476): [#def56]
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:1241: var_compare_op: Comparing "iov" to null implies that "iov" might be null.
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:1252: var_deref_op: Dereferencing null pointer "iov".
# 1250|   
# 1251|     if (op != FI_ATOMIC_READ) {
# 1252|->       buf = iov[0].addr; /* as default for count == 1 */
# 1253|         len = psmx2_ioc_size(iov, count, datatype);
# 1254|         desc0 = desc ? desc[0] : NULL;

Error: DIVIDE_BY_ZERO (CWE-369): [#def57]
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:1289: zero_return: Function call "ofi_datatype_size(datatype)" returns 0.
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:1289: divide_by_zero: In expression "len / ofi_datatype_size(datatype)", division by expression "ofi_datatype_size(datatype)" which may be zero has undefined behavior.
# 1287|         }
# 1288|   
# 1289|->       err = psmx2_atomic_self(PSMX2_AM_REQ_ATOMIC_READWRITE, ep_priv,
# 1290|                     buf, len / ofi_datatype_size(datatype),
# 1291|                     desc0, NULL, NULL, result, result_desc0,

Error: DIVIDE_BY_ZERO (CWE-369): [#def58]
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:1359: zero_return: Function call "ofi_datatype_size(datatype)" returns 0.
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:1359: divide_by_zero: In expression "len / ofi_datatype_size(datatype)", division by expression "ofi_datatype_size(datatype)" which may be zero has undefined behavior.
# 1357|   
# 1358|     args[0].u32w0 = PSMX2_AM_REQ_ATOMIC_READWRITE;
# 1359|->   args[0].u32w1 = len / ofi_datatype_size(datatype);
# 1360|     args[1].u64 = (uint64_t)(uintptr_t)req;
# 1361|     args[2].u64 = addr;

Error: DIVIDE_BY_ZERO (CWE-369): [#def59]
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:1706: zero_return: Function call "ofi_datatype_size(datatype)" returns 0.
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:1706: divide_by_zero: In expression "len / ofi_datatype_size(datatype)", division by expression "ofi_datatype_size(datatype)" which may be zero has undefined behavior.
# 1704|         }
# 1705|   
# 1706|->       err = psmx2_atomic_self(PSMX2_AM_REQ_ATOMIC_COMPWRITE, ep_priv,
# 1707|                     buf, len / ofi_datatype_size(datatype), desc0,
# 1708|                     compare, compare_desc0, result, result_desc0,

Error: DIVIDE_BY_ZERO (CWE-369): [#def60]
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:1777: zero_return: Function call "ofi_datatype_size(datatype)" returns 0.
libfabric-1.12.1/prov/psm2/src/psmx2_atomic.c:1777: divide_by_zero: In expression "len / ofi_datatype_size(datatype)", division by expression "ofi_datatype_size(datatype)" which may be zero has undefined behavior.
# 1775|   
# 1776|     args[0].u32w0 = PSMX2_AM_REQ_ATOMIC_COMPWRITE;
# 1777|->   args[0].u32w1 = len / ofi_datatype_size(datatype);
# 1778|     args[1].u64 = (uint64_t)(uintptr_t)req;
# 1779|     args[2].u64 = addr;

Error: CHECKED_RETURN (CWE-252): [#def61]
libfabric-1.12.1/prov/psm2/src/psmx2_attr.c:372: check_return: Calling "asprintf" without checking return value (as is done elsewhere 29 out of 30 times).
libfabric-1.12.1/prov/efa/src/efa_fabric.c:313: example_assign: Example 1: Assigning: "ret" = return value from "asprintf(&device_attr->device_id, "0x%x", efa_device_attr->ibv_attr.vendor_part_id)".
libfabric-1.12.1/prov/efa/src/efa_fabric.c:316: example_checked: Example 1 (cont.): "ret" has its value checked in "ret < 0".
libfabric-1.12.1/prov/mrail/src/mrail_init.c:419: example_checked: Example 2: "asprintf(&s, "%s_%d", mrail_info.fabric_attr->name, id)" has its value checked in "asprintf(&s, "%s_%d", mrail_info.fabric_attr->name, id) < 0".
libfabric-1.12.1/prov/psm2/src/psmx2_init.c:475: example_checked: Example 3: "asprintf(&path, "%s_%d/%s", "/sys/class/infiniband/hfi1", unit, "device")" has its value checked in "asprintf(&path, "%s_%d/%s", "/sys/class/infiniband/hfi1", unit, "device") < 0".
libfabric-1.12.1/prov/psm3/src/psmx3_attr.c:510: example_checked: Example 4: "asprintf(&p->domain_attr->name, "%s", unit_name)" has its value checked in "asprintf(&p->domain_attr->name, "%s", unit_name) < 0".
libfabric-1.12.1/prov/psm3/src/psmx3_init.c:458: example_checked: Example 5: "asprintf(&path, "%s/%s", sys_dev_path, "device")" has its value checked in "asprintf(&path, "%s/%s", sys_dev_path, "device") < 0".
#  370|             p->domain_attr->name = strdup(psmx2_hfi_info.default_domain_name);
#  371|         else
#  372|->           asprintf(&p->domain_attr->name, "hfi1_%d", unit);
#  373|   
#  374|         p->tx_attr->inject_size = psmx2_env.inject_size;

Error: COMPILER_WARNING (CWE-252): [#def62]
libfabric-1.12.1/prov/psm2/src/psmx2_attr.c: scope_hint: In function 'psmx2_update_prov_info'
libfabric-1.12.1/prov/psm2/src/psmx2_attr.c:372:4: warning[-Wunused-result]: ignoring return value of 'asprintf', declared with attribute warn_unused_result
#    asprintf(&p->domain_attr->name, "hfi1_%d", unit);
#    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#  370|             p->domain_attr->name = strdup(psmx2_hfi_info.default_domain_name);
#  371|         else
#  372|->           asprintf(&p->domain_attr->name, "hfi1_%d", unit);
#  373|   
#  374|         p->tx_attr->inject_size = psmx2_env.inject_size;

Error: CLANG_WARNING: [#def63]
libfabric-1.12.1/prov/psm2/src/psmx2_av.c:493:35: warning[unix.Malloc]: Use of zero-allocated memory
#  491|             av_priv->table[idx].valid = 1;
#  492|         }
#  493|->       av_priv->sep_info[idx].ctxt_cnt = 1;
#  494|         av_priv->sep_info[idx].epids = NULL;
#  495|     }

Error: DEADCODE (CWE-561): [#def64]
libfabric-1.12.1/prov/psm2/src/psmx2_av.c:443: assignment: Assigning: "error_count" = "0".
libfabric-1.12.1/prov/psm2/src/psmx2_av.c:510: const: At condition "error_count", the value of "error_count" must be equal to 0.
libfabric-1.12.1/prov/psm2/src/psmx2_av.c:510: dead_error_condition: The condition "error_count" cannot be true.
libfabric-1.12.1/prov/psm2/src/psmx2_av.c:511: dead_error_begin: Execution cannot reach this statement: "for (i = 0; i < count; i++)...".
libfabric-1.12.1/prov/psm2/src/psmx2_av.c:511: effectively_constant: Local variable "error_count" is assigned only once, to a constant value, making it effectively constant throughout its scope. If this is not the intent, examine the logic to see if there is a missing assignment that would make "error_count" not remain constant.
#  509|     if (av_priv->flags & FI_EVENT) {
#  510|         if (error_count) {
#  511|->           for (i = 0; i < count; i++)
#  512|                 psmx2_av_post_completion(av_priv, context, i, errors[i]);
#  513|         }

Error: FORWARD_NULL (CWE-476): [#def65]
libfabric-1.12.1/prov/psm2/src/psmx2_cq.c:1577: var_compare_op: Comparing "buf" to null implies that "buf" might be null.
libfabric-1.12.1/prov/psm2/src/psmx2_cq.c:1641: var_deref_model: Passing null pointer "buf" to "memcpy", which dereferences it. [Note: The source code implementation of the function has been overridden by a builtin model.]
# 1639|                 }
# 1640|   
# 1641|->               memcpy(buf, (void *)&event->cqe, cq_priv->entry_size);
# 1642|                 psmx2_cq_free_event(cq_priv, event);
# 1643|   

Error: CLANG_WARNING: [#def66]
libfabric-1.12.1/prov/psm2/src/psmx2_cq.c:1641:5: warning[core.NonNullParamChecker]: Null pointer passed to 1st parameter expecting 'nonnull'
# 1639|                 }
# 1640|   
# 1641|->               memcpy(buf, (void *)&event->cqe, cq_priv->entry_size);
# 1642|                 psmx2_cq_free_event(cq_priv, event);
# 1643|   

Error: CHECKED_RETURN (CWE-252): [#def67]
libfabric-1.12.1/prov/psm2/src/psmx2_domain.c:555: check_return: Calling "ofi_domain_close" without checking return value (as is done elsewhere 14 out of 17 times).
libfabric-1.12.1/prov/efa/src/efa_domain.c:76: example_assign: Example 1: Assigning: "ret" = return value from "ofi_domain_close(&domain->util_domain)".
libfabric-1.12.1/prov/efa/src/efa_domain.c:77: example_checked: Example 1 (cont.): "ret" has its value checked in "ret".
libfabric-1.12.1/prov/efa/src/rxr/rxr_domain.c:71: example_assign: Example 2: Assigning: "ret" = return value from "ofi_domain_close(&rxr_domain->util_domain)".
libfabric-1.12.1/prov/efa/src/rxr/rxr_domain.c:72: example_checked: Example 2 (cont.): "ret" has its value checked in "ret".
libfabric-1.12.1/prov/mrail/src/mrail_domain.c:49: example_assign: Example 3: Assigning: "ret" = return value from "ofi_domain_close(&mrail_domain->util_domain)".
libfabric-1.12.1/prov/mrail/src/mrail_domain.c:50: example_checked: Example 3 (cont.): "ret" has its value checked in "ret".
libfabric-1.12.1/prov/psm2/src/psmx2_domain.c:187: example_checked: Example 4: "ofi_domain_close(&domain->util_domain)" has its value checked in "ofi_domain_close(&domain->util_domain)".
libfabric-1.12.1/prov/psm3/src/psmx3_domain.c:187: example_checked: Example 5: "ofi_domain_close(&domain->util_domain)" has its value checked in "ofi_domain_close(&domain->util_domain)".
#  553|   
#  554|   err_out_close_domain:
#  555|->   ofi_domain_close(&domain_priv->util_domain);
#  556|   
#  557|   err_out_free_domain:

Error: OVERRUN (CWE-119): [#def68]
libfabric-1.12.1/prov/psm2/src/psmx2_init.c:481: identity_transfer: Passing "80UL" as argument 3 to function "readlink", which returns that argument. [Note: The source code implementation of the function has been overridden by a builtin model.]
libfabric-1.12.1/prov/psm2/src/psmx2_init.c:481: assignment: Assigning: "n" = "readlink(path, buffer, 80UL)". The value of "n" is now 80.
libfabric-1.12.1/prov/psm2/src/psmx2_init.c:490: overrun-local: Overrunning array "buffer" of 80 bytes at byte offset 80 using index "n" (which evaluates to 80).
#  488|         }
#  489|   
#  490|->       buffer[n] = '\0';
#  491|         if ((s = strrchr(buffer, '/')))
#  492|             s++;

Error: READLINK (CWE-170): [#def69]
libfabric-1.12.1/prov/psm2/src/psmx2_init.c:481: readlink_call: Passing size argument "80UL" implies readlink() can return up to "80UL" bytes.
libfabric-1.12.1/prov/psm2/src/psmx2_init.c:481: readlink_assign: Assigning: "n" = readlink().
libfabric-1.12.1/prov/psm2/src/psmx2_init.c:490: readlink: "buffer[n]" is essentially buffer[sizeof(buffer)] which is an off-by-one error.
#  488|         }
#  489|   
#  490|->       buffer[n] = '\0';
#  491|         if ((s = strrchr(buffer, '/')))
#  492|             s++;

Error: CLANG_WARNING: [#def70]
libfabric-1.12.1/prov/psm2/src/psmx2_msg.c:259:31: warning[core.NullDereference]: Array access (via field 'i') results in a null pointer dereference
#  257|         assert(context);
#  258|         fi_context = context;
#  259|->       PSMX2_CTXT_TYPE(fi_context) = PSMX2_SEND_CONTEXT;
#  260|         PSMX2_CTXT_USER(fi_context) = (void *)buf;
#  261|         PSMX2_CTXT_EP(fi_context) = ep_priv;

Error: CLANG_WARNING: [#def71]
libfabric-1.12.1/prov/psm2/src/psmx2_tagged.c:78:51: warning[core.uninitialized.Assign]: Assigned value is garbage or undefined
#   76|             if (flags & FI_CLAIM) {
#   77|                 if (context)
#   78|->                   PSMX2_CTXT_REQ((struct fi_context *)context) = req;
#   79|             } else if (flags & FI_DISCARD) {
#   80|                 if (!psm2_mq_imrecv(ep_priv->rx->psm2_mq, 0,

Error: CLANG_WARNING: [#def72]
libfabric-1.12.1/prov/psm2/src/psmx2_tagged.c:80:10: warning[core.CallAndMessage]: 5th function call argument is an uninitialized value
#   78|                     PSMX2_CTXT_REQ((struct fi_context *)context) = req;
#   79|             } else if (flags & FI_DISCARD) {
#   80|->               if (!psm2_mq_imrecv(ep_priv->rx->psm2_mq, 0,
#   81|                             NULL, 0, req, &req))
#   82|                     psm2_mq_wait2(&req, NULL);

Error: CLANG_WARNING: [#def73]
libfabric-1.12.1/prov/psm2/src/psmx2_tagged.c:584:31: warning[core.NullDereference]: Array access (via field 'i') results in a null pointer dereference
#  582|         assert(context);
#  583|         fi_context = context;
#  584|->       PSMX2_CTXT_TYPE(fi_context) = PSMX2_TSEND_CONTEXT;
#  585|         PSMX2_CTXT_USER(fi_context) = (void *)buf;
#  586|         PSMX2_CTXT_EP(fi_context) = ep_priv;

Error: CLANG_WARNING: [#def74]
libfabric-1.12.1/prov/psm2/src/psmx2_tagged.c:34: included_from: Included from here.
libfabric-1.12.1/prov/psm2/src/psmx2_trigger.h:598:31: warning[core.NullDereference]: Dereference of null pointer
#  596|   
#  597|     trigger->op = PSMX2_TRIGGERED_TSEND;
#  598|->   trigger->cntr = container_of(ctxt->trigger.threshold.cntr,
#  599|                      struct psmx2_fid_cntr, cntr);
#  600|     trigger->threshold = ctxt->trigger.threshold.threshold;

Error: CLANG_WARNING: [#def75]
libfabric-1.12.1/prov/psm2/src/psmx2_msg.c:34: included_from: Included from here.
libfabric-1.12.1/prov/psm2/src/psmx2_trigger.h:688:31: warning[core.NullDereference]: Dereference of null pointer
#  686|   
#  687|     trigger->op = PSMX2_TRIGGERED_SEND;
#  688|->   trigger->cntr = container_of(ctxt->trigger.threshold.cntr,
#  689|                      struct psmx2_fid_cntr, cntr);
#  690|     trigger->threshold = ctxt->trigger.threshold.threshold;

Error: CLANG_WARNING: [#def76]
libfabric-1.12.1/prov/psm2/src/psmx2_rma.c:34: included_from: Included from here.
libfabric-1.12.1/prov/psm2/src/psmx2_trigger.h:809:31: warning[core.NullDereference]: Dereference of null pointer
#  807|   
#  808|     trigger->op = PSMX2_TRIGGERED_WRITE;
#  809|->   trigger->cntr = container_of(ctxt->trigger.threshold.cntr,
#  810|                      struct psmx2_fid_cntr, cntr);
#  811|     trigger->threshold = ctxt->trigger.threshold.threshold;

Error: CHECKED_RETURN (CWE-252): [#def77]
libfabric-1.12.1/prov/psm2/src/psmx2_trx_ctxt.c:111: check_return: Calling "pthread_create" without checking return value (as is done elsewhere 15 out of 18 times).
libfabric-1.12.1/prov/psm2/src/psmx2_domain.c:141: example_assign: Example 1: Assigning: "err" = return value from "pthread_create(&domain->progress_thread, NULL, psmx2_progress_func, (void *)domain)".
libfabric-1.12.1/prov/psm2/src/psmx2_domain.c:143: example_checked: Example 1 (cont.): "err" has its value checked in "err".
libfabric-1.12.1/prov/psm2/src/psmx2_wait.c:118: example_assign: Example 2: Assigning: "err" = return value from "pthread_create(&psmx2_wait_thread, &attr, psmx2_wait_progress, (void *)fabric)".
libfabric-1.12.1/prov/psm2/src/psmx2_wait.c:120: example_checked: Example 2 (cont.): "err" has its value checked in "err".
libfabric-1.12.1/prov/psm3/psm3/psm_mq.c:1413: example_checked: Example 3: "pthread_create(&mq->mq_perf_data.perf_print_thread, NULL, psmi_mq_print_stats_thread, (void *)mq)" has its value checked in "pthread_create(&mq->mq_perf_data.perf_print_thread, NULL, psmi_mq_print_stats_thread, (void *)mq)".
libfabric-1.12.1/prov/psm3/psm3/psm_stats.c:289: example_checked: Example 4: "pthread_create(&perf_print_thread, NULL, psmi_print_stats_thread, NULL)" has its value checked in "pthread_create(&perf_print_thread, NULL, psmi_print_stats_thread, NULL)".
libfabric-1.12.1/prov/psm3/psm3/ptl_ips/ptl_rcvthread.c:145: example_checked: Example 5: "pthread_create(&rcvc->hdrq_threadid, NULL, ips_ptl_pollintr, ptl->rcvthread)" has its value checked in "pthread_create(&rcvc->hdrq_threadid, NULL, ips_ptl_pollintr, ptl->rcvthread)".
#  109|             disconn->trx_ctxt = trx_ctxt;
#  110|             disconn->epaddr = epaddr;
#  111|->           pthread_create(&disconnect_thread, NULL,
#  112|                        disconnect_func, disconn);
#  113|             pthread_detach(disconnect_thread);
mwheinz commented 3 years ago

Sean, how high a priority is this? I'm assuming this is a new static code checker that RHEL is using?

shefty commented 3 years ago

I wouldn't consider it a high-priority myself. It may be worth scanning over and seeing if any significant issues are being identified.

I don't have details on the checker, but I assume it's basically the same as what we have with coverity. Only with coverity, we've already analyzed false positives and marked them as such. There are only a couple open coverity issues.

acgoldma commented 3 years ago

From what I have been able to find, covscan is the tool used to preform the scan which is part of the Coverity Scan's Tool Suite. That being said, the coverity scan for ofiwg/libfabric does not appear to be scanning all providers:

See all the providers with "0" lines of code...

Analysis Metrics per Components

Component Name | Pattern | Ignore | Line of Code | Defect density -- | -- | -- | -- | -- Core | (/src/.*\|/include/.*) | No | 16,870 | 0.00 GNI provider | /prov/gni/.* | No | 0 | N/A PSM provider | /prov/psm/.* | No | 0 | N/A PSM2 provider | /prov/psm2/.* | No | 0 | N/A RxD provider | /prov/rxd/.* | No | 4,505 | 0.00 RxM provider | /prov/rxm/.* | No | 8,094 | 0.25 sockets provider | /prov/sockets/.* | No | 12,039 | 0.00 UDP provider | /prov/udp/.* | No | 957 | 0.00 usNIC provider | /prov/usnic/.* | No | 0 | N/A util provider | /prov/util/.* | No | 9,411 | 0.11 Verbs provider | /prov/verbs/.* | No | 0 | N/A ND provider | /prov/netdir/.* | No | 0 | N/A SHM provider | /prov/shm/.* | No | 4,242 | 0.00 fabtests | /fabtests/.* | No | 0 | N/A TCP provider | /prov/tcp/* | No | 3,670 | 0.00 EFA provider | (/prov/efa/src/.*\|/prov/efa/src/rxr/.*) | No | 0 | N/A PSM3 provider | /prov/psm3/.* | No | 0 | N/A Other | .* | No | 33,721 | 0.00
shefty commented 3 years ago

Correct - we've only enabled the scan for some providers. We can add more if the maintainers will respond to the issues, and coverity can build it..

rajachan commented 3 years ago

Hmm, that's odd. When I added the Travis config, I thought I enabled it for all providers. Where is the opt-in list? The analysis settings isn't ignoring any component from what I can tell.

shefty commented 3 years ago

Can coverity build the other providers? E.g. usnic, verbs, psm/2 require external libraries.

rajachan commented 3 years ago

We build in the Travis environment and just push results to Coverity. The Travis environment pulls in rdma-core and libnl, so in principle verbs/EFA/usnic should at least compile: https://github.com/ofiwg/libfabric/blob/main/.travis.yml#L10 https://github.com/ofiwg/libfabric/commit/b083456944f020a0d81241144b5dc381911afe8c

Maybe configure is silently dropping the providers, given the coverity build uses vanilla configure options. I can take a closer look later today.

shefty commented 3 years ago

Thanks - I'm trying to figure this out too. I would have thought the other providers would have at least built and have a valid report. I don't think psm3 requires external libraries.

rajachan commented 3 years ago

Like I mentioned in the call today, I couldn't figure out why some of them are not getting scanned. Looking at the list of providers that are being skipped, my suspicion is still that the configure is silently disabling them, so there's nothing to feed into the static analyzer. I'll send up a PR to explicitly enable providers that we can build in Travis and see if that does it. Unfortunately, we would have to merge that to main before we can see results.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 360 days with no activity. Remove stale label or comment, otherwise it will be closed in 7 days.

timothom64 commented 2 years ago

I just checked the last covarity can here: https://scan.coverity.com/projects/ofiwg-libfabric

PSM2 is not being scanned, Opx is. There isn't much development going on with PSM2 atm, but eventually we should add it to things being scanned, as it will still be supported for some time.

shefty commented 2 years ago

I can add it to the scan if someone will focus on fixing the issues.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 360 days with no activity. Remove stale label or comment, otherwise it will be closed in 7 days.