PCRE2Project / pcre2

PCRE2 development is now based here.

Coverity defect: Illegal address computation #423

CrypticEcho closed this issue 5 months ago

CrypticEcho commented 5 months ago

PCRE2 version: 10.43

Our Coverity scan has reported a defect in PCRE2: Illegal address computation.

If this address is later used for bounds checking another pointer before dereferencing, an overrun may occur due to the weak guard.

In pcre2_compile_8: An illegal address is computed, which either precedes a buffer or is more than just-past its end (CWE-119)
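The "weak guard" wording above can be illustrated with a minimal sketch. It is not PCRE2 code: `guard_accepts` and `clamp_limit` are hypothetical helpers, and indices stand in for pointers so that the example itself never forms an out-of-range address. The point is only that a bounds check is as trustworthy as the limit it compares against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a guard that trusts whatever limit it is given. */
static int guard_accepts(size_t index, size_t limit)
{
  return index < limit;
}

/* Hypothetical helper: validate the limit against the real buffer length. */
static size_t clamp_limit(size_t limit, size_t buflen)
{
  return (limit > buflen) ? buflen : limit;
}

/* Small self-check showing the difference between the two guards. */
void weak_guard_self_check(void)
{
  const size_t buflen = 1;        /* e.g. "" occupies one byte */
  const size_t bogus = SIZE_MAX;  /* a limit that was never validated */
  const size_t index = 100;       /* clearly outside the buffer */

  /* The unvalidated limit lets an out-of-range index through. */
  assert(guard_accepts(index, bogus) == 1);

  /* Clamping the limit first makes the guard reject it. */
  assert(guard_accepts(index, clamp_limit(bogus, buflen)) == 0);
}
```

Calling `weak_guard_self_check()` from any test driver exercises both cases. Whether the flagged address in `pcre2_compile()` is ever used as such a guard is a separate question that the trace alone does not answer.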

Specific details: pcre2_compile.c


10967 /* Errors discovered in parse_regex() set the offset value in the compile
10968 block. Errors discovered before it is called must compute it from the ptr
10969 value. After parse_regex() is called, the offset in the compile block is set to
10970 the end of the pattern, but certain errors in compile_regex() may reset it if
10971 an offset is available in the parsed pattern. */
10972
10973 HAD_CB_ERROR:

CID 1066215: (#5 of 5): Illegal address computation (OVERRUN)

    1. illegal_address: pattern + cb.erroroffset evaluates to an address that is at byte offset 4294967295 of an array of 1 bytes.
10974 ptr = pattern + cb.erroroffset;
10975
10976 HAD_EARLY_ERROR:
10977 *erroroffset = ptr - pattern;
10978
10979 HAD_ERROR:
10980 *errorptr = errorcode;
10981 pcre2_code_free(re);
10982 re = NULL;
    45. Jumping to label EXIT.
10983 goto EXIT;
10984 }

carenas commented 5 months ago

> illegal_address: pattern + cb.erroroffset evaluates to an address that is at byte offset 4294967295 of an array of 1 bytes.

this seems suspiciously more like a bug in Coverity, as cb.erroroffset == UINT_MAX and pattern is usually longer than 1 byte, especially if parse_regex() returned an error with that offset.

maybe it would be a good idea to trace all previous 43 steps to figure out where the logic went wrong.
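To make the arithmetic in the report concrete, here is a minimal sketch, not taken from PCRE2. The value 4294967295 is what an all-bits-set `size_t` looks like when `size_t` is 32 bits wide, as in the trace's `~((size_t)0)` annotation. `OFFSET_UNSET` and `reportable_offset` are hypothetical names, and falling back to the pattern length is an assumption made for illustration, not the project's fix. The helper works on offsets only, so no `pattern + offset` pointer is ever formed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an "unset" sentinel: all bits set.
   With a 32-bit size_t this is 4294967295. */
#define OFFSET_UNSET (~(size_t)0)

/* Hypothetical helper, not PCRE2 code: turn a stored error offset into a
   value that is safe to report, without computing pattern + offset.
   Assumption: an unset or out-of-range offset falls back to the end of
   the pattern. */
static size_t reportable_offset(size_t stored, size_t patlen)
{
  if (stored == OFFSET_UNSET || stored > patlen) return patlen;
  return stored;
}

/* Small self-check covering the cases discussed in this thread. */
void reportable_offset_self_check(void)
{
  assert(reportable_offset(3, 10) == 3);              /* in range: kept   */
  assert(reportable_offset(10, 10) == 10);            /* end of pattern   */
  assert(reportable_offset(OFFSET_UNSET, 10) == 10);  /* sentinel clamped */
  assert(reportable_offset(OFFSET_UNSET, 0) == 0);    /* empty pattern    */
}
```

In the scenario Coverity describes (`pattern` aliased to `""`, `patlen == 0`, offset 4294967295), such a helper would report offset 0. Tracing the earlier steps is still needed to show whether that combination can really occur.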

CrypticEcho commented 5 months ago

It is not uncommon for Coverity to report false positives.

Here is some additional context: PCRE2_CODE_UNIT_WIDTH=8

pcre2_compile.c: ...

10103
10104 /*************************************************
10105 *     External function to compile a pattern     *
10106 *************************************************/
10107
10108 /* This function reads a regular expression in the form of a string and returns
10109 a pointer to a block of store holding a compiled version of the expression.
10110
10111 Arguments:
10112  pattern       the regular expression
10113  patlen        the length of the pattern, or PCRE2_ZERO_TERMINATED
10114  options       option bits
10115  errorptr      pointer to errorcode
10116  erroroffset   pointer to error offset
10117  ccontext      points to a compile context or is NULL
10118
10119 Returns:        pointer to compiled data block, or NULL on error,
10120                with errorcode and erroroffset set
10121 */
10122
10123 PCRE2_EXP_DEFN pcre2_code * PCRE2_CALL_CONVENTION
10124 pcre2_compile(PCRE2_SPTR pattern, PCRE2_SIZE patlen, uint32_t options,
10125  int *errorptr, PCRE2_SIZE *erroroffset, pcre2_compile_context *ccontext)
10126 {
10127 BOOL utf;                             /* Set TRUE for UTF mode */
10128 BOOL ucp;                             /* Set TRUE for UCP mode */
10129 BOOL has_lookbehind = FALSE;          /* Set TRUE if a lookbehind is found */
10130 BOOL zero_terminated;                 /* Set TRUE for zero-terminated pattern */
10131 pcre2_real_code *re = NULL;           /* What we will return */
10132 compile_block cb;                     /* "Static" compile-time data */
10133 const uint8_t *tables;                /* Char tables base pointer */
10134 
10135 PCRE2_UCHAR *code;                    /* Current pointer in compiled code */
10136 PCRE2_SPTR codestart;                 /* Start of compiled code */
10137 PCRE2_SPTR ptr;                       /* Current pointer in pattern */
10138 uint32_t *pptr;                       /* Current pointer in parsed pattern */
10139 
10140 PCRE2_SIZE length = 1;                /* Allow for final END opcode */
10141 PCRE2_SIZE usedlength;                /* Actual length used */
10142 PCRE2_SIZE re_blocksize;              /* Size of memory block */
10143 PCRE2_SIZE big32count = 0;            /* 32-bit literals >= 0x80000000 */
10144 PCRE2_SIZE parsed_size_needed;        /* Needed for parsed pattern */
10145
10146 uint32_t firstcuflags, reqcuflags;    /* Type of first/req code unit */
10147 uint32_t firstcu, reqcu;              /* Value of first/req code unit */
10148 uint32_t setflags = 0;                /* NL and BSR set flags */
10149
10150 uint32_t skipatstart;                 /* When checking (*UTF) etc */
10151 uint32_t limit_heap  = UINT32_MAX;
10152 uint32_t limit_match = UINT32_MAX;    /* Unset match limits */
10153 uint32_t limit_depth = UINT32_MAX;
10154
10155 int newline = 0;                      /* Unset; can be set by the pattern */
10156 int bsr = 0;                          /* Unset; can be set by the pattern */
10157 int errorcode = 0;                    /* Initialize to avoid compiler warn */
10158 int regexrc;                          /* Return from compile */
10159
10160 uint32_t i;                           /* Local loop counter */
10161
10162 /* Comments at the head of this file explain about these variables. */
10163
10164 uint32_t stack_groupinfo[GROUPINFO_DEFAULT_SIZE];
10165 uint32_t stack_parsed_pattern[PARSED_PATTERN_DEFAULT_SIZE];
10166 named_group named_groups[NAMED_GROUP_LIST_SIZE];
10167
10168 /* The workspace is used in different ways in the different compiling phases.
10169 It needs to be 16-bit aligned for the preliminary parsing scan. */
10170
10171 uint32_t c16workspace[C16_WORK_SIZE];
10172 PCRE2_UCHAR *cworkspace = (PCRE2_UCHAR *)c16workspace;
10173
10174
10175 /* -------------- Check arguments and set up the pattern ----------------- */
10176
10177 /* There must be error code and offset pointers. */
10178
    1. Condition errorptr == NULL, taking false branch.
    2. Condition erroroffset == NULL, taking false branch.
10179 if (errorptr == NULL || erroroffset == NULL) return NULL;
10180 *errorptr = ERR0;
10181 *erroroffset = 0;
10182
10183 /* There must be a pattern, but NULL is allowed with zero length. */
10184
    3. Condition pattern == NULL, taking true branch.
10185 if (pattern == NULL)
10186  {
    4. Condition patlen == 0, taking true branch.
    5. alias: Assigning: pattern = "". pattern now points to byte 0 of "" (which consists of 1 bytes).
    6. Falling through to end of if statement.
10187  if (patlen == 0) pattern = (PCRE2_SPTR)""; else
10188    {
10189    *errorptr = ERR16;
10190    return NULL;
10191    }
10192  }
10193
10194 /* A NULL compile context means "use a default context" */
10195
    7. Condition ccontext == NULL, taking true branch.
10196 if (ccontext == NULL)
10197  ccontext = (pcre2_compile_context *)(&PRIV(default_compile_context));
10198
10199 /* PCRE2_MATCH_INVALID_UTF implies UTF */
10200
    8. Condition (options & 0x4000000U) != 0, taking true branch.
10201 if ((options & PCRE2_MATCH_INVALID_UTF) != 0) options |= PCRE2_UTF;
10202
10203 /* Check that all undefined public option bits are zero. */
10204
    9. Condition (options & 402653184U /* ~(((((((((((((((((((((((((((((0x80000000U | 4U) | 8U) | 0x20000000U) | 0x100U) | 0x2000000U) | 0x4000000U) | 0x10000U) | 0x40000000U) | 0x800000U) | 0x80000U) | 1U) | 2U) | 0x200000U) | 0x400000U) | 0x10U) | 0x20U) | 0x40U) | 0x80U) | 0x1000000U) | 0x200U) | 0x400U) | 0x100000U) | 0x800U) | 0x1000U) | 0x2000U) | 0x4000U) | 0x8000U) | 0x20000U) | 0x40000U) */) != 0, taking false branch.
    10. Condition (ccontext->extra_options & 4294959104U /* ~((((((((((((8U | 4U) | 0x80U) | 1U) | 2U) | 0x10U) | 0x20U) | 0x40U) | 0x100U ) | 0x200U) | 0x400U) | 0x800U) | 0x1000U) */) != 0, taking false branch.
10205 if ((options & ~PUBLIC_COMPILE_OPTIONS) != 0 ||
10206    (ccontext->extra_options & ~PUBLIC_COMPILE_EXTRA_OPTIONS) != 0)
10207  {
10208  *errorptr = ERR17;
10209  return NULL;
10210  }
10211
    11. Condition (options & 0x2000000U) != 0, taking true branch.
    12. Condition (options & 427228915U /* ~((((((((((0x80000000U | 4U) | 8U) | 0x20000000U) | 0x100U) | 0x2000000U) | 0x4000000U) | 0x10000U) | 0x40000000U) | 0x800000U) | 0x80000U) */) != 0, taking false branch.

    13. Condition (ccontext->extra_options & 4294967155U /* ~((8U | 4U) | 0x80U) */) != 0, taking false branch.
10212 if ((options & PCRE2_LITERAL) != 0 &&
10213    ((options & ~PUBLIC_LITERAL_COMPILE_OPTIONS) != 0 ||
10214     (ccontext->extra_options & ~PUBLIC_LITERAL_COMPILE_EXTRA_OPTIONS) != 0))
10215  {
10216  *errorptr = ERR92;
10217  return NULL;
10218  }
10219
10220 /* A zero-terminated pattern is indicated by the special length value
10221 PCRE2_ZERO_TERMINATED. Check for an overlong pattern. */
10222
    14. Condition patlen == 4294967295U /* ~((size_t)0) */, taking false branch.
    15. Condition zero_terminated = patlen == 4294967295U /* ~((size_t)0) */, taking false branch.
10223 if ((zero_terminated = (patlen == PCRE2_ZERO_TERMINATED)))
10224  patlen = PRIV(strlen)(pattern);
10225
    16. Condition patlen > ccontext->max_pattern_length, taking false branch.
10226 if (patlen > ccontext->max_pattern_length)
10227  {
10228  *errorptr = ERR88;
10229  return NULL;
10230  }
10231
10232 /* From here on, all returns from this function should end up going via the
10233 EXIT label. */
10234
10235
10236 /* ------------ Initialize the "static" compile data -------------- */
10237
    17. Condition ccontext->tables != NULL, taking true branch.
10238 tables = (ccontext->tables != NULL)? ccontext->tables : PRIV(default_tables);
10239
10240 cb.lcc = tables + lcc_offset;          /* Individual */
10241 cb.fcc = tables + fcc_offset;          /*   character */
10242 cb.cbits = tables + cbits_offset;      /*      tables */
10243 cb.ctypes = tables + ctypes_offset;
10244
10245 cb.assert_depth = 0;
10246 cb.bracount = 0;
10247 cb.cx = ccontext;
10248 cb.dupnames = FALSE;
10249 cb.end_pattern = pattern + patlen;
10250 cb.erroroffset = 0;
10251 cb.external_flags = 0;
10252 cb.external_options = options;
10253 cb.groupinfo = stack_groupinfo;
10254 cb.had_recurse = FALSE;
10255 cb.lastcapture = 0;
10256 cb.max_lookbehind = 0;                               /* Max encountered */
10257 cb.max_varlookbehind = ccontext->max_varlookbehind;  /* Limit */
10258 cb.name_entry_size = 0;
10259 cb.name_table = NULL;
10260 cb.named_groups = named_groups;
10261 cb.named_group_list_size = NAMED_GROUP_LIST_SIZE;
10262 cb.names_found = 0;
10263 cb.parens_depth = 0;
10264 cb.parsed_pattern = stack_parsed_pattern;
10265 cb.req_varyopt = 0;
10266 cb.start_code = cworkspace;
10267 cb.start_pattern = pattern;
10268 cb.start_workspace = cworkspace;
10269 cb.workspace_size = COMPILE_WORK_SIZE;
10270
10271 /* Maximum back reference and backref bitmap. The bitmap records up to 31 back
10272 references to help in deciding whether (.*) can be treated as anchored or not.
10273 */
10274
10275 cb.top_backref = 0;
10276 cb.backref_map = 0;
10277
10278 /* Escape sequences \1 to \9 are always back references, but as they are only
10279 two characters long, only two elements can be used in the parsed_pattern
10280 vector. The first contains the reference, and we'd like to use the second to
10281 record the offset in the pattern, so that forward references to non-existent
10282 groups can be diagnosed later with an offset. However, on 64-bit systems,
10283 PCRE2_SIZE won't fit. Instead, we have a vector of offsets for the first
10284 occurrence of \1 to \9, indexed by the second parsed_pattern value. All other
10285 references have enough space for the offset to be put into the parsed pattern.
10286 */
10287
    18. Condition i < 10, taking true branch.
    19. Jumping back to the beginning of the loop.
    20. Condition i < 10, taking true branch.
    21. Jumping back to the beginning of the loop.
    22. Condition i < 10, taking false branch.
10288 for (i = 0; i < 10; i++) cb.small_ref_offset[i] = PCRE2_UNSET;
10289
10290
10291 /* --------------- Start looking at the pattern --------------- */
10292
10293 /* Unless PCRE2_LITERAL is set, check for global one-time option settings at
10294 the start of the pattern, and remember the offset to the actual regex. With
10295 valgrind support, make the terminator of a zero-terminated pattern
10296 inaccessible. This catches bugs that would otherwise only show up for
10297 non-zero-terminated patterns. */
10298
10299 #ifdef SUPPORT_VALGRIND
10300 if (zero_terminated) VALGRIND_MAKE_MEM_NOACCESS(pattern + patlen, CU2BYTES(1));
10301 #endif
10302
10303 ptr = pattern;
10304 skipatstart = 0;
10305
    23. Condition (options & 0x2000000U) == 0, taking false branch.
10306 if ((options & PCRE2_LITERAL) == 0)
10307  {
10308  while (patlen - skipatstart >= 2 &&
10309         ptr[skipatstart] == CHAR_LEFT_PARENTHESIS &&
10310         ptr[skipatstart+1] == CHAR_ASTERISK)
10311    {
10312    for (i = 0; i < sizeof(pso_list)/sizeof(pso); i++)
10313      {
10314      uint32_t c, pp;
10315      const pso *p = pso_list + i;
10316
10317      if (patlen - skipatstart - 2 >= p->length &&
10318          PRIV(strncmp_c8)(ptr + skipatstart + 2, (char *)(p->name),
10319            p->length) == 0)
10320        {
10321        skipatstart += p->length + 2;
10322        switch(p->type)
10323          {
10324          case PSO_OPT:
10325          cb.external_options |= p->value;
10326          break;
10327
10328          case PSO_FLG:
10329          setflags |= p->value;
10330          break;
10331
10332          case PSO_NL:
10333          newline = p->value;
10334          setflags |= PCRE2_NL_SET;
10335          break;
10336
10337          case PSO_BSR:
10338          bsr = p->value;
10339          setflags |= PCRE2_BSR_SET;
10340          break;
10341
10342          case PSO_LIMM:
10343          case PSO_LIMD:
10344          case PSO_LIMH:
10345          c = 0;
10346          pp = skipatstart;
10347          if (!IS_DIGIT(ptr[pp]))
10348            {
10349            errorcode = ERR60;
10350            ptr += pp;
10351            goto HAD_EARLY_ERROR;
10352            }
10353          while (IS_DIGIT(ptr[pp]))
10354            {
10355            if (c > UINT32_MAX / 10 - 1) break;   /* Integer overflow */
10356            c = c*10 + (ptr[pp++] - CHAR_0);
10357            }
10358          if (ptr[pp++] != CHAR_RIGHT_PARENTHESIS)
10359            {
10360            errorcode = ERR60;
10361            ptr += pp;
10362            goto HAD_EARLY_ERROR;
10363            }
10364          if (p->type == PSO_LIMH) limit_heap = c;
10365            else if (p->type == PSO_LIMM) limit_match = c;
10366            else limit_depth = c;
10367          skipatstart += pp - skipatstart;
10368          break;
10369          }
10370        break;   /* Out of the table scan loop */
10371        }
10372      }
10373    if (i >= sizeof(pso_list)/sizeof(pso)) break;   /* Out of pso loop */
10374    }
10375  }
10376
10377 /* End of pattern-start options; advance to start of real regex. */
10378
10379 ptr += skipatstart;
10380
10381 /* Can't support UTF or UCP if PCRE2 was built without Unicode support. */
10382
10383 #ifndef SUPPORT_UNICODE
10384 if ((cb.external_options & (PCRE2_UTF|PCRE2_UCP)) != 0)
10385  {
10386  errorcode = ERR32;
10387  goto HAD_EARLY_ERROR;
10388  }
10389 #endif
10390
10391 /* Check UTF. We have the original options in 'options', with that value as
10392 modified by (*UTF) etc in cb->external_options. The extra option
10393 PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is not permitted in UTF-16 mode because the
10394 surrogate code points cannot be represented in UTF-16. */
10395
    24. Condition (cb.external_options & 0x80000U) != 0, taking true branch.
10396 utf = (cb.external_options & PCRE2_UTF) != 0;
    25. Condition utf, taking true branch.
10397 if (utf)
10398  {
    26. Condition (options & 0x1000U) != 0, taking false branch.
10399  if ((options & PCRE2_NEVER_UTF) != 0)
10400    {
10401    errorcode = ERR74;
10402    goto HAD_EARLY_ERROR;
10403    }
    27. Condition (options & 0x40000000U) == 0, taking true branch.
    28. Condition (errorcode = _pcre2_valid_utf_8(pattern, patlen, erroroffset)) != 0, taking false branch.
10404  if ((options & PCRE2_NO_UTF_CHECK) == 0 &&
10405       (errorcode = PRIV(valid_utf)(pattern, patlen, erroroffset)) != 0)
10406    goto HAD_ERROR;  /* Offset was set by valid_utf() */
10407
10408 #if PCRE2_CODE_UNIT_WIDTH == 16
10409  if ((ccontext->extra_options & PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES) != 0)
10410    {
10411    errorcode = ERR91;
10412    goto HAD_EARLY_ERROR;
10413    }
10414 #endif
10415  }
10416
10417 /* Check UCP lockout. */
10418
    29. Condition (cb.external_options & 0x20000U) != 0, taking false branch.
10419 ucp = (cb.external_options & PCRE2_UCP) != 0;
    30. Condition ucp, taking false branch.
10420 if (ucp && (cb.external_options & PCRE2_NEVER_UCP) != 0)
10421  {
10422  errorcode = ERR75;
10423  goto HAD_EARLY_ERROR;
10424  }
10425
10426 /* Process the BSR setting. */
10427
    31. Condition bsr == 0, taking true branch.
10428 if (bsr == 0) bsr = ccontext->bsr_convention;
10429
10430 /* Process the newline setting. */
10431
    32. Condition newline == 0, taking true branch.
10432 if (newline == 0) newline = ccontext->newline_convention;
10433 cb.nltype = NLTYPE_FIXED;
    33. Switch case value 1.
10434 switch(newline)
10435  {
10436  case PCRE2_NEWLINE_CR:
10437  cb.nllen = 1;
10438  cb.nl[0] = CHAR_CR;
    34. Breaking from switch.
10439  break;
10440
10441  case PCRE2_NEWLINE_LF:
10442  cb.nllen = 1;
10443  cb.nl[0] = CHAR_NL;
10444  break;
10445
10446  case PCRE2_NEWLINE_NUL:
10447  cb.nllen = 1;
10448  cb.nl[0] = CHAR_NUL;
10449  break;
10450
10451  case PCRE2_NEWLINE_CRLF:
10452  cb.nllen = 2;
10453  cb.nl[0] = CHAR_CR;
10454  cb.nl[1] = CHAR_NL;
10455  break;
10456
10457  case PCRE2_NEWLINE_ANY:
10458  cb.nltype = NLTYPE_ANY;
10459  break;
10460
10461  case PCRE2_NEWLINE_ANYCRLF:
10462  cb.nltype = NLTYPE_ANYCRLF;
10463  break;
10464
10465  default:
10466  errorcode = ERR56;
10467  goto HAD_EARLY_ERROR;
10468  }
10469
10470 /* Pre-scan the pattern to do two things: (1) Discover the named groups and
10471 their numerical equivalents, so that this information is always available for
10472 the remaining processing. (2) At the same time, parse the pattern and put a
10473 processed version into the parsed_pattern vector. This has escapes interpreted
10474 and comments removed (amongst other things).
10475 
10476 In all but one case, when PCRE2_AUTO_CALLOUT is not set, the number of unsigned
10477 32-bit ints in the parsed pattern is bounded by the length of the pattern plus
10478 one (for the terminator) plus four if PCRE2_EXTRA_WORD or PCRE2_EXTRA_LINE is
10479 set. The exceptional case is when running in 32-bit, non-UTF mode, when literal
10480 characters greater than META_END (0x80000000) have to be coded as two units. In
10481 this case, therefore, we scan the pattern to check for such values. */
10482
10483 #if PCRE2_CODE_UNIT_WIDTH == 32
10484 if (!utf)
10485  {
10486  PCRE2_SPTR p;
10487  for (p = ptr; p < cb.end_pattern; p++) if (*p >= META_END) big32count++;
10488  }
10489 #endif
10490
10491 /* Ensure that the parsed pattern buffer is big enough. When PCRE2_AUTO_CALLOUT
10492 is set we have to assume a numerical callout (4 elements) for each character
10493 plus one at the end. This is overkill, but memory is plentiful these days. For
10494 many smaller patterns the vector on the stack (which was set up above) can be
10495 used. */
10496
10497 parsed_size_needed = patlen - skipatstart + big32count;
10498
    35. Condition (ccontext->extra_options & (12U /* 4U | 8U */)) != 0, taking true branch.
10499 if ((ccontext->extra_options &
10500     (PCRE2_EXTRA_MATCH_WORD|PCRE2_EXTRA_MATCH_LINE)) != 0)
10501  parsed_size_needed += 4;
10502
    36. Condition (options & 4U) != 0, taking true branch.
10503 if ((options & PCRE2_AUTO_CALLOUT) != 0)
10504  parsed_size_needed = (parsed_size_needed + 1) * 5;
10505
    37. Condition parsed_size_needed >= 1024, taking false branch.
10506 if (parsed_size_needed >= PARSED_PATTERN_DEFAULT_SIZE)
10507  {
10508  uint32_t *heap_parsed_pattern = ccontext->memctl.malloc(
10509    (parsed_size_needed + 1) * sizeof(uint32_t), ccontext->memctl.memory_data);
10510  if (heap_parsed_pattern == NULL)
10511    {
10512    *errorptr = ERR21;
10513    goto EXIT;
10514    }
10515  cb.parsed_pattern = heap_parsed_pattern;
10516  }
10517 cb.parsed_pattern_end = cb.parsed_pattern + parsed_size_needed + 1;
10518
10519 /* Do the parsing scan. */
10520
10521 errorcode = parse_regex(ptr, cb.external_options, &has_lookbehind, &cb);
    38. Condition errorcode != 0, taking false branch.
10522 if (errorcode != 0) goto HAD_CB_ERROR;
10523
10524 /* If there are any lookbehinds, scan the parsed pattern to figure out their
10525 lengths. Workspace is needed to remember whether numbered groups are or are not
10526 of limited length, and if limited, what the minimum and maximum lengths are.
10527 This caching saves re-computing the length of any group that is referenced more
10528 than once, which is particularly relevant when recursion is involved.
10529 Unnumbered groups do not have this exposure because they cannot be referenced.
10530 If there are sufficiently few groups, the default index vector on the stack, as
10531 set up above, can be used. Otherwise we have to get/free some heap memory. The
10532 vector must be initialized to zero. */
10533
    39. Condition has_lookbehind, taking true branch.
10534 if (has_lookbehind)
10535  {
10536  int loopcount = 0;
    40. Condition cb.bracount >= 128U /* 256 / 2 */, taking false branch.
10537  if (cb.bracount >= GROUPINFO_DEFAULT_SIZE/2)
10538    {
10539    cb.groupinfo = ccontext->memctl.malloc(
10540      (2 * (cb.bracount + 1))*sizeof(uint32_t), ccontext->memctl.memory_data);
10541    if (cb.groupinfo == NULL)
10542      {
10543      errorcode = ERR21;
10544      cb.erroroffset = 0;
10545      goto HAD_CB_ERROR;
10546      }
10547    }
10548  memset(cb.groupinfo, 0, (2 * cb.bracount + 1) * sizeof(uint32_t));
    41. write_constant: Write the value 4294967295 into cb.erroroffset.
10549  errorcode = check_lookbehinds(cb.parsed_pattern, NULL, NULL, &cb, &loopcount);
    42. Condition errorcode != 0, taking true branch.
    43. Jumping to label HAD_CB_ERROR.
10550  if (errorcode != 0) goto HAD_CB_ERROR;
10551  }
10552
10553 /* For debugging, there is a function that shows the parsed pattern vector. */
10554
10555 #ifdef DEBUG_SHOW_PARSED
10556 fprintf(stderr, "+++ Pre-scan complete:\n");
10557 show_parsed(&cb);
10558 #endif
10559

...

10955 EXIT:
10956 #ifdef SUPPORT_VALGRIND
10957 if (zero_terminated) VALGRIND_MAKE_MEM_DEFINED(pattern + patlen, CU2BYTES(1));
10958 #endif
    46. Condition cb.parsed_pattern != stack_parsed_pattern, taking true branch.
10959 if (cb.parsed_pattern != stack_parsed_pattern)
10960  ccontext->memctl.free(cb.parsed_pattern, ccontext->memctl.memory_data);
    47. Condition cb.named_group_list_size > 20, taking false branch.
10961 if (cb.named_group_list_size > NAMED_GROUP_LIST_SIZE)
10962  ccontext->memctl.free((void *)cb.named_groups, ccontext->memctl.memory_data);
    49. Condition cb.groupinfo != stack_groupinfo, taking true branch.
10963 if (cb.groupinfo != stack_groupinfo)
10964  ccontext->memctl.free((void *)cb.groupinfo, ccontext->memctl.memory_data);
10965 return re;    /* Will be NULL after an error */
10966
10967 /* Errors discovered in parse_regex() set the offset value in the compile
10968 block. Errors discovered before it is called must compute it from the ptr
10969 value. After parse_regex() is called, the offset in the compile block is set to
10970 the end of the pattern, but certain errors in compile_regex() may reset it if
10971 an offset is available in the parsed pattern. */
10972
10973 HAD_CB_ERROR:

CID 1066215: (#5 of 5): Illegal address computation (OVERRUN)

    1. illegal_address: pattern + cb.erroroffset evaluates to an address that is at byte offset 4294967295 of an array of 1 bytes.
10974 ptr = pattern + cb.erroroffset;
10975
10976 HAD_EARLY_ERROR:
10977 *erroroffset = ptr - pattern;
10978
10979 HAD_ERROR:
10980 *errorptr = errorcode;
10981 pcre2_code_free(re);
10982 re = NULL;
    45. Jumping to label EXIT.
10983 goto EXIT;
10984 }
10985
10986 /* These #undefs are here to enable unity builds with CMake. */

carenas commented 5 months ago

> It is not uncommon for Coverity to report false positives.

not only Coverity, but most static analyzers get confused by the conditions in the code, which is why it is always better to double check their output fully.

in this case, the confusion seems to come from 99264dfc236e70b0315a16ebc28fc4c5ffa9235f, which has been included since 10.30 and which my proposed draft was meant to hopefully silence; could you apply it and run your scan again to see if that would be a possible "solution"?

CrypticEcho commented 5 months ago

Sure, we can try.

CrypticEcho commented 5 months ago

We are running into some logistics delays for testing this patch. I will close this issue for now, and will re-open if it is reproduced.