Closed Freaky closed 1 year ago
Can you do p data->kept
and p aux-uid
?
(gdb) p data->kept
$1 = {list = 0x0, num = 0, space = 0}
(gdb) p aux->uid
Cannot access memory at address 0x6f63736943232020
That's an interesting pointer -- it corresponds to the string #Cisco
.
Doesn't appear to be from any email I've received, but it does appear five times in /etc/services.
Can you show me the corresponding -vvvv output?
Unfortunately not. I wish I'd done that before I'd blown away my ability to reproduce it.
I can at least identify the message it was processing (indeed, had successfully delivered 175 times) when it crashed: a FreeBSD commit message. Looks unremarkable and bouncing it back to myself doesn't cause any issue.
OK well if you can't reproduce now there is probably not a lot more we can do.
.rw-r--r-- 43k freaky freaky 24 Jan 22:14 fdm-crash.stderr
.rw-r--r-- 74k freaky freaky 24 Jan 22:14 fdm-crash.stdout
.rw------- 14M freaky freaky 24 Jan 22:14 fdm.core
Identical pointer on aux.
stderr: https://gist.github.com/Freaky/4351d4996b80c993fcda4cbc313eca38
stdout: https://gist.github.com/Freaky/627338733e665ad721694b9bc0127045
I scrubbed my password, don't think there's much else sensitive in there.
Rubbing valgrind on it finds this shortly before the crash:
eda: trying (match) message 86 eda: matching from 0 to 6379 (size=7657, body=6379)
==15056== Invalid write of size 4
==15056== at 0x23895F: re_block (pcre.c:101)
==15056== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==15056== by 0x232C85: mail_match (mail-state.c:149)
==15056== by 0x21FAD0: fetch_match (child-fetch.c:177)
==15056== by 0x21F413: fetch_account (child-fetch.c:386)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
: eda: trying (match) message 86 eda: matching from 0 to 6379 (size=7657, body=6379)
eda: tried regexp "^X-Cron-Env: ." in headers, result now 0
...
eda: trying (match) message 86
eda: matching from 0 to 6379 (size=7657, body=6379)
==15056== Invalid write of size 4
==15056== at 0x23895F: re_block (pcre.c:101)
==15056== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==15056== by 0x232C85: mail_match (mail-state.c:149)
==15056== by 0x21FAD0: fetch_match (child-fetch.c:177)
==15056== by 0x21F413: fetch_account (child-fetch.c:386)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056== Address 0x671a420 is 0 bytes after a block of size 672 alloc'd
==15056== at 0x48500D5: calloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==15056== by 0x23D25A: xcalloc (xmalloc.c:93)
==15056== by 0x21F5BD: fetch_account (child-fetch.c:437)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056==
==15056== Invalid write of size 8
==15056== at 0x23896A: re_block (pcre.c:102)
==15056== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==15056== by 0x232C85: mail_match (mail-state.c:149)
==15056== by 0x21FAD0: fetch_match (child-fetch.c:177)
==15056== by 0x21F413: fetch_account (child-fetch.c:386)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056== Address 0x671a428 is 8 bytes after a block of size 672 alloc'd
==15056== at 0x48500D5: calloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==15056== by 0x23D25A: xcalloc (xmalloc.c:93)
==15056== by 0x21F5BD: fetch_account (child-fetch.c:437)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056==
==15056== Invalid write of size 8
==15056== at 0x238972: re_block (pcre.c:103)
==15056== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==15056== by 0x232C85: mail_match (mail-state.c:149)
==15056== by 0x21FAD0: fetch_match (child-fetch.c:177)
==15056== by 0x21F413: fetch_account (child-fetch.c:386)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056== Address 0x671a430 is 16 bytes after a block of size 672 alloc'd
==15056== at 0x48500D5: calloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==15056== by 0x23D25A: xcalloc (xmalloc.c:93)
==15056== by 0x21F5BD: fetch_account (child-fetch.c:437)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056==
eda: tried regexp "^Mailing-list: list ([a-z0-9_-]+)@googlegroups" in headers, result now 0
The pid matches:
==15056== Invalid read of size 4
==15056== at 0x22E28C: imap_commit (imap-common.c:273)
==15056== by 0x21FC11: fetch_dequeue (child-fetch.c:725)
==15056== by 0x21FB40: fetch_match (child-fetch.c:191)
==15056== by 0x21F413: fetch_account (child-fetch.c:386)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056== Address 0x1f2 is not stack'd, malloc'd or (recently) free'd
==15056==
==15056==
==15056== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==15056== Access not within mapped region at address 0x1F2
Earlier on the same pattern triggers invalid reads:
eda: trying (match) message 61
eda: matching from 0 to 6384 (size=8128, body=6384)
==15056== Invalid read of size 8
==15056== at 0x238953: re_block (pcre.c:99)
==15056== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==15056== by 0x232C85: mail_match (mail-state.c:149)
==15056== by 0x21FAD0: fetch_match (child-fetch.c:177)
==15056== by 0x21F413: fetch_account (child-fetch.c:386)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056== Address 0x5a2a938 is 8 bytes after a block of size 112 alloc'd
==15056== at 0x484CBC4: malloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==15056== by 0x4BF1DB3: ??? (in /usr/local/lib/libpcre2-8.so.0.11.2)
==15056== by 0x4C5BBB0: pcre2_match_data_create_from_pattern_8 (in /usr/local/lib/libpcre2-8.so.0.11.2)
==15056== by 0x2388D5: re_block (pcre.c:87)
==15056== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==15056== by 0x232C85: mail_match (mail-state.c:149)
==15056== by 0x21FAD0: fetch_match (child-fetch.c:177)
==15056== by 0x21F413: fetch_account (child-fetch.c:386)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056==
==15056== Invalid read of size 8
==15056== at 0x238959: re_block (pcre.c:99)
==15056== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==15056== by 0x232C85: mail_match (mail-state.c:149)
==15056== by 0x21FAD0: fetch_match (child-fetch.c:177)
==15056== by 0x21F413: fetch_account (child-fetch.c:386)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056== Address 0x5a2a930 is 0 bytes after a block of size 112 alloc'd
==15056== at 0x484CBC4: malloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==15056== by 0x4BF1DB3: ??? (in /usr/local/lib/libpcre2-8.so.0.11.2)
==15056== by 0x4C5BBB0: pcre2_match_data_create_from_pattern_8 (in /usr/local/lib/libpcre2-8.so.0.11.2)
==15056== by 0x2388D5: re_block (pcre.c:87)
==15056== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==15056== by 0x232C85: mail_match (mail-state.c:149)
==15056== by 0x21FAD0: fetch_match (child-fetch.c:177)
==15056== by 0x21F413: fetch_account (child-fetch.c:386)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056==
==15056== Invalid read of size 8
==15056== at 0x238966: re_block (pcre.c:102)
==15056== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==15056== by 0x232C85: mail_match (mail-state.c:149)
==15056== by 0x21FAD0: fetch_match (child-fetch.c:177)
==15056== by 0x21F413: fetch_account (child-fetch.c:386)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056== Address 0x5a2a930 is 0 bytes after a block of size 112 alloc'd
==15056== at 0x484CBC4: malloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==15056== by 0x4BF1DB3: ??? (in /usr/local/lib/libpcre2-8.so.0.11.2)
==15056== by 0x4C5BBB0: pcre2_match_data_create_from_pattern_8 (in /usr/local/lib/libpcre2-8.so.0.11.2)
==15056== by 0x2388D5: re_block (pcre.c:87)
==15056== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==15056== by 0x232C85: mail_match (mail-state.c:149)
==15056== by 0x21FAD0: fetch_match (child-fetch.c:177)
==15056== by 0x21F413: fetch_account (child-fetch.c:386)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056==
==15056== Invalid read of size 8
==15056== at 0x23896E: re_block (pcre.c:103)
==15056== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==15056== by 0x232C85: mail_match (mail-state.c:149)
==15056== by 0x21FAD0: fetch_match (child-fetch.c:177)
==15056== by 0x21F413: fetch_account (child-fetch.c:386)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056== Address 0x5a2a938 is 8 bytes after a block of size 112 alloc'd
==15056== at 0x484CBC4: malloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==15056== by 0x4BF1DB3: ??? (in /usr/local/lib/libpcre2-8.so.0.11.2)
==15056== by 0x4C5BBB0: pcre2_match_data_create_from_pattern_8 (in /usr/local/lib/libpcre2-8.so.0.11.2)
==15056== by 0x2388D5: re_block (pcre.c:87)
==15056== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==15056== by 0x232C85: mail_match (mail-state.c:149)
==15056== by 0x21FAD0: fetch_match (child-fetch.c:177)
==15056== by 0x21F413: fetch_account (child-fetch.c:386)
==15056== by 0x21F073: child_fetch (child-fetch.c:114)
==15056== by 0x220B74: child_start (child.c:133)
==15056== by 0x22834E: main (fdm.c:773)
==15056==
eda: tried regexp "^X-Cron-Env: ." in headers, result now 0
Does this fix it?
--- a/pcre.c
+++ b/pcre.c
@@ -93,6 +93,8 @@ re_block(struct re *re, const void *buf, size_t len, struct rmlist *rml,
}
if (rml != NULL) {
+ if (res > NPMATCH)
+ res = NPMATCH;
ovector = pcre2_get_ovector_pointer(pmd);
for (i = 0; i < res; i++) {
j = i * 2;
Seems to have fixed the crash. I'll keep an eye on it.
I did note this, though it also appears in the unpatched run:
eda: matching from 0 to 3385 (size=3548, body=3385)
==3299== Conditional jump or move depends on uninitialised value(s)
==3299== at 0x23896D: re_block (pcre.c:101)
==3299== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==3299== by 0x232C85: mail_match (mail-state.c:149)
==3299== by 0x21FAD0: fetch_match (child-fetch.c:177)
==3299== by 0x21F413: fetch_account (child-fetch.c:386)
==3299== by 0x21F073: child_fetch (child-fetch.c:114)
==3299== by 0x220B74: child_start (child.c:133)
==3299== by 0x22834E: main (fdm.c:773)
==3299==
eda: tried regexp "^X-Cron-Env: ." in headers, result now 0
Line 101 on the patched file is this conditional from inside the loop over res:
if (ovector[j + 1] <= ovector[j])
Same crash dereferencing the same pointer (remarkably consistent), and similar valgrind errors on a different regexp:
eda: matching from 0 to 2286 (size=77546, body=2286)
==12129== Invalid write of size 4
==12129== at 0x23896F: re_block (pcre.c:103)
==12129== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==12129== by 0x232C85: mail_match (mail-state.c:149)
==12129== by 0x21FAD0: fetch_match (child-fetch.c:177)
==12129== by 0x21F413: fetch_account (child-fetch.c:386)
==12129== by 0x21F073: child_fetch (child-fetch.c:114)
==12129== by 0x220B74: child_start (child.c:133)
==12129== by 0x22834E: main (fdm.c:773)
==12129== Address 0x6084580 is 0 bytes after a block of size 672 alloc'd
==12129== at 0x48500D5: calloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==12129== by 0x23D26A: xcalloc (xmalloc.c:93)
==12129== by 0x21F5BD: fetch_account (child-fetch.c:437)
==12129== by 0x21F073: child_fetch (child-fetch.c:114)
==12129== by 0x220B74: child_start (child.c:133)
==12129== by 0x22834E: main (fdm.c:773)
==12129==
==12129== Invalid write of size 8
==12129== at 0x23897A: re_block (pcre.c:104)
==12129== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==12129== by 0x232C85: mail_match (mail-state.c:149)
==12129== by 0x21FAD0: fetch_match (child-fetch.c:177)
==12129== by 0x21F413: fetch_account (child-fetch.c:386)
==12129== by 0x21F073: child_fetch (child-fetch.c:114)
==12129== by 0x220B74: child_start (child.c:133)
==12129== by 0x22834E: main (fdm.c:773)
==12129== Address 0x6084588 is 8 bytes after a block of size 672 alloc'd
==12129== at 0x48500D5: calloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==12129== by 0x23D26A: xcalloc (xmalloc.c:93)
==12129== by 0x21F5BD: fetch_account (child-fetch.c:437)
==12129== by 0x21F073: child_fetch (child-fetch.c:114)
==12129== by 0x220B74: child_start (child.c:133)
==12129== by 0x22834E: main (fdm.c:773)
==12129==
==12129== Invalid write of size 8
==12129== at 0x238982: re_block (pcre.c:105)
==12129== by 0x235E9A: match_regexp_match (match-regexp.c:62)
==12129== by 0x232C85: mail_match (mail-state.c:149)
==12129== by 0x21FAD0: fetch_match (child-fetch.c:177)
==12129== by 0x21F413: fetch_account (child-fetch.c:386)
==12129== by 0x21F073: child_fetch (child-fetch.c:114)
==12129== by 0x220B74: child_start (child.c:133)
==12129== by 0x22834E: main (fdm.c:773)
==12129== Address 0x6084590 is 16 bytes after a block of size 672 alloc'd
==12129== at 0x48500D5: calloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==12129== by 0x23D26A: xcalloc (xmalloc.c:93)
==12129== by 0x21F5BD: fetch_account (child-fetch.c:437)
==12129== by 0x21F073: child_fetch (child-fetch.c:114)
==12129== by 0x220B74: child_start (child.c:133)
==12129== by 0x22834E: main (fdm.c:773)
==12129==
eda: tried regexp "^List-Id:.*(css-d|org-[a-z]+|notmuch|whatwg|www-(?:html|style)|ruby-talk)" in headers, result now 0
Are you sure you have the fix applied for that test?
I had the same thought and double-checked. I added the patch to my port tree, verified it applied correctly, and force-rebuilt and reinstalled the package. I made sure there's no other fdm
in my PATH.
Valgrind's reported line numbers also match the patched file:
103 rml->list[i].valid = 1;
104 rml->list[i].so = ovector[j];
105 rml->list[i].eo = ovector[j + 1];
The loop conditional has me a little suspicious:
99 for (i = 0; i < res; i++) {
And indeed it triggers a warning when compiled with debug mode:
pcre.c:99:17: warning: comparison of integers of different signs: 'u_int' (aka 'unsigned int') and 'int' [-Wsign-compare]
for (i = 0; i < res; i++) {
Doesn't C implicitly convert the signed side to unsigned in this case? And -1
is explicitly allowed in the prior conditional, which will cast to MAX_INT:
89 if (res < 0 && res != PCRE2_ERROR_NOMATCH) {
I'm a bit confused by the code flow here - why is PCRE2_ERROR_NOMATCH
special-cased to run the res
loop and skip pcre2_match_data_free
? Shouldn't it free and return 0 instead of falling through?
res cannot be < 0 because if would have returned on line 92.
I expect we should not run the loop if res is PCRE2_ERROR_NOMATCH. What is PCRE2_ERROR_NOMATCH defined to by PCRE?
OK PCRE2_ERROR_NOMATCH is -1. I would probably just check for that explicitly after the res < 0 check and ditch the ugly return !=.
The res = 0 case also looks like an error condition - the list of offsets was too small, which pcre2_match_data_create_from_pattern
should avoid?
Also there's a missing null pointer check on that.
How about something like:
--- pcre.c.orig 2023-01-09 11:20:17 UTC
+++ pcre.c
@@ -66,7 +66,7 @@ int
re_block(struct re *re, const void *buf, size_t len, struct rmlist *rml,
char **cause)
{
- int res;
+ int res, ret;
pcre2_match_data *pmd;
PCRE2_SIZE *ovector;
u_int i, j;
@@ -85,27 +85,36 @@ re_block(struct re *re, const void *buf, size_t len, s
}
pmd = pcre2_match_data_create_from_pattern(re->pcre2, NULL);
+ if (pmd == NULL)
+ fatalx("pcre2_match_data_create_from_pattern failed");
+
res = pcre2_match(re->pcre2, buf, len, 0, 0, pmd, NULL);
- if (res < 0 && res != PCRE2_ERROR_NOMATCH) {
- xasprintf(cause, "%s: regexec failed", re->str);
- pcre2_match_data_free(pmd);
- return (-1);
- }
- if (rml != NULL) {
- ovector = pcre2_get_ovector_pointer(pmd);
- for (i = 0; i < res; i++) {
- j = i * 2;
- if (ovector[j + 1] <= ovector[j])
- break;
- rml->list[i].valid = 1;
- rml->list[i].so = ovector[j];
- rml->list[i].eo = ovector[j + 1];
+ if (res > 0) {
+ if (res > NPMATCH)
+ res = NPMATCH;
+ if (rml != NULL) {
+ ovector = pcre2_get_ovector_pointer(pmd);
+ for (i = 0; i < res; i++) {
+ j = i * 2;
+ if (ovector[j + 1] <= ovector[j])
+ break;
+ rml->list[i].valid = 1;
+ rml->list[i].so = ovector[j];
+ rml->list[i].eo = ovector[j + 1];
+ }
+ rml->valid = 1;
}
- rml->valid = 1;
+ ret = 1;
+ } else if (res == PCRE2_ERROR_NOMATCH) {
+ ret = 0;
+ } else {
+ xasprintf(cause, "%s: regexec failed", re->str);
+ ret = -1;
}
- return (res != PCRE2_ERROR_NOMATCH);
+ pcre2_match_data_free(pmd);
+ return (ret);
}
void
I remain baffled at if (ovector[j + 1] <= ovector[j])
. Stop if the next offset is smaller than the current one? Surely the loop conditional is enough - or would be if it was defended against the -1 case.
This looks ok although I'd prefer to just return early rather than adding ret, although there isn't much in it.
I'll be back home on Wed and I will look at this again and apply it then (assuming it fixes your problem).
On Sat, 28 Jan 2023, 13:35 Thomas Hurst, @.***> wrote:
How about something like:
--- pcre.c.orig 2023-01-09 11:20:17 UTC +++ pcre.c @@ -66,7 +66,7 @@ int re_block(struct re re, const void buf, size_t len, struct rmlist *rml, char **cause) {
int res;
int res, ret; pcre2_match_data pmd; PCRE2_SIZE ovector; u_int i, j; @@ -85,27 +85,36 @@ re_block(struct re re, const void buf, size_t len, s }
pmd = pcre2_match_data_create_from_pattern(re->pcre2, NULL);
if (pmd == NULL)
fatalx("pcre2_match_data_create_from_pattern failed");
res = pcre2_match(re->pcre2, buf, len, 0, 0, pmd, NULL);
if (res < 0 && res != PCRE2_ERROR_NOMATCH) {
xasprintf(cause, "%s: regexec failed", re->str);
pcre2_match_data_free(pmd);
return (-1);
}
if (rml != NULL) {
ovector = pcre2_get_ovector_pointer(pmd);
for (i = 0; i < res; i++) {
j = i * 2;
if (ovector[j + 1] <= ovector[j])
break;
rml->list[i].valid = 1;
rml->list[i].so = ovector[j];
rml->list[i].eo = ovector[j + 1];
if (res > 0) {
if (res > NPMATCH)
res = NPMATCH;
if (rml != NULL) {
ovector = pcre2_get_ovector_pointer(pmd);
for (i = 0; i < res; i++) {
j = i * 2;
if (ovector[j + 1] <= ovector[j])
break;
rml->list[i].valid = 1;
rml->list[i].so = ovector[j];
rml->list[i].eo = ovector[j + 1];
}
rml->valid = 1; }
rml->valid = 1;
ret = 1;
} else if (res == PCRE2_ERROR_NOMATCH) {
ret = 0;
} else {
xasprintf(cause, "%s: regexec failed", re->str);
ret = -1; }
return (res != PCRE2_ERROR_NOMATCH);
pcre2_match_data_free(pmd);
return (ret); }
void
I remain baffled at if (ovector[j + 1] <= ovector[j]). Stop if the next offset is smaller than the current one? Surely the loop conditional is enough - or would be if it was defended against the -1 case.
— Reply to this email directly, view it on GitHub https://github.com/nicm/fdm/issues/122#issuecomment-1407400541, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACKI42WZN6B5GUEVJUKTCLWUUOCNANCNFSM6AAAAAAUAYZ4CQ . You are receiving this because you commented.Message ID: @.***>
I went with ret
over early return to avoid having multiple calls to pcre2_match_data_free
in different branches. Maybe that's better than having both res
and ret
in the same scope.
One other thing I noticed from the pcre2_match
man page: The length and startoffset values are code units, not characters
. So len
should be a count of UTF-8 code points and not the current size in bytes? Or does that depend on how pcre2 was compiled?
At least that isn't a memory safety issue if you undercount it...
What. No, a byte size would be >= the number of codepoints in a multibyte string.
PCRE2 considers a "code unit" to be one of 8, 16, or 32 bytes, and doesn't conflate that with code points.
Patched version fetched my backlog without any of the aforementioned Valgrind warnings and without crashing.
Lots of warnings about leaking 64 bytes, but previous entries like ==12129== definitely lost: 103,764 bytes in 870 blocks
are no longer evident.
Regarding if (ovector[j + 1] <= ovector[j])
, I came across this in pcre2demo which does help explain things a bit:
/* Since release 10.38 PCRE2 has locked out the use of \K in lookaround
assertions. However, there is an option to re-enable the old behaviour. If that
is set, it is possible to run patterns such as /(?=.\K)/ that use \K in an
assertion to set the start of a match later than its end. In this demonstration
program, we show how to detect this case, but it shouldn't arise because the
option is never set. */
if (ovector[0] > ovector[1])
If there's an invariant that list[i].so < list[i].eo it does actually need defending from that - though perhaps it would be better to do so by swapping the values?
I note that because the condition is <=
it will also break the loop early if one of the groups matched the empty string - that seems like a bug to me.
Please give me your change as a .diff file because it is impossible to copy from GH issues. You will need to name it .txt before it will let you attach it.
I was going to make a PR, but here you go: patch-pcre.c.txt
I have applied this with trivial style changes. I also changed to < to permit so == eo. Please let me know if you still have problems. Thanks!
Thank you :)
I've deployed the updated patch and will let you know if anything else crops up. I don't expect it to - I've ran the prior patch for several days without issue.
Great, I will close this but you can still comment if needed.
Backtrace from fdm 2.2:
Just built a copy with -O0 to try to maximise debug info and it helpfully just fetched everything and stopped me being able to reproduce this, but I still have the core if you have suggestions about how else to poke it.