cyrusimap / cyrus-imapd

Cyrus IMAP is an email, contacts and calendar server
http://cyrusimap.org
Other
534 stars 146 forks source link

httpd killed by SIGSEGV for calendar request #4216

Closed sdalu closed 1 year ago

sdalu commented 2 years ago

imapd version: 3.4.4 OS: FreeBSD 13.1

config fragment:

httpmodules: caldav carddav freebusy ischedule jmap
httpprettytelemetry: yes
caldav_realm: foo.com
carddav_realm: foo.com
calendar_user_address_set: foo.com
caldav_allowcalendaradmin: 1

When trying to access a calendar using:

curl -u xxx:yyy http://<server>/dav/calendars/user/<user>/793edf22-716-1497341807796-172307

Backtrace

root@incal:/usr/ports/devel/libical # lldb -c /var/imap/httpd.core /usr/local/cyrus/libexec/httpd
(lldb) target create "/usr/local/cyrus/libexec/httpd" --core "/var/imap/httpd.core"
Core file '/var/imap/httpd.core' (x86_64) was loaded.
(lldb) bt
This version of LLDB has no plugin for the language "assembler". Inspection of frame variables will be limited.
* thread #1, name = 'httpd', stop reason = signal SIGSEGV
  * frame #0: 0x00000008019fa8f0 libc.so.7`strcmp at strcmp.S:60
    frame #1: 0x0000000801671445 libical.so.3`icalcomponent_isa_component(component=0x0000000800000000) at icalcomponent.c:333:9
    frame #2: 0x0000000801670957 libical.so.3`icalcomponent_add_children(impl=0x000000080244daf0, args=0x00007fffffff9c80) at icalcomponent.c:76:9
    frame #3: 0x0000000801670898 libical.so.3`icalcomponent_vanew(kind=ICAL_VCALENDAR_COMPONENT) at icalcomponent.c:133:5
    frame #4: 0x0000000800b4740f libcyrus_imap.so.0`icalcomponent_new_stream(mailbox=0x0000000802571408, prodid="-//CyrusIMAP.org/Cyrus 3.4.4//EN", name="Work", desc=0x0000000000000000, color="") at ical_support.c:532:12
    frame #5: 0x000000000023ecd9 httpd`begin_icalendar(buf=0x00007fffffffd5f0, mailbox=0x0000000802571408, prodid="-//CyrusIMAP.org/Cyrus 3.4.4//EN", name="Work", desc=0x0000000000000000, color="") at http_caldav.c:1801:12
    frame #6: 0x000000000023ffc3 httpd`export_calendar(txn=0x00007fffffffa4a8) at http_caldav.c:2004:11
    frame #7: 0x000000000023d6ab httpd`caldav_get(txn=0x00007fffffffa4a8, mailbox=0x0000000000000000, record=0x0000000000000000, data=0x0000000000000000, obj=0x0000000000000000) at http_caldav.c:2807:16
    frame #8: 0x000000000025b343 httpd`meth_get_head(txn=0x00007fffffffa4a8, params=0x00000000002c5f80) at http_dav.c:5220:34
    frame #9: 0x00000000002368d6 httpd`meth_get_head_cal(txn=0x00007fffffffa4a8, params=0x00000000002c5f80) at http_caldav.c:2869:12
    frame #10: 0x0000000000293a61 httpd`process_request(txn=0x00007fffffffa4a8) at httpd.c:1827:15
    frame #11: 0x000000000029b59a httpd`http1_input(txn=0x00007fffffffa4a8) at httpd.c:1893:11
    frame #12: 0x0000000000291679 httpd`cmdloop(conn=0x00007fffffffdae0) at httpd.c:2046:13
    frame #13: 0x00000000002910fa httpd`service_main(argc=3, argv=0x000000080241e000, envp=0x00007fffffffe9f0) at httpd.c:963:5
    frame #14: 0x00000000002bb494 httpd`main(argc=3, argv=0x00007fffffffe9d0, envp=0x00007fffffffe9f0) at service.c:647:9
    frame #15: 0x0000000000231db5 httpd`_start(ap=<unavailable>, cleanup=<unavailable>) at crt1_c.c:75:7
dilyanpalauzov commented 2 years ago

This is similar to https://github.com/cyrusimap/cyrus-imapd/issues/3511#issuecomment-862341824. Which authentication do you use: basic, digest-md5, scram, cram-md5, ntlm, gss/spnego?

sdalu commented 2 years ago

It looks similar. I'm using basic. I've tried removing properties in ical_support.c/icalcomponent_new_stream and if the one related to lastmodified is removed, it works fine

    ical = icalcomponent_vanew(ICAL_VCALENDAR_COMPONENT,
                               icalproperty_new_version("2.0"),
                               icalproperty_new_prodid(prodid),
                               icalproperty_new_uid(buf_cstring(&buf)),
//                               icalproperty_new_lastmodified(
//                                   icaltime_from_timet_with_zone(mailbox->index_mtime,
//                                                                 0, NULL)),
                               icalproperty_new_name(name),
                               0);
dilyanpalauzov commented 2 years ago

I suspect mailbox->index_mtime has a strange value. Could you add above that line

syslog(LOG_CRIT, "icalcomponent_new_stream %lu", (unsigned long) mailbox->index_mtime);

and run the code again. The question is what is the last logged value before the crash.

sdalu commented 2 years ago

Tried on different calendar. One of them is an empty newly created.

http[23106]: icalcomponent_new_stream 1661280150
http[23106]: icalcomponent_new_stream 1661197235
http[23106]: icalcomponent_new_stream 1661163281
dilyanpalauzov commented 2 years ago

It is unclear from the above message, whether the crash is on a single or all calendars.

Can you run Cyrus IMAP under valgrind:

valgrind --leak-check=full --num-callers=40 --track-origins=yes --read-var-info=yes --show-leak-kinds=all --track-fds=yes --error-limit=no --trace-children=yes /usr/local/libexec/master -D -M /etc/cyrus/cyrus.conf -C /etc/cyrus/imapd.conf

This will report error on the first place, where memory is misused (if memory handling is the problem).

sdalu commented 2 years ago

Every calendar crash. Even newly created one.

How should I send you the file ?

dilyanpalauzov commented 2 years ago

Can't the file be uploaded here? My email is dpa-github domain aegee.org.

sdalu commented 2 years ago

Oops, didn't realize in was possible.

I used --log-file=/tmp/log-%p.txt for the log file, I only attached the one related to httpd: log-71562.txt.gz

dilyanpalauzov commented 2 years ago

You have several:

==71562== Invalid read of size 8
==71562==    at 0x5E9557C: ??? (in /lib/libc.so.7)
==71562==    by 0x5E95778: setproctitle (in /lib/libc.so.7)
==71562==    by 0x506D9C2: proc_register (proc.c:164)
==71562==    by 0x291D47: examine_request (httpd.c:1793)
==71562==    by 0x29B54B: http1_input (httpd.c:1886)
==71562==    by 0x291668: cmdloop (httpd.c:2046)
==71562==    by 0x2910E9: service_main (httpd.c:963)
==71562==    by 0x2BB483: main (service.c:647)
==71562==  Address 0x7fffffffefe0 is not stack'd, malloc'd or (recently) free'd
==71562==

For me the presence of setproctitle and /lib/libc.so.7 is an indicatior that you are using a form of BSD.

Then follows:

==71562== Invalid write of size 8
==71562==    at 0x5E9559C: ??? (in /lib/libc.so.7)
==71562==    by 0x5E95778: setproctitle (in /lib/libc.so.7)
==71562==    by 0x506D9C2: proc_register (proc.c:164)
==71562==    by 0x291D47: examine_request (httpd.c:1793)
==71562==    by 0x29B54B: http1_input (httpd.c:1886)
==71562==    by 0x291668: cmdloop (httpd.c:2046)
==71562==    by 0x2910E9: service_main (httpd.c:963)
==71562==    by 0x2BB483: main (service.c:647)
==71562==  Address 0x7fffffffefe0 is not stack'd, malloc'd or (recently) free'd
==71562==
==71562== Invalid read of size 1
==71562==    at 0x48536D4: strcmp (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==71562==    by 0x5BBD5FE: icalcomponent_vanew (in /usr/local/lib/libical.so.3.0.8)
==71562==    by 0x509A40E: icalcomponent_new_stream (ical_support.c:532)
==71562==    by 0x23ECC8: begin_icalendar (http_caldav.c:1801)
==71562==    by 0x23FFB2: export_calendar (http_caldav.c:2004)
==71562==    by 0x23D69A: caldav_get (http_caldav.c:2807)
==71562==    by 0x25B332: meth_get_head (http_dav.c:5220)
==71562==    by 0x2368C5: meth_get_head_cal (http_caldav.c:2869)
==71562==    by 0x293A50: process_request (httpd.c:1827)
==71562==    by 0x29B589: http1_input (httpd.c:1893)
==71562==    by 0x291668: cmdloop (httpd.c:2046)
==71562==    by 0x2BB483: main (service.c:647)
==71562==  Address 0x800000000 is not stack'd, malloc'd or (recently) free'd

I would say here the many invalid read and invalid writes mess the memory (destroy the libc-internal structure of what is allocated and what not) and the following invalid read in icalcomponent_vanew is a consequence of the previous invalid reads and writes.

I do not use BSD, so I cannot help here. But as the invalid reads and writes all are coming from proc.c:proc_register() → setproctitle(), does the problem disappear if the line

164: setproctitle("%s: %s %s %s %s", servicename, clienthost, userid, mailbox, cmd); is removed?

is removed? Can you limit this further, by removing one %s and then one of the next parameters? (or more %s and more of the next parameters) to see which of servicename, clienthost, userid, mailbox, cmd is misused (improperly initialized)?

sdalu commented 2 years ago

I'm on FreeBSD 13.1

I commented the mentioned line from proc.c, but it didn't help. I've attached the new log: log-13349.txt.gz

dilyanpalauzov commented 2 years ago

I assume that BSD’s setproctitle() is a problem on its own, but apparently that problem does not interfere here. Now the first error is:

==13349== Memcheck, a memory error detector
==13349== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==13349== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==13349== Command: /usr/local/cyrus/libexec/httpd -p 9
==13349== Parent PID: 13341
==13349== 
==13349== Invalid read of size 1
==13349==    at 0x48536D4: strcmp (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==13349==    by 0x5BBD5FE: icalcomponent_vanew (in /usr/local/lib/libical.so.3.0.8)
==13349==    by 0x509A39E: icalcomponent_new_stream (ical_support.c:532)
==13349==    by 0x23ECD8: begin_icalendar (http_caldav.c:1801)
==13349==    by 0x23FFC2: export_calendar (http_caldav.c:2004)
==13349==    by 0x23D6AA: caldav_get (http_caldav.c:2807)
==13349==    by 0x25B342: meth_get_head (http_dav.c:5220)
==13349==    by 0x2368D5: meth_get_head_cal (http_caldav.c:2869)
==13349==    by 0x293A60: process_request (httpd.c:1827)
==13349==    by 0x29B599: http1_input (httpd.c:1893)
==13349==    by 0x291678: cmdloop (httpd.c:2046)
==13349==    by 0x2910F9: service_main (httpd.c:963)
==13349==    by 0x2BB493: main (service.c:647)
==13349==  Address 0x800000000 is not stack'd, malloc'd or (recently) free'd
==13349== 
==13349== 
==13349== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==13349==  Access not within mapped region at address 0x800000000

What happens with this file i.c:

#include <libical/ical.h>

int main() {
    icalcomponent_vanew(ICAL_VCALENDAR_COMPONENT,
      icalproperty_new_version("2.0"),
      icalproperty_new_prodid("PROD-ABC"),
      icalproperty_new_uid("1234abcd"),
      icalproperty_new_lastmodified(
        icaltime_from_timet_with_zone(1661280150, 0, NULL)),
      icalproperty_new_name("name1"),
      0);
    return 0;
}

if it is compiled with

cc $(pkg-config --cflags libical --libs libical) -o libical-test i.c

and libical-test is run under valgrind?

valgrind --leak-check=full --num-callers=40 --track-origins=yes --read-var-info=yes --show-leak-kinds=all --track-fds=yes --error-limit=no --trace-children=yes ./libical-test

For comparison on my system pkg-config --cflags libical --libs libical prints -I/usr/local/include -L/usr/local/lib64 -lical -licalss -licalvcal.

Which is your version on libical?

dilyanpalauzov commented 2 years ago

Concerning setproctitle() in imap/proc.c:proc_register(), with the change below

diff --git a/imap/proc.c b/imap/proc.c
index e3f597171..259285cae 100644
--- a/imap/proc.c
+++ b/imap/proc.c
@@ -161,6 +161,11 @@ EXPORTED int proc_register(const char *servicename, const char *clienthost,
         fatal("can't write proc file", EX_IOERR);
     }

+    syslog(LOG_CRIT, "proc_register %s", servicename);
+    syslog(LOG_CRIT, "proc_register %s", clienthost);
+    syslog(LOG_CRIT, "proc_register %s", userid);
+    syslog(LOG_CRIT, "proc_register %s", mailbox);
+    syslog(LOG_CRIT, "proc_register %s", cmd);
     setproctitle("%s: %s %s %s %s", servicename, clienthost, userid, mailbox, cmd);

     free(newfname);

when the code is run under Valgrind, as you did previously, valgrind will print the line number where not correctly initialized data is read. From the line number the problematic variable can be obtained.

As an example, your first valgrind-output started with

==71562== Invalid read of size 8
==71562==    at 0x5E9557C: ??? (in /lib/libc.so.7)
==71562==    by 0x5E95778: setproctitle (in /lib/libc.so.7)
==71562==    by 0x506D9C2: proc_register (proc.c:164)
==71562==    by 0x290F10: service_main (httpd.c:921)
==71562==    by 0x2BB483: main (service.c:647)
==71562==  Address 0x7fffffffefe0 is not stack'd, malloc'd or (recently) free'd

and at that point line proc.c:164 was problematic.

sdalu commented 2 years ago

libical version: 3.0.8 (I previously try to link httpd with libical 3.0.14, result was the same)

Yes, it is segfaulting: log.txt

Result of pkg-config --cflags libical --libs libical:

-I/usr/local/include -L/usr/local/lib -lical -licalss -licalvcal 
dilyanpalauzov commented 2 years ago

1661280150 is a plausible unix-timestamp:

$ date -d @1661280150
Tue Aug 23 09:42:30 PM EEST 2022

The crash is not because of cyrus-imap. It is caused by libical. Please report the crash using one of the options suggested at the end of https://github.com/libical/libical/ . A reference to https://github.com/cyrusimap/cyrus-imapd/issues/4216#issuecomment-1227284115 with the follow-up should be sufficient.

The most recent code on the 3.0 branch of libical has a fix for:

  • Fix CMake option USE_32BIT_TIME_T actually uses a 32-bit time_t value

not included in 3.0.14. That change might help (this is just a guess).

sdalu commented 2 years ago

I'll move the bug report to libical.

Here the result of syslog for setproctitle:

Aug 25 16:24:25 incal http[55187]: proc_register http/caldav
Aug 25 16:24:25 incal http[55187]: proc_register [unix socket]
Aug 25 16:24:25 incal http[55187]: proc_register sdalu@sdalu.com
Aug 25 16:24:25 incal http[55187]: proc_register 
Aug 25 16:24:25 incal http[55187]: proc_register GET
dilyanpalauzov commented 2 years ago

Here the result of syslog for setproctitle:

My question was rather, when that code is run under valgrind, on which line does valgrind report an error.

sdalu commented 2 years ago

If you expected an error on one of the syslog, doesn't seem to be the case.

==98188== Invalid read of size 8
==98188==    at 0x5E9557C: ??? (in /lib/libc.so.7)
==98188==    by 0x5E95778: setproctitle (in /lib/libc.so.7)
==98188==    by 0x506DA45: proc_register (proc.c:169)
==98188==    by 0x290F20: service_main (httpd.c:921)
==98188==    by 0x2BB493: main (service.c:647)
==98188==  Address 0x7fffffffefe0 is not stack'd, malloc'd or (recently) free'd

Here is the full log : log-98188.txt.gz

dilyanpalauzov commented 2 years ago

If you expected an error on one of the syslog, doesn't seem to be the case.

My reading is that for a file spt.c:

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main() {
  char *a = "http/caldav", *b = "[unix socket]", *c = "sdalu@sdalu.com", *d = "", *e = "GET";
  printf("%s %s %s %s %s\n", a, b, c, d, e);
  setproctitle("%s %s %s %s %s", a, b, c, d, e);
  return 0;
}

when run under valgrind:

cc spt.c -o spt valgrind ./spt

the latter reports an invalid read error on the setproctitle()-line (:8). This shall be brought to your libc vendor.

dilyanpalauzov commented 2 years ago

Rectification: I removed the terminating \n from the setproctitle line.

sdalu commented 2 years ago

I created a bug report on the freebsd bug tracker system for setproctitle https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266036

sdalu commented 2 years ago

the latter reports an invalid read error on the setproctitle()-line (:8). This shall be brought to your libc vendor.

Following bug report answer, that's the correct behaviour of setproctitle as it is modifying kernel initialized string defined at sysctl -n kern.ps_strings

paulfloyd commented 2 years ago

Please see my comment in the Valgrind bugzilla https://bugs.kde.org/show_bug.cgi?id=458378 [I'm the maintainer of Valgrind on FreeBSD]

This should work with the most recent Valgrind ('valgrind-devel' rather than just 'valgrind' in the FreeBSD ports system).

dilyanpalauzov commented 1 year ago

https://github.com/cyrusimap/cyrus-imapd/pull/4278, https://github.com/cyrusimap/cyrus-imapd/pull/4279 and https://github.com/cyrusimap/cyrus-imapd/pull/4280 are supposed to correct this on the master, 3.6 and 3.4 branches. As suggested at https://github.com/libical/libical/issues/585 the last parameter of icalcomponent_vanew() was 0, but must be NULL.