cyrusimap / cyrus-imapd

Cyrus IMAP is an email, contacts and calendar server
http://cyrusimap.org

cyrus-murder problems with database corruption in the frontend/master #739

Closed brong closed 13 years ago

brong commented 19 years ago

From: 32044 Bugzilla-Id: 2640 Version: 2.2.x Owner: Ken Murchison

brong commented 19 years ago

From: 32044

We currently have 1,200,000+ mailboxes spread across 3 backend servers, with 1 frontend/master server (both services running on the same machine), for a grand total of 3,400,000+ mailboxes and subfolders.

The servers are used for IMAP access only: no POP3, no Sieve scripts, no NNTP, nothing else.

Now, the problem in question.

After some days of running smoothly, I start getting errors on the master/frontend server. This has happened a few times now.

So far I have seen it only on the master/frontend server, never on the backend servers.

The errors:

    cyrus/mupdate[1433]: DBERROR: skiplist recovery /var/lib/imap/mailboxes.db: 141BAD94 should be ADD or DELETE
    cyrus/mupdate[1433]: DBERROR: error updating database user.1215407: cyrusdb error

Then, after some time, I start getting these:

    cyrus/lmtp[1276]: authentication to remote mupdate server failed: EOF from server
    cyrus/lmtp[1276]: couldn't connect to 10.1.5.101: no authentication to server

After a little more time:

    cyrus/imap[30969]: kick_mupdate: can't connect to target: Connection refused
    cyrus/lmtp[2240]: mupdate-client: connection to server closed: end of file reached
    cyrus/lmtp[2240]: couldn't connect to 10.1.5.101: no connection to server

This is what I get when I try to restart Cyrus on the frontend/master:

    cyrus/ctl_cyrusdb[30607]: recovering cyrus databases
    cyrus/ctl_cyrusdb[30607]: DBERROR: skiplist recovery /var/lib/imap/mailboxes.db: 141BAD94 should be ADD or DELETE
    cyrus/ctl_cyrusdb[30607]: DBERROR: opening /var/lib/imap/mailboxes.db: cyrusdb error

The only way out I have found is to delete the databases on the master/frontend completely and reimport all the mailboxes.

Thank you very much,

João Assad


Below is my frontend/master configuration

cyrus.conf

    START {
      # do not delete this entry!
      recover cmd="ctl_cyrusdb -r"
    }

    # UNIX sockets start with a slash and are put into /var/lib/imap/sockets
    SERVICES {
      # add or remove based on preferences
      mupdate cmd="mupdate -m" listen=3905 prefork=1
      lmtp cmd="lmtpproxyd" listen="lmtp" prefork=0
      imap cmd="proxyd" listen="imap" prefork=0
    }

    EVENTS {
      # this is required
      checkpoint cmd="ctl_cyrusdb -c" period=240
    }

imapd.conf

    configdirectory: /var/lib/imap
    partition-default: /tmp
    admins: cyrus
    sievedir: /var/lib/imap/sieve
    sendmail: /usr/sbin/sendmail
    hashimapspool: true
    sasl_pwcheck_method: saslauthd
    sasl_mech_list: PLAIN
    tls_cert_file: /usr/share/ssl/certs/crt.crt
    tls_key_file: /usr/share/ssl/certs/key.key
    tls_ca_file: /usr/share/ssl/certs/ca.ca

    allowusermoves: 1

    # Backend servers
    cyrus-be1_password:
    cyrus-be2_password:
    cyrus-be3_password: ***
    proxy_authname: cyrus

    # Mupdate server
    mupdate_server:10.1.5.101
    mupdate_authname:cyrus
    mupdate_password: ***

    maxmessagesize: 2097152
    syslog_prefix:cyrus
    lmtp_over_quota_perm_failure:1
    quotawarn: 110
    imapidlepoll: 0
    fulldirhash: 1
    munge8bit: 0
    timeout: 10

    tls_session_timeout: 0
    mupdate_connections_max: 1024
    berkeley_cachesize:102400
    berkeley_txns_max:500

Fedora bugzilla bug id 152548 https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=152548

brong commented 19 years ago

From: 32044

New info

It seems that just before the corruption I always get the following error:

    cyrus/mupdate[12614]: IOERROR: mapping /var/lib/imap/mailboxes.db file: Cannot allocate memory
    cyrus/mupdate[12614]: failed to mmap /var/lib/imap/mailboxes.db file
    cyrus/master[12580]: service mupdate pid 12614 in READY state: terminated abnormally

This error is raised by the map_refresh function. There are two definitions of it, one in lib/map_stupidshared.c and another in lib/map_shared.c.

I don't know which one Cyrus uses when compiled under Fedora.

Regards

brong commented 19 years ago

From: 32044

2 gdb backtraces


    #18988 0x0804dcd3 in fatal (s=0x8d52f070 "Internal error: assertion failed: mupdate.c: 586: 0", code=75) at mupdate.c:586
    #18989 0x08082622 in assertionfailed (file=0x8082ab9 "mupdate.c", line=586, expr=0x8082ce1 "0") at assert.c:61
    #18990 0x0804dcd3 in fatal (s=0x8d52f4c0 "failed to mmap /var/lib/imap/mailboxes.db file", code=75) at mupdate.c:586
    #18991 0x080755f5 in map_refresh (fd=8, onceonly=0, base=0x8ad56e0, len=0x8ad56e4, newlen=366993408, name=0x8ad5710 "/var/lib/imap/mailboxes.db", mboxname=0x0) at map_shared.c:105
    #18992 0x08078178 in update_lock (db=0x8ad56d8, txn=0x8d52f6e0) at cyrusdb_skiplist.c:572
    #18993 0x0807a2f9 in mycommit (db=0x8ad56d8, tid=0x8d52f6e0) at cyrusdb_skiplist.c:1345
    #18994 0x08079d64 in mystore (db=0x8ad56d8, key=0x8b41ba8 "user.1440034", keylen=12, data=0x8c5a248 "1 cyrus-be2.gazzag.com!default 1440034\tlrswipcda\t", datalen=49, tid=0x0, overwrite=1) at cyrusdb_skiplist.c:1225
    #18995 0x08079de5 in store (db=0x8ad56d8, key=0x8b41ba8 "user.1440034", keylen=12, data=0x8c5a248 "1 cyrus-be2.gazzag.com!default 1440034\tlrswipcda\t", datalen=49, tid=0x0) at cyrusdb_skiplist.c:1244
    #18996 0x080571a9 in mboxlist_insertremote (name=0x8b41ba8 "user.1440034", mbtype=0, host=0x8b144f0 "cyrus-be2.gazzag.com!default", acl=0x8c4dab8 "1440034\tlrswipcda\t", tid=0x0) at mboxlist.c:801
    #18997 0x0804f869 in database_log (mb=0x8c4daa8, mytid=0x0) at mupdate.c:1300
    #18998 0x0804ffe3 in cmd_set (C=0x8bdd778, tag=0x8be8918 "X1", mailbox=0x8bb6ec8 "user.1440034", server=0x8bb54b8 "cyrus-be2.gazzag.com!default", acl=0x8c36e28 "1440034\tlrswipcda\t", t=SET_ACTIVE) at mupdate.c:1527
    #18999 0x0804e1f0 in docmd (c=0x8bdd778) at mupdate.c:685
    #19000 0x0804f669 in thread_main (rock=0x0) at mupdate.c:1228
    #19001 0x0069b98c in start_thread () from /lib/tls/libpthread.so.0
    #19002 0x005897da in clone () from /lib/tls/libc.so.6


    #18988 0x0804dcd3 in fatal (s=0x9fcd6070 "Internal error: assertion failed: mupdate.c: 586: 0", code=75) at mupdate.c:586
    #18989 0x08082622 in assertionfailed (file=0x8082ab9 "mupdate.c", line=586, expr=0x8082ce1 "0") at assert.c:61
    #18990 0x0804dcd3 in fatal (s=0x9fcd64c0 "failed to mmap /var/lib/imap/mailboxes.db file", code=75) at mupdate.c:586
    #18991 0x080755f5 in map_refresh (fd=8, onceonly=0, base=0x85856e0, len=0x85856e4, newlen=366583808, name=0x8585710 "/var/lib/imap/mailboxes.db", mboxname=0x0) at map_shared.c:105
    #18992 0x08078178 in update_lock (db=0x85856d8, txn=0x9fcd66e0) at cyrusdb_skiplist.c:572
    #18993 0x0807a2f9 in mycommit (db=0x85856d8, tid=0x9fcd66e0) at cyrusdb_skiplist.c:1345
    #18994 0x08079d64 in mystore (db=0x85856d8, key=0x88046a78 "user.950836._TRASH", keylen=18, data=0x88056cc0 "1 cyrus-be3.gazzag.com!default 950836\tlrswipcda\t", datalen=48, tid=0x0, overwrite=1) at cyrusdb_skiplist.c:1225
    #18995 0x08079de5 in store (db=0x85856d8, key=0x88046a78 "user.950836._TRASH", keylen=18, data=0x88056cc0 "1 cyrus-be3.gazzag.com!default 950836\tlrswipcda\t", datalen=48, tid=0x0) at cyrusdb_skiplist.c:1244
    #18996 0x080571a9 in mboxlist_insertremote (name=0x88046a78 "user.950836._TRASH", mbtype=0, host=0x88019a70 "cyrus-be3.gazzag.com!default", acl=0x8804b7c0 "950836\tlrswipcda\t", tid=0x0) at mboxlist.c:801
    #18997 0x0804f869 in database_log (mb=0x8804b7b0, mytid=0x0) at mupdate.c:1300
    #18998 0x0804ffe3 in cmd_set (C=0x9dd0b9b8, tag=0x9dd9b010 "X1", mailbox=0x9ddb1140 "user.950836._TRASH", server=0x9ddb5360 "cyrus-be3.gazzag.com!default", acl=0x88016f70 "950836\tlrswipcda\t", t=SET_ACTIVE) at mupdate.c:1527
    #18999 0x0804e1f0 in docmd (c=0x9dd0b9b8) at mupdate.c:685
    #19000 0x0804f669 in thread_main (rock=0x0) at mupdate.c:1228
    #19001 0x0069b98c in start_thread () from /lib/tls/libpthread.so.0
    #19002 0x005897da in clone () from /lib/tls/libc.so.6

brong commented 19 years ago

Attachment-Id: 348 From: 32044 Type: text/plain File: map_shared-mremap.patch

Patch to replace the munmap followed by mmap with a single mremap.
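
For readers without the attachment, the idea in the patch description can be sketched roughly as follows. This is not the attached patch: `remap_file` and `remap_demo` are hypothetical names, the mapping is assumed to be read-only and shared, and `mremap()` with `MREMAP_MAYMOVE` is Linux-specific.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc declares mremap() only with this defined */
#endif
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not the attached patch): resize an existing
 * file mapping in one call instead of munmap() followed by mmap().
 * Returns the new base address, or MAP_FAILED on error. */
static void *remap_file(void *base, size_t oldlen, size_t newlen)
{
    assert(base != NULL && base != MAP_FAILED);
    /* MREMAP_MAYMOVE lets the kernel relocate the mapping when it
     * cannot be resized at its current address. */
    return mremap(base, oldlen, newlen, MREMAP_MAYMOVE);
}

/* Demo: map one page of a two-page file, grow the mapping, and check
 * that a byte on page two is visible. Returns 0 on success. */
static int remap_demo(void)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    int rc = -1;

    /* Make the file two pages long and mark the start of page two. */
    if (ftruncate(fd, (off_t)(2 * page)) == 0 &&
        pwrite(fd, "Z", 1, (off_t)page) == 1) {
        void *base = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, 0);
        if (base != MAP_FAILED) {
            void *grown = remap_file(base, page, 2 * page);
            if (grown != MAP_FAILED) {
                rc = (((const char *)grown)[page] == 'Z') ? 0 : -1;
                munmap(grown, 2 * page);
            } else {
                /* mremap() failed: the original mapping is still there. */
                munmap(base, page);
            }
        }
    }
    fclose(f);
    return rc;
}
```

One difference from the munmap-then-mmap sequence worth noting: when `mremap()` fails, the old mapping is left in place, so the caller still has a valid view of the file.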

brong commented 18 years ago

From: Sergio Bruder

Any additional work on this particular bug? Does the attached patch really solve it? Any plan to include it in the next release?

brong commented 18 years ago

From: Ken Murchison

Both the problem and solution appear to be GNU and/or FC specific. At the very least, the use of mremap() has to be a compile-time decision, with all of the necessary configure foo.
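
A rough sketch of what such a compile-time switch could look like. `HAVE_MREMAP` is a hypothetical macro standing in for whatever a configure check would define, and `resize_mapping` and `resize_selfcheck` are made-up names, not Cyrus code.

```c
/* HAVE_MREMAP is a hypothetical macro; a configure check would define it. */
#if defined(HAVE_MREMAP) && !defined(_GNU_SOURCE)
#define _GNU_SOURCE /* glibc declares mremap() only with this defined */
#endif
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Resize a read-only shared mapping of fd from oldlen to newlen bytes.
 * Returns the new base address, or MAP_FAILED on error. */
static void *resize_mapping(void *base, size_t oldlen, size_t newlen, int fd)
{
#ifdef HAVE_MREMAP
    (void)fd; /* the existing mapping already refers to the file */
    return mremap(base, oldlen, newlen, MREMAP_MAYMOVE);
#else
    /* Portable path: drop the old mapping, then map the file again. */
    if (munmap(base, oldlen) != 0)
        return MAP_FAILED;
    return mmap(NULL, newlen, PROT_READ, MAP_SHARED, fd, 0);
#endif
}

/* Self-check: returns 0 if data past the old length is readable
 * through the resized mapping. */
static int resize_selfcheck(void)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    int rc = -1;

    if (ftruncate(fd, (off_t)(2 * page)) == 0 &&
        pwrite(fd, "x", 1, (off_t)page) == 1) {
        void *base = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, 0);
        if (base != MAP_FAILED) {
            void *grown = resize_mapping(base, page, 2 * page, fd);
            if (grown != MAP_FAILED) {
                rc = (((const char *)grown)[page] == 'x') ? 0 : -1;
                munmap(grown, 2 * page);
            }
        }
    }
    fclose(f);
    return rc;
}
```

Built without `HAVE_MREMAP`, this compiles to the munmap-then-mmap sequence; with it defined on a platform that provides `mremap()`, the single-call path is used and callers do not change.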

brong commented 18 years ago

From: Sergio Bruder

You mean Linux (the kernel) specific. I have seen this very same problem on CentOS (OK, not far from FC) and SuSE.

brong commented 14 years ago

From: Wes Craig

Please re-open if this continues to be an issue.