389ds / 389-ds-base

The enterprise-class Open Source LDAP server for Linux
https://www.port389.org/

agmt_count in Replica could become (PRUint64)-1 #919

Closed 389-ds-bot closed 4 years ago

389-ds-bot commented 4 years ago

Cloned from Pagure issue: https://pagure.io/389-ds-base/issue/47582


agmt_count in Replica could become (PRUint64)-1:

(gdb) p r->agmt_count
$2 = 18446744073709551615
(gdb) p (int)r->agmt_count
$3 = -1

The entire replica that includes agmt_count == (PRUint64)-1:

(gdb) p *r
$1 = {repl_root = 0x7f702f2102e0, repl_name = 0x7f702f035cf0 "dfa8f703-464f11e3-b993ea7f-4d6a3997", new_name = 0,
updatedn_list = 0x7f702f1cef20, repl_type = REPLICA_TYPE_UPDATABLE, legacy_consumer = 0, legacy_purl = 0x0, repl_rid = 1,
repl_ruv = 0x7f702f213360, repl_ruv_dirty = 0, min_csn_pl = 0x0, csn_pl_reg_id = 0x7f702f1cc940, repl_state_flags = 0,
repl_flags = 1, repl_lock = 0x7f702f1bfe40, repl_eqcxt_rs = 0x7f702f2055d0, repl_eqcxt_tr = 0x0,
repl_csngen = 0x7f702f20fa60, repl_csn_assigned = 0, repl_purge_delay = 0, tombstone_reap_stop = 0,
tombstone_reap_active = 0, tombstone_reap_interval = 0, repl_referral = 0x0, state_update_inprogress = 0,
agmt_lock = 0x7f702f201d20, locking_purl = 0x0, protocol_timeout = 120, backoff_min = 3, backoff_max = 300,
agmt_count = 18446744073709551615}
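
For readers unfamiliar with the failure mode, the value printed by gdb is what an unsigned 64-bit counter holds after being decremented from zero. The following is a minimal sketch, not code from 389-ds-base: `decr_unchecked` and `as_int` are hypothetical names, and `uint64_t` stands in for `PRUint64`.

```c
#include <stdint.h>

/* Hypothetical stand-in for decrementing a PRUint64 counter
 * without checking whether it is already zero. */
static uint64_t decr_unchecked(uint64_t count)
{
    /* Unsigned arithmetic wraps modulo 2^64, so 0 - 1 yields
     * 18446744073709551615, i.e. (uint64_t)-1. */
    return count - 1;
}

/* Mirrors the gdb cast "p (int)r->agmt_count". Converting an
 * out-of-range value to int is implementation-defined in C; on common
 * two's-complement compilers the wrapped counter reads back as -1. */
static int as_int(uint64_t count)
{
    return (int)count;
}
```

Calling `decr_unchecked(0)` returns `UINT64_MAX`, which matches the `agmt_count` value in the dump above.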

Location where agmt_count became (PRUint64)-1

(gdb) bt
0 0x00007f70227dc438 in replica_decr_agmt_count (r=0x7f702f1e8750) at ldap/servers/plugins/replication/repl5_replica.c:3971
1 0x00007f70227bdf1e in agmt_delete (rap=0x7f6ff37e9320) at ldap/servers/plugins/replication/repl5_agmt.c:606
2 0x00007f70227bdd49 in agmt_new_from_entry (e=0x7f6f8c001080) at ldap/servers/plugins/replication/repl5_agmt.c:535
3 0x00007f70227c37af in add_new_agreement (e=0x7f6f8c001080) at ldap/servers/plugins/replication/repl5_agmtlist.c:151
4 0x00007f70227c38f9 in agmtlist_add_callback (pb=0x7f6ff37edae0, e=0x7f6f8c001080, entryAfter=0x0,
returncode=0x7f6ff37e9574, returntext=0x7f6ff37e9600 "", arg=0x0) at ldap/servers/plugins/replication/repl5_agmtlist.c:188
5 0x00007f702c8384b8 in dse_call_callback (pdse=0x7f702ee63160, pb=0x7f6ff37edae0, operation=16, flags=1,
entryBefore=0x7f6f8c001080, entryAfter=0x0, returncode=0x7f6ff37e9574, returntext=0x7f6ff37e9600 "")
at ldap/servers/slapd/dse.c:2421
6 0x00007f702c8378c0 in dse_add (pb=0x7f6ff37edae0) at ldap/servers/slapd/dse.c:2171
7 0x00007f702c81b34e in op_shared_add (pb=0x7f6ff37edae0) at ldap/servers/slapd/add.c:735
8 0x00007f702c81a274 in do_add (pb=0x7f6ff37edae0) at ldap/servers/slapd/add.c:258
9 0x00007f702cd5b639 in connection_dispatch_operation (conn=0x7f702cbd7410, op=0x7f702f1bea20, pb=0x7f6ff37edae0)
at ldap/servers/slapd/connection.c:643
10 0x00007f702cd5d6a4 in connection_threadmain () at ldap/servers/slapd/connection.c:2508
11 0x00007f702ae65c76 in _pt_root (arg=0x7f702f0c9180) at ../../../nspr/pr/src/pthreads/ptthread.c:204
12 0x00007f702a808d15 in start_thread (arg=0x7f6ff37ee700) at pthread_create.c:308
13 0x00007f702a32553d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114

The value 18446744073709551615 is then used to initialize smod, which tries to allocate a huge mod array, and the server terminates. Stack trace where the server exits on the calloc failure:

0 0x00007f427bced18c in slapi_ch_calloc (nelem=18446744073709551615, size=8) at ldap/servers/slapd/ch_malloc.c:251
1 0x00007f427bd3d397 in slapi_mod_init (smod=0x7f4257ffebd0, initCount=-2) at ldap/servers/slapd/modutil.c:597
2 0x00007f4271c89812 in agmt_maxcsn_to_smod (r=0x7f427d3bb750, smod=0x7f4257ffebd0)
at ldap/servers/plugins/replication/repl5_agmt.c:2808
3 0x00007f4271ca069e in replica_write_ruv (r=0x7f427d3bb750) at ldap/servers/plugins/replication/repl5_replica.c:2608
4 0x00007f4271ca04ae in replica_update_state (when=1383615645, arg=0x7f427d3cee50)
at ldap/servers/plugins/replication/repl5_replica.c:2559
5 0x00007f427bd0bc74 in eq_call_all () at ldap/servers/slapd/eventq.c:312
6 0x00007f427bd0be1e in eq_loop (arg=0x0) at ldap/servers/slapd/eventq.c:359
7 0x00007f427a32cc76 in _pt_root (arg=0x7f427d3c69c0) at ../../../nspr/pr/src/pthreads/ptthread.c:204
8 0x00007f4279ccfd15 in start_thread (arg=0x7f4257fff700) at pthread_create.c:308
9 0x00007f42797ec53d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:114
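
The trace shows slapi_mod_init receiving initCount=-2 and slapi_ch_calloc receiving nelem=18446744073709551615 with size=8, so a negative count ended up as a huge unsigned element count. The sketch below illustrates one defensive pattern for that class of problem, under the assumption that rejecting the request is acceptable. It uses plain `calloc` rather than the server's allocator, and `alloc_would_overflow` and `alloc_from_count` are hypothetical helpers, not functions in 389-ds-base.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 when nelem * size cannot be
 * represented in size_t, 0 otherwise. */
static int alloc_would_overflow(size_t nelem, size_t size)
{
    return size != 0 && nelem > SIZE_MAX / size;
}

/* Hypothetical wrapper: validate a signed count before it is converted
 * to an unsigned element count. A negative count converted to size_t
 * becomes a value near SIZE_MAX, which is what the trace shows. */
static void *alloc_from_count(int count, size_t size)
{
    if (count < 0) {
        return NULL; /* reject instead of requesting a huge block */
    }
    if (alloc_would_overflow((size_t)count, size)) {
        return NULL;
    }
    return calloc((size_t)count, size);
}
```

With these checks, `alloc_from_count(-2, 8)` returns NULL and the caller can handle the error, instead of the process exiting inside the allocator.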

In the first stack trace, agmt_delete is called as part of error handling. Perhaps, on an error path, we should not always decrement agmt_count?

1 0x00007f70227bdf1e in agmt_delete (rap=0x7f6ff37e9320) at ldap/servers/plugins/replication/repl5_agmt.c:606
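
One way to express the idea in the question is to decrement only when the agreement was actually counted, and never below zero. This is a hedged sketch of that pattern and not the attached patch. The types `replica_sketch` and `agmt_sketch`, the `counted` flag, and both functions are hypothetical.

```c
#include <stdint.h>

/* Hypothetical, reduced stand-ins for the Replica and agreement objects. */
typedef struct {
    uint64_t agmt_count;
} replica_sketch;

typedef struct {
    replica_sketch *replica;
    int counted; /* nonzero once this agreement has been added to agmt_count */
} agmt_sketch;

/* Count the agreement at most once. */
static void sketch_incr_agmt_count(agmt_sketch *a)
{
    if (a && a->replica && !a->counted) {
        a->replica->agmt_count++;
        a->counted = 1;
    }
}

/* Undo the count only if it was taken, and never go below zero. This keeps
 * an error-path delete of a half-built agreement from wrapping the counter. */
static void sketch_decr_agmt_count(agmt_sketch *a)
{
    if (a && a->replica && a->counted && a->replica->agmt_count > 0) {
        a->replica->agmt_count--;
        a->counted = 0;
    }
}
```

In this sketch, deleting an agreement that failed before it was counted leaves `agmt_count` at 0, and a repeated delete has no further effect.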

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2013-11-08 03:09:58

attachment 0001-Ticket-47582-agmt_count-in-Replica-could-become-PRUi.patch

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2013-11-08 03:18:30

git merge ticket47582
Updating 8eecc43..d2aa2bd
Fast-forward
 ldap/servers/plugins/replication/repl5_agmt.c | 10 ++++++++++
 ldap/servers/plugins/replication/repl5_agmtlist.c | 1 -
 ldap/servers/plugins/replication/repl5_replica.c | 4 +++-
 3 files changed, 13 insertions(+), 2 deletions(-)

git push origin master
Counting objects: 17, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (9/9), done.
Writing objects: 100% (9/9), 1.20 KiB, done.
Total 9 (delta 7), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   8eecc43..d2aa2bd  master -> master

commit d2aa2bd3e0ecea84722d829f5f7c9ff0033ffaf8
Author: Mark Reynolds <mreynolds389@redhat.com>
Date: Thu Nov 7 16:09:21 2013 -0500

389-ds-bot commented 4 years ago

Comment from mreynolds (@mreynolds389) at 2017-02-11 23:12:33

Metadata Update from @mreynolds389: