sysrepo / sysrepo

YANG-based configuration and operational state data store for Unix/Linux applications
http://www.sysrepo.org
BSD 3-Clause "New" or "Revised" License
355 stars 235 forks source link

Get operation is timeout by write upgrade locking #3446

Open n-toshikazu opened 2 weeks ago

n-toshikazu commented 2 weeks ago

We met NETCONF get operation timeouts.

Syslog that is set with 10min timeouts for netopeer2-server -t 600 and our application sysrepo APIs.

2024-10-31T01:47:34.759170+00:00 netopeer2-server[3935]: EV ORIGIN: "/org-openroadm-device:org-openroadm-device/circuit-packs" "oper get" index 0 ID 644 published. 2024-10-31T01:57:34.759361+00:00 netopeer2-server[3935]: EV ORIGIN: "oper get" ID 644 processing timed out. 2024-10-31T01:57:34.759413+00:00 netopeer2-server[3935]: EV ORIGIN: "/org-openroadm-device:org-openroadm-device/circuit-packs" "oper get" index 0 ID 644 failed (Timeout expired). 2024-10-31T01:57:34.777778+00:00 netopeer2-server[3935]: Session 2: Sending message:#012applicationoperation-failederrorEV ORIGIN: "oper get" ID 644 processing timed out.applicationoperation-failederrorUser callback failed.

A timeout is caused by rwlock between three contexts accessing the same YANG module. ctx-A our application : execute sr_apply_changes() to change datastore ctx-B netopeer2-server: execute sr_get_data() that invoke "oper get" callback event ctx-C our application : execute sr_get_item() in "oper get" application callback

Problem: all contexts are waiting until any timeout is elapsed...

Lock sequence to become timeout seq#1. ctx-A get SR_LOCK_READ_UPGR at sr_modinfo_consolidate() seq#2. ctx-B get SR_LOCK_READ at sr_modinfo_consolidate() and invoke "oper get" => waiting for ctx-C(seq#4) response seq#3. ctx-A upgrade to SR_LOCK_WRITE_UPGR at sr_changes_notify_store() => can not upgrade, waiting for ctx-B(seq#2) SR_LOCK_READ unlock with setting "rwlock->writer = cid" seq#4. ctx-C get SR_LOCK_READ from sr_get_item() calls => can not get, waiting for ctx-A(seq#3) to reset "rwlock->writer = 0"

Use SR_LOCK_WRITE instead of SR_LOCK_WRITE_URGE

I tried to change SR_LOCK_WRITE_URGE to SR_LOCK_WRITE in sr_shmmod_modinfo_rdlock_upgrade(), timeout is cleared.

Case of SR_LOCK_WRITE is able to wait to keep rwlock->writer=0, so ctx-C is no need to wait ctx-A.

    while (!ret && (rwlock->readers[1] || (rwlock->read_count[0] > 1) || rwlock->writer)) {
        /* COND WAIT */
        ret = sr_cond_clockwait(&rwlock->cond, &rwlock->mutex, COMPAT_CLOCK_ID, &timeout_abs);
    }
michalvasko commented 1 week ago

Using the WRITE_URGE lock here has actually solved another synchronization issue when writers were being starved by readers. There are several internal goals here competing with each other such as not holding unnecessary locks, minimizing dead lock use-cases, or making functions wait as little as possible. It is almost impossible to make this balance perfect so the current state is where we got after many improvements and changes. And your use-case seems a fairly complex one so I will have to ask you to improve your application design instead of making changes to sysrepo. Sorry, but I appreciate the obvious effort you made at learning all the details of this problem.

n-toshikazu commented 1 week ago

I understand that sysrepo exclusive is getting optimized and it is important for exclusiveness to keep least wait features (even with complex sysrepo api usage unsafe).

We try to improve our program so that data provider callback won't use get-item apis at all. Callback implementation shouldn't relay on sysrepo datastore as its conditional data source now, instead, application stores the same data to local storage before being callback-ed.

Thanks.