gluster / glusterfs

Gluster Filesystem : Build your distributed storage in minutes
https://www.gluster.org
GNU General Public License v2.0

rpc: Improve rpc clnt connection cleanup process #4329

Open mohit84 opened 1 month ago

mohit84 commented 1 month ago

During the first rpc clnt submission we take an rpc reference and register the call_bail function with the timer thread. The timer thread calls call_bail every 10s. When a client triggers a shutdown request, it calls rpc_clnt_connection_cleanup to clean up the rpc connection. rpc_clnt_connection_cleanup cannot clean up the connection successfully because the cleanup_started flag has already been set by the upper xlator. The rpc reference is dropped only when call_bail runs again, so if call_bail fired just before the shutdown started, the application has to wait up to 10s for the rpc connection to be cleaned up, which slows down process shutdown.

Solution: Unref the rpc object based on the conn->timer/conn->reconnect pointer values, as we already do for ping_timer. These pointers are always modified inside the critical section, so if a pointer is valid we can assume the corresponding rpc reference is still held.

Fixes: #4320
credits: Xavi Hernandez xhernandez@redhat.com
Change-Id: Ib947b8bfcbe1b49e1ed05a50a84de6f92afbca13
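To illustrate the idea described in the solution, here is a minimal, self-contained sketch. It is not the actual patch: `struct mock_rpc` and every `mock_*` function are hypothetical stand-ins for the real rpc clnt types, and the refcount is kept under the same mutex purely to keep the example short. It only shows the pattern of treating a non-NULL timer pointer, read and cleared under the lock, as proof that a reference is still owed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the rpc clnt object; not the glusterfs API. */
struct mock_rpc {
    pthread_mutex_t lock;
    int refcount;
    void *timer;     /* models conn->timer (the call_bail timer) */
    void *reconnect; /* models conn->reconnect */
};

static int mock_timer_token; /* placeholder for a real timer handle */

static void mock_rpc_init(struct mock_rpc *rpc)
{
    pthread_mutex_init(&rpc->lock, NULL);
    rpc->refcount = 1; /* reference held by the owner */
    rpc->timer = NULL;
    rpc->reconnect = NULL;
}

/* Arming the timer takes one reference that the timer "owns". */
static void mock_arm_timer(struct mock_rpc *rpc)
{
    pthread_mutex_lock(&rpc->lock);
    if (rpc->timer == NULL) {
        rpc->timer = &mock_timer_token;
        rpc->refcount++;
    }
    pthread_mutex_unlock(&rpc->lock);
}

/*
 * Cleanup: a non-NULL pointer seen under the lock means the matching
 * reference has not been dropped yet, so drop it here instead of
 * waiting for the next timer tick. Returns the number of refs dropped.
 */
static int mock_connection_cleanup(struct mock_rpc *rpc)
{
    int unref = 0;

    pthread_mutex_lock(&rpc->lock);
    if (rpc->timer != NULL) {
        rpc->timer = NULL; /* a real implementation would cancel the timer */
        unref++;
    }
    if (rpc->reconnect != NULL) {
        rpc->reconnect = NULL;
        unref++;
    }
    rpc->refcount -= unref;
    assert(rpc->refcount >= 0);
    pthread_mutex_unlock(&rpc->lock);

    return unref;
}

/* A late timer callback sees NULL and must not drop the reference again. */
static int mock_timer_fired(struct mock_rpc *rpc)
{
    int dropped = 0;

    pthread_mutex_lock(&rpc->lock);
    if (rpc->timer != NULL) {
        rpc->timer = NULL;
        rpc->refcount--;
        dropped = 1;
    }
    pthread_mutex_unlock(&rpc->lock);

    return dropped;
}
```

Because the pointer is both checked and cleared inside the critical section, whichever path runs first (cleanup or the timer callback) drops the reference exactly once, and the other path becomes a no-op.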

mohit84 commented 1 month ago

/run regression

mohit84 commented 1 month ago

/run regression

mohit84 commented 1 month ago

/run regression

gluster-ant commented 1 month ago

0 test(s) failed

1 test(s) generated core ./tests/000-flaky/basic_afr_split-brain-favorite-child-policy.t

1 test(s) needed retry ./tests/000-flaky/basic_afr_split-brain-favorite-child-policy.t

1 flaky test(s) marked as success even though they failed ./tests/000-flaky/basic_afr_split-brain-favorite-child-policy.t

https://build.gluster.org/job/gh_centos7-regression/3389/

mohit84 commented 1 month ago

/run regression

gluster-ant commented 1 month ago

1 test(s) failed ./tests/basic/ec/ec-badfd.t

0 test(s) generated core

3 test(s) needed retry ./tests/000-flaky/glusterd-restart-shd-mux.t ./tests/basic/afr/ta-shd.t ./tests/basic/ec/ec-badfd.t

https://build.gluster.org/job/gh_centos7-regression/3390/

mohit84 commented 1 month ago

/run regression

gluster-ant commented 1 month ago

1 test(s) failed ./tests/basic/ec/ec-badfd.t

0 test(s) generated core

1 test(s) needed retry ./tests/basic/ec/ec-badfd.t

https://build.gluster.org/job/gh_centos7-regression/3391/