medic / cht-core

The CHT Core Framework makes it faster to build responsive, offline-first digital health apps that equip health workers to provide better care in their communities. It is a central resource of the Community Health Toolkit.
https://communityhealthtoolkit.org
GNU Affero General Public License v3.0

fix(#9286): don't pass request timeout prop #9634

Closed dianabarsan closed 6 days ago

dianabarsan commented 1 week ago

Description

Removes the view indexer's request timeout parameter. The timeout did not end up terminating the request at the haproxy level.
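For context, here is a minimal sketch of a view warm-up request issued without a client-side timeout. It is not the cht-core implementation: it assumes the indexer triggers index builds by querying each view with a small `limit`, and the helper names, server URL, and view name are hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


def build_view_request(server_url: str, db: str, ddoc: str, view: str) -> Request:
    """Build a GET request for a single view of a design document.

    Querying a view is what makes CouchDB build (or catch up) its index.
    """
    # Keep ":" unescaped so staged names such as ":staged:medic-client" stay readable.
    parts = (db, "_design", ddoc, "_view", view)
    path = "/".join(quote(part, safe=":") for part in parts)
    query = urlencode({"limit": 1})
    return Request(f"{server_url.rstrip('/')}/{path}?{query}", method="GET")


def warm_view(server_url: str, db: str, ddoc: str, view: str) -> int:
    """Send the warm-up request and return the HTTP status code.

    No timeout argument is passed, so the call waits for the response
    (unless a global default was set with socket.setdefaulttimeout).
    """
    request = build_view_request(server_url, db, ddoc, view)
    with urlopen(request) as response:
        return response.status
```

With no timeout, the caller simply waits for the single request to finish instead of abandoning it on the client side while it may still be in flight behind the proxy.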

#9286

#9617

#8573

License

The software is provided under AGPL-3.0. Contributions to this project are accepted under the same license.

dianabarsan commented 1 week ago

@lorerod, I think I need some QA assist magic here to confirm this is working well.

Issue https://github.com/medic/cht-core/issues/9617 has repro steps. Since we're not updating 4.2.2, please test with an upgrade from master / this branch. I created a branch with a bunch of view changes that you can upgrade to: https://github.com/medic/cht-core/tree/all-view-updates

Appreciate your help and feedback! Thanks!

dianabarsan commented 1 week ago

@lorerod Can I please get some feedback on this? Thanks!

lorerod commented 6 days ago

I think I reproduced what's happening in https://github.com/medic/cht-core/issues/9286.

Steps executed:

1. Install branch master on a large database. (Screenshot 2024-11-15 at 1 18 05 PM)
2. Upgrade to all-view-updates. The upgrade process after a while: (Screenshot 2024-11-15 at 5 17 58 PM)
3. CHT application is inaccessible: (Screenshot 2024-11-15 at 5 17 52 PM)

What I'm not sure about is why the CouchDB logs do not stop, yet haproxy can't connect to it.

haproxy logs:

<150>Nov 15 20:23:12 haproxy[12]: 172.18.0.6,<NOSRV>,503,0,0,0,GET,/_session,-,medic,0df838bed6e8,'-',241,-1,-,'like Gecko) Chrome/130.0.0.0 Safari/537.36'
<150>Nov 15 20:23:12 haproxy[12]: 172.18.0.6,<NOSRV>,503,0,0,0,GET,/medic-user-medic-meta/,-,medic,0df838bed6e8,'-',241,-1,-,'like Gecko) Chrome/130.0.0.0 Safari/537.36'
<150>Nov 15 20:23:18 haproxy[12]: 172.18.0.6,<NOSRV>,503,0,3,0,GET,/_session,-,medic,2da25f15cd3a,'-',241,-1,-,'like Gecko) Chrome/130.0.0.0 Safari/537.36'
<150>Nov 15 20:23:18 haproxy[12]: 172.18.0.6,<NOSRV>,503,0,1,0,GET,/_session,-,medic,fbfd5d168a35,'-',241,-1,-,'like Gecko) Chrome/130.0.0.0 Safari/537.36'
<150>Nov 15 20:23:18 haproxy[12]: 172.18.0.6,<NOSRV>,503,0,0,0,GET,/medic-user-medic-meta/,-,medic,2da25f15cd3a,'-',241,-1,-,'like Gecko) Chrome/130.0.0.0 Safari/537.36'
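To make lines like these easier to compare across runs, here is a rough sketch that tallies haproxy log lines by server and status. It assumes the first three comma-separated fields after the `haproxy[pid]:` prefix are client address, server name, and HTTP status, which is what the samples above suggest. The function name is hypothetical.

```python
import re
from collections import Counter

# Assumption: fields after "haproxy[pid]: " start with client,server,status.
# Only this prefix is parsed, since later fields may contain commas themselves.
PREFIX_RE = re.compile(r"haproxy\[\d+\]: ([^,]*),([^,]*),(\d{3}),")


def count_by_server_and_status(lines):
    """Return a Counter keyed by (server, status) for every matching log line.

    haproxy reports "<NOSRV>" as the server name when no server was
    selected to handle the request.
    """
    counts = Counter()
    for line in lines:
        match = PREFIX_RE.search(line)
        if match is None:
            continue  # not a request log line in the expected shape
        _client, server, status = match.groups()
        counts[(server, int(status))] += 1
    return counts
```

A high count for `("<NOSRV>", 503)` would match the inaccessible app seen here, while `("couchdb", 200)` entries would indicate requests reaching CouchDB.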

couchdb logs:

[notice] 2024-11-15T20:19:51.303061Z couchdb@127.0.0.1 <0.533.0> -------- ratio_views: enqueueing {<<"shards/d5555552-eaaaaaa6/medic.1731669801">>,<<"_design/:staged:medic-client">>} to compact with priority 4.4375
[notice] 2024-11-15T20:19:55.245643Z couchdb@127.0.0.1 <0.533.0> -------- ratio_views: enqueueing {<<"shards/95555553-aaaaaaa7/medic.1731669801">>,<<"_design/:staged:medic-client">>} to compact with priority 4.5
[notice] 2024-11-15T20:20:20.517251Z couchdb@127.0.0.1 <0.533.0> -------- ratio_views: enqueueing {<<"shards/2aaaaaaa-3ffffffe/medic.1731669801">>,<<"_design/:staged:medic-client">>} to compact with priority 4.5
[notice] 2024-11-15T20:20:36.836359Z couchdb@127.0.0.1 <0.533.0> -------- ratio_views: enqueueing {<<"shards/7ffffffe-95555552/medic.1731669801">>,<<"_design/:staged:medic-client">>} to compact with priority 4.5
[notice] 2024-11-15T20:23:13.019692Z couchdb@127.0.0.1 <0.533.0> -------- ratio_views: enqueueing {<<"shards/15555555-2aaaaaa9/medic.1731669801">>,<<"_design/:staged:medic">>} to compact with priority 3.9375
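To check which design documents CouchDB is still working on, the `ratio_views` lines can be picked apart with a small parser. This is only a sketch based on the line shape shown above, and the function name is hypothetical.

```python
import re

# Matches: ratio_views: enqueueing {<<"shard">>,<<"ddoc">>} to compact with priority N
ENQUEUE_RE = re.compile(
    r'ratio_views: enqueueing \{<<"(?P<shard>[^"]+)">>,<<"(?P<ddoc>[^"]+)">>\} '
    r"to compact with priority (?P<priority>[\d.]+)"
)


def parse_enqueue(line):
    """Return (shard, design_doc, priority) for an enqueue line, else None."""
    match = ENQUEUE_RE.search(line)
    if match is None:
        return None
    return match.group("shard"), match.group("ddoc"), float(match.group("priority"))
```

Grouping the results by design document would show that the entries above all refer to `:staged:` design documents, that is, the views being prepared for the upgrade.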

Now I will try with this branch instead of master.

lorerod commented 6 days ago

Steps executed:

1. Install branch 9286-dont-timeout on the same large database.
2. Upgrade to all-view-updates. The upgrade process was slow but never stopped: (Screenshot 2024-11-15 at 6 26 19 PM)

During the process, haproxy didn't lose its connection to couchdb.

haproxy logs during upgrade:

<150>Nov 15 21:09:24 haproxy[12]: 172.18.0.6,couchdb,200,16,4,0,GET,/medic-sentinel/_local/transitions-seq?,sentinel,sentinel,-,'-',741,16,487,'node-fetch/1.0 (+https://github.com/bitinn/node-fetch)'
<150>Nov 15 21:09:24 haproxy[12]: 172.18.0.7,couchdb,200,238,2,0,GET,/medic-sentinel/_changes?feed=longpoll&heartbeat=10000&since=796269-g1AAAAO1eJyV0jFOwzAUBmC3KWJhAYmFIzAg24kTZ4KZDVofIM9OFIUCEzOHQEwMbKX1JRi4AEsv0TPQmoftsUKyPPySLX9679lzQshRnxlyrB-fdG_givHqguJiczwaNwROlZoNfQZkND67x71DZkRbs2LfnX-kP0fBubWriL14TJctSBCpGDoWLp3bBSz78ljOhGxbnoqh4wCrm8bKFh6rTd5wCakYOjMYrF0GbMI8VlDRyVynYuis4Nm5bWzzPbRZcW2q5Jmhs3uYkFulXiN3EhqtTce61EZRmqKE8YFvESf37UGomeG0SgeXKGH8OPcZK1QelJIaymg6uEUJ40apdQSvPVgVtCt5lw7i8NYYC2s3ARzdBdBIU4oyHcThbTDwB7oAHrx5sGxZ0Yi9X3n4BcjE-7s&limit=25,api,0,-,'-',1238,84,-,'node-fetch/1.0 (+https://github.com/bitinn/node-fetch)'
<150>Nov 15 21:09:25 haproxy[12]: 172.18.0.6,couchdb,201,118,2,0,PUT,/medic-sentinel/_local/transitions-seq,sentinel,sentinel,-,'{"_id":"_local/transitions-seq","_rev":"0-22939","value":"23087-g1AAAAO1eJyV0jtOAzEQBuAVGwivBiQ6CkpEgdb2PisicQJILOode63VKkBFh0THDbhCSHwJWkSfI9DkDCRm8KSMkCwXv-THp7HH4yiKDttYR0fq8Um1GgaMF5cJDjbGpa06ghMpR10bA248vse5fiMSwZN605l_pD9HwoW1M8IOzjyWCQFcsVAMHQtXzq0I278mDErG8uDK0HGA1Q0J2_v0mGYpFJCGYuiMoLN2StjuK2F4R17xUAydGbw4tySs_-2xKsu14DoUQ2f10ItupXwjbufLc5CbIoMkkENpiBLGO_ZiDd55sMzKIlcQDk5Rwvhx7oPA7XMP5ibXxoT2FaUlShg3Us4J7D170KhKN2loO-jx5hgTaxdr8NSDKhVVWTfhID7eAgN_oCMwntCVy4SbbGOPu193xfxR"}',383,118,58,'node-fetch/1.0 (+https://github.com/bitinn/node-fetch)'

couchdb logs during upgrade:

[notice] 2024-11-15T21:09:40.365810Z couchdb@127.0.0.1 <0.31188.21> 77140812fd haproxy:5984 172.18.0.7 medic GET /medic-sentinel/_changes?feed=longpoll&heartbeat=10000&since=796272-g1AAAAO1eJyV0jFOwzAUBmC3KWJhKRILR2BAthMnzgQzG7Q-QF6cKAoFJmYOgdiQ2CD1JRi4AEsv0TPQug_bY1XJ8vBLtvzpvWcvCCEnXaLJtH56rjsN14wXlxQXW-DRuCJwptS87xIgo_H5A-4dMy2akmX77hyQ_h0FF8YsA_bqsDpvQIKIxdAxcGXt1mPJj8NSJmTT8FgMHQtY3SxUNjis1GnFJcRi6MyhN2bw2IQ5LKOilWkdi6GzhBdrN6HND99mwWtdRM8Mne3jhNwp9Ra4U99oqVvWxjaK0gwljC98izC5XwdCyTSnRTw4oITxZ-13qFA5UEqqKaPx4AYljFulVgG8cWCR0TbnbTyIw1thfBqz9uDo3oNa6lzk8SAOb42BP9B68OjdgXnDskrs_cr9Ds4m-74&limit=25 200 ok 3622
[notice] 2024-11-15T21:09:40.513757Z couchdb@127.0.0.1 <0.14827.22> ea6cb762c1 haproxy:5984 172.18.0.6 medic PUT /medic-sentinel/0a7ca875-518b-4757-b335-82417df9df9c-info 201 ok 171
[notice] 2024-11-15T21:09:40.559480Z couchdb@127.0.0.1 <0.31188.21> e35905455b haproxy:5984 172.18.0.6 medic GET /medic-sentinel/_local/transitions-seq? 200 ok 17
[notice] 2024-11-15T21:09:40.603073Z couchdb@127.0.0.1 <0.29231.21> b94b28903b haproxy:5984 172.18.0.7 medic GET /medic-sentinel/_changes?feed=longpoll&heartbeat=10000&since=796273-g1AAAAO1eJyV0jFOwzAUBmC3KWJhKRILR2BAthMnzgQzG7Q-QF6cKAoFJmYOgdiQ2CD1JRg4AUMv0TPQug_bY1XJ8vBLtvzpvWcvCCEnXaLJtH56rjsN14wXlxQXW-DRuCJwptS87xIgo_H5A-4dMy2akmX77hyQ_h0FF8YsA_bqsDpvQIKIxdAxcGXt1mPJj8NSJmTT8FgMHQtY3SxUNjis1GnFJcRi6MyhN2bw2IQ5LKOilWkdi6GzhBdrN6HND99mwWtdRM8Mne3jhNwp9Ra4U99oqVvWxjaK0gwljC98izC5XwdCyTSnRTw4oITxZ-13qFA5UEqqKaPx4AYljFulVgG8cWCR0TbnbTyIw1thfBqz9uDo3oNa6lzk8SAOb42BP9B68OjdgXnDskrs_cr9Ds-M-78&limit=25 200 ok 238
[notice] 2024-11-15T21:09:40.676941Z couchdb@127.0.0.1 <0.14827.22> f584c28b6d haproxy:5984 172.18.0.6 medic PUT /medic-sentinel/_local/transitions-seq 201 ok 112

During the process, the app was accessible: (image)

So far, everything is looking good. I’ll provide an update once the upgrade process is completed, but there haven’t been any signs of the issue yet.

dianabarsan commented 6 days ago

Thanks so much for the repro, @lorerod !!!

lorerod commented 6 days ago

Fixed in 9286-dont-timeout. The upgrade was successful. (image)