uwrit / leaf

Leaf Clinical Data Explorer
https://www.youtube.com/watch?v=ZuKKC7B8mHI
Other
86 stars 47 forks source link

Production Leaf instance does not allow users to login #555

Closed artgoldberg closed 1 year ago

artgoldberg commented 1 year ago

Hi Folks

Leaf is failing to allow users to login. The login sequence dies during the

... Loading Saved Queries

phase. It concludes with this error

Uh oh. Something went wrong while loading Leaf information from the server. Please contact your Leaf administrator.

The Leaf-API, Apache and shibd servers are happy.

We have deleted some SAML IdP endpoints that were used by a Leaf instance that is no longer running, and the current Leaf instance uses a LeafDB migrated from the Leaf instance that's not running.

No errors are reported in the Leaf log, which is attached. leaf-api-20220823-tail.log

ndobb commented 1 year ago

@artgoldberg can you check the terminal output in the browser console for the versions? Do client / API / DB versions align?

I suspect this is an error solely on the client but hard to guess.

artgoldberg commented 1 year ago

Thanks for your quick response @ndobb . I get this error: image

artgoldberg commented 1 year ago

Versions look ok: image

artgoldberg commented 1 year ago

Errors are being reported by Apache:

root@omop-leaf-temp:[httpd/logs]$ tail leaf_ssl_error_log
[Tue Aug 23 17:58:20.497063 2022] [proxy_http:error] [pid 37177:tid 140671587415808] (70007)The timeout specified has expired: [client 10.254.175.9:38950] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
[Tue Aug 23 17:58:20.497131 2022] [proxy:error] [pid 37177:tid 140671587415808] [client 10.254.175.9:38950] AH00898: Error reading from remote server returned by /api/query, referer: https://leaf.mssm.edu/
[Tue Aug 23 17:59:43.007974 2022] [proxy_http:error] [pid 37177:tid 140671570630400] (70007)The timeout specified has expired: [client 10.254.175.9:15686] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
[Tue Aug 23 17:59:43.008036 2022] [proxy:error] [pid 37177:tid 140671570630400] [client 10.254.175.9:15686] AH00898: Error reading from remote server returned by /api/query, referer: https://leaf.mssm.edu/
[Tue Aug 23 18:29:21.827925 2022] [proxy_http:error] [pid 37177:tid 140671562237696] (70007)The timeout specified has expired: [client 10.254.175.9:34346] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
[Tue Aug 23 18:29:21.828001 2022] [proxy:error] [pid 37177:tid 140671562237696] [client 10.254.175.9:34346] AH00898: Error reading from remote server returned by /api/query, referer: https://leaf.mssm.edu/
[Tue Aug 23 18:37:41.705615 2022] [proxy_http:error] [pid 37175:tid 140671495096064] (70007)The timeout specified has expired: [client 10.254.166.243:48326] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
[Tue Aug 23 18:37:41.705684 2022] [proxy:error] [pid 37175:tid 140671495096064] [client 10.254.166.243:48326] AH00898: Error reading from remote server returned by /api/query, referer: https://leaf.mssm.edu/
[Tue Aug 23 18:39:49.761417 2022] [proxy_http:error] [pid 37177:tid 140671478310656] (70007)The timeout specified has expired: [client 10.254.166.243:37282] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
[Tue Aug 23 18:39:49.761482 2022] [proxy:error] [pid 37177:tid 140671478310656] [client 10.254.166.243:37282] AH00898: Error reading from remote server returned by /api/query, referer: https://leaf.mssm.edu/
ndobb commented 1 year ago

Hmmm so looks like a timeout? Does the client hang for quite a while or does the error seem immediate?

artgoldberg commented 1 year ago

Yes, "the client hangs for quite a while " while working on "Loading Saved Queries" and then dies.

artgoldberg commented 1 year ago

Timeouts are, of course, non-deterministic errors.

These errors are not new, although there are unusually many today, in part because I'm investigating this:

root@omop-leaf-temp:[httpd/logs]$ grep "The timeout specified has expired" *
leaf_ssl_error_log:[Mon Aug 22 01:14:20.179608 2022] [proxy_http:error] [pid 72887:tid 140707767486208] (70007)The timeout specified has expired: [client 10.9.131.107:58504] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Mon Aug 22 19:25:36.803457 2022] [proxy_http:error] [pid 63197:tid 140203788277504] (70007)The timeout specified has expired: [client 10.254.167.182:29064] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 11:48:47.909402 2022] [proxy_http:error] [pid 63195:tid 140203763099392] (70007)The timeout specified has expired: [client 10.254.171.219:65213] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 14:06:07.863075 2022] [proxy_http:error] [pid 19287:tid 139694255884032] (70007)The timeout specified has expired: [client 10.6.8.157:52378] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 14:07:18.302609 2022] [proxy_http:error] [pid 19286:tid 139694239098624] (70007)The timeout specified has expired: [client 10.6.8.157:52419] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 14:09:18.595632 2022] [proxy_http:error] [pid 19389:tid 139694222313216] (70007)The timeout specified has expired: [client 10.6.8.157:52501] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 14:45:09.730682 2022] [proxy_http:error] [pid 19389:tid 139694222313216] (70007)The timeout specified has expired: [client 10.254.167.29:47370] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 14:46:40.063365 2022] [proxy_http:error] [pid 19286:tid 139694532712192] (70007)The timeout specified has expired: [client 10.254.187.57:5821] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 14:46:42.959737 2022] [proxy_http:error] [pid 19285:tid 139694264276736] (70007)The timeout specified has expired: [client 10.254.167.29:15730] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 14:50:10.007901 2022] [proxy_http:error] [pid 19286:tid 139694272669440] (70007)The timeout specified has expired: [client 10.254.187.57:50089] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 14:51:30.743350 2022] [proxy_http:error] [pid 19285:tid 139694415279872] (70007)The timeout specified has expired: [client 10.254.187.57:18631] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 14:56:05.844596 2022] [proxy_http:error] [pid 19287:tid 139694180349696] (70007)The timeout specified has expired: [client 10.254.187.57:13739] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 15:00:37.329623 2022] [proxy_http:error] [pid 19287:tid 139694264276736] (70007)The timeout specified has expired: [client 10.254.166.243:53024] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 15:00:52.089007 2022] [proxy_http:error] [pid 19287:tid 139694364923648] (70007)The timeout specified has expired: [client 10.254.166.243:36130] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 15:05:30.592582 2022] [proxy_http:error] [pid 19287:tid 139694515926784] (70007)The timeout specified has expired: [client 10.254.166.243:20004] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 17:36:50.699354 2022] [proxy_http:error] [pid 19285:tid 139694524319488] (70007)The timeout specified has expired: [client 10.254.175.9:58292] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 17:38:08.129949 2022] [proxy_http:error] [pid 19287:tid 139694415279872] (70007)The timeout specified has expired: [client 10.254.175.9:35122] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 17:39:26.406552 2022] [proxy_http:error] [pid 19286:tid 139694247491328] (70007)The timeout specified has expired: [client 10.254.175.9:12842] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 17:39:36.948718 2022] [proxy_http:error] [pid 19285:tid 139694507534080] (70007)The timeout specified has expired: [client 10.254.175.9:16064] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 17:39:47.329368 2022] [proxy_http:error] [pid 19287:tid 139694499141376] (70007)The timeout specified has expired: [client 10.254.175.9:22434] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 17:39:56.348109 2022] [proxy_http:error] [pid 19286:tid 139694281062144] (70007)The timeout specified has expired: [client 10.254.175.9:29090] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 17:51:38.773162 2022] [proxy_http:error] [pid 37175:tid 140671646164736] (70007)The timeout specified has expired: [client 10.254.175.9:34108] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 17:58:20.497063 2022] [proxy_http:error] [pid 37177:tid 140671587415808] (70007)The timeout specified has expired: [client 10.254.175.9:38950] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 17:59:43.007974 2022] [proxy_http:error] [pid 37177:tid 140671570630400] (70007)The timeout specified has expired: [client 10.254.175.9:15686] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 18:29:21.827925 2022] [proxy_http:error] [pid 37177:tid 140671562237696] (70007)The timeout specified has expired: [client 10.254.175.9:34346] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 18:37:41.705615 2022] [proxy_http:error] [pid 37175:tid 140671495096064] (70007)The timeout specified has expired: [client 10.254.166.243:48326] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 18:39:49.761417 2022] [proxy_http:error] [pid 37177:tid 140671478310656] (70007)The timeout specified has expired: [client 10.254.166.243:37282] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log:[Tue Aug 23 19:24:22.521559 2022] [proxy_http:error] [pid 37175:tid 140671654557440] (70007)The timeout specified has expired: [client 10.254.175.9:11578] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220807:[Sat Aug 06 12:04:20.740203 2022] [proxy_http:error] [pid 61195:tid 140701131609856] (70007)The timeout specified has expired: [client 10.254.163.138:16582] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220807:[Sat Aug 06 12:05:41.709673 2022] [proxy_http:error] [pid 61195:tid 140700914022144] (70007)The timeout specified has expired: [client 10.254.163.138:18480] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220814:[Tue Aug 09 14:37:04.772768 2022] [proxy_http:error] [pid 17946:tid 140707633202944] (70007)The timeout specified has expired: [client 10.254.171.192:19175] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220814:[Tue Aug 09 14:37:58.188584 2022] [proxy_http:error] [pid 17947:tid 140707624810240] (70007)The timeout specified has expired: [client 10.254.171.192:7281] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220814:[Tue Aug 09 14:38:10.922879 2022] [proxy_http:error] [pid 18068:tid 140707708737280] (70007)The timeout specified has expired: [client 10.254.171.192:17899] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220814:[Tue Aug 09 14:50:42.833123 2022] [proxy_http:error] [pid 17946:tid 140707784271616] (70007)The timeout specified has expired: [client 10.254.171.192:49651] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220814:[Tue Aug 09 14:54:49.035216 2022] [proxy_http:error] [pid 17946:tid 140707666773760] (70007)The timeout specified has expired: [client 10.254.171.39:63583] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220814:[Tue Aug 09 15:00:49.666610 2022] [proxy_http:error] [pid 17945:tid 140707775878912] (70007)The timeout specified has expired: [client 10.254.171.192:13025] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220814:[Thu Aug 11 08:58:14.505584 2022] [proxy_http:error] [pid 17946:tid 140707809449728] (70007)The timeout specified has expired: [client 10.254.176.185:58221] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220814:[Thu Aug 11 09:37:31.097141 2022] [proxy_http:error] [pid 17945:tid 140707700344576] (70007)The timeout specified has expired: [client 10.254.176.185:64271] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220821:[Tue Aug 16 10:46:31.786003 2022] [proxy_http:error] [pid 1696:tid 140707633202944] (70007)The timeout specified has expired: [client 10.146.26.63:50589] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220821:[Tue Aug 16 10:47:42.937246 2022] [proxy_http:error] [pid 1848:tid 140707801057024] (70007)The timeout specified has expired: [client 10.146.26.63:50604] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220821:[Thu Aug 18 11:30:29.521963 2022] [proxy_http:error] [pid 1848:tid 140707658381056] (70007)The timeout specified has expired: [client 10.125.164.4:64216] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220821:[Thu Aug 18 11:36:03.007615 2022] [proxy_http:error] [pid 1697:tid 140707649988352] (70007)The timeout specified has expired: [client 10.125.164.4:64333] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220821:[Thu Aug 18 12:50:19.708164 2022] [proxy_http:error] [pid 1848:tid 140707700344576] (70007)The timeout specified has expired: [client 10.125.189.8:65079] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
leaf_ssl_error_log-20220821:[Thu Aug 18 12:51:27.982018 2022] [proxy_http:error] [pid 1848:tid 140707649988352] (70007)The timeout specified has expired: [client 10.125.189.8:65086] AH01102: error reading status line from remote server 10.95.46.203:5001, referer: https://leaf.mssm.edu/
artgoldberg commented 1 year ago

As if to prove it is non-deterministic, now 2 clients log in successfully, after spending a long time in "... Loading Saved Queries": image

artgoldberg commented 1 year ago

Hi @ndobb . It seems that this problem can be addressed from both ends:

  1. Increase the length of the timeout
  2. Make the query or queries that get timed out run faster

We know that 2 is a problem in our system. But I don't know how to change 1.

Since this appears to be a non-deterministic problem depends on performance, I'm calling it a day.

Thanks much for your help.

Arthur

ndobb commented 1 year ago

Closing for now, can reopen if needed.