CDCgov / prime-reportstream

ReportStream is a public intermediary tool for delivery of data between different parts of the healthcare ecosystem.
https://reportstream.cdc.gov
Creative Commons Zero v1.0 Universal

WA/CO/OK/ND/LA Production Failures - 1/26/23 and 1/27/23 #8100

Closed wcollin89 closed 1 year ago

wcollin89 commented 1 year ago

Problem statement

22:36:16.222 [pool-2-thread-1138][39571] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED POST of inputReportId cc010d1b-cf4b-4d06-b56a-e82c82142a3b to apiUrl=https://prd-v2-onehealthport-api.axwaycloud.com/doh/phchub/PHC-Hub/elr (orgService = wa-phd.rest), Exception: Connect timeout has expired [url=https://prd-v2-onehealthport-api.axwaycloud.com/ohp/oauth/jwt/token, connect_timeout=unknown ms], All retries failed. Manual Intervention Required. Send Error report for: cc010d1b-cf4b-4d06-b56a-e82c82142a3b to wa-phd.rest

10:32:38.284 [pool-2-thread-999][55508] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED Sftp upload of inputReportId 57deaf41-b847-4958-af68-9838d6fbe6d5 to SFTPTransportType(host=test.moveit.state.co.us, port=22, filePath=./, credentialName=null) (orgService = co-phd.elr-secondary), Exception: Connection timed out (Connection timed out), All retries failed. Manual Intervention Required. Send Error report for: 57deaf41-b847-4958-af68-9838d6fbe6d5 to co-phd.elr-secondary --- GOOD

08:38:10.763 [pool-2-thread-90][707] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED POST of inputReportId 546cb634-fb21-45fb-a0e0-fbe00b30370a to apiUrl=https://prd-v2-onehealthport-api.axwaycloud.com/doh/phchub/PHC-Hub/elr (orgService = wa-phd.rest), Exception: Connect timeout has expired [url=https://prd-v2-onehealthport-api.axwaycloud.com/ohp/oauth/jwt/token, connect_timeout=unknown ms], All retries failed. Manual Intervention Required. Send Error report for: 546cb634-fb21-45fb-a0e0-fbe00b30370a to wa-phd.rest

--- OK-PHD.ELR is good now --- 05:53:44.312 [pool-2-thread-104][414] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED POST of inputReportId 6f715684-16a0-43be-a773-2802b8e2da77 to apiUrl=https://labupload.health.ok.gov/api/document/hl7 (orgService = ok-phd.elr), Exception: Connect timeout has expired [url=https://labupload.health.ok.gov/api/auth/token, connect_timeout=unknown ms], All retries failed. Manual Intervention Required. Send Error report for: 6f715684-16a0-43be-a773-2802b8e2da77 to ok-phd.elr

02:29:40.963 [pool-2-thread-1305][42378] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED Sftp upload of inputReportId d9d5bf17-4860-48ad-86ea-ba3da9122e9a to SFTPTransportType(host=mft.nd.gov, port=22, filePath=/Home/dohdcmsg/nddoh_elr/hl7, credentialName=null) (orgService = nd-doh.elr), Exception: Connection timed out (Connection timed out), All retries failed. Manual Intervention Required. Send Error report for: d9d5bf17-4860-48ad-86ea-ba3da9122e9a to nd-doh.elr

02:38:12.526 [pool-2-thread-132][1110] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED Sftp upload of inputReportId 64fe583f-6b47-430b-8161-e81a22ff2571 to SFTPTransportType(host=204.58.124.41, port=22, filePath=./FAILED, credentialName=null) (orgService = la-doh.elr-secondary), Exception: Connection timed out (Connection timed out), All retries failed. Manual Intervention Required. Send Error report for: 64fe583f-6b47-430b-8161-e81a22ff2571 to la-doh.elr-secondary

oslynn commented 1 year ago

22:36:16.222 [pool-2-thread-1138][39571] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED POST of inputReportId cc010d1b-cf4b-4d06-b56a-e82c82142a3b to apiUrl=https://prd-v2-onehealthport-api.axwaycloud.com/doh/phchub/PHC-Hub/elr (orgService = wa-phd.rest), Exception: Connect timeout has expired [url=https://prd-v2-onehealthport-api.axwaycloud.com/ohp/oauth/jwt/token, connect_timeout=unknown ms], All retries failed. Manual Intervention Required. Send Error report for: cc010d1b-cf4b-4d06-b56a-e82c82142a3b to wa-phd.rest --- REST still checking

08:38:10.763 [pool-2-thread-90][707] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED POST of inputReportId 546cb634-fb21-45fb-a0e0-fbe00b30370a to apiUrl=https://prd-v2-onehealthport-api.axwaycloud.com/doh/phchub/PHC-Hub/elr (orgService = wa-phd.rest), Exception: Connect timeout has expired [url=https://prd-v2-onehealthport-api.axwaycloud.com/ohp/oauth/jwt/token, connect_timeout=unknown ms], All retries failed. Manual Intervention Required. Send Error report for: 546cb634-fb21-45fb-a0e0-fbe00b30370a to wa-phd.rest --- REST

05:53:44.312 [pool-2-thread-104][414] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED POST of inputReportId 6f715684-16a0-43be-a773-2802b8e2da77 to apiUrl=https://labupload.health.ok.gov/api/document/hl7 (orgService = ok-phd.elr), Exception: Connect timeout has expired [url=https://labupload.health.ok.gov/api/auth/token, connect_timeout=unknown ms], All retries failed. Manual Intervention Required. Send Error report for: 6f715684-16a0-43be-a773-2802b8e2da77 to ok-phd.elr --- REST

==== good now ====

10:32:38.284 [pool-2-thread-999][55508] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED Sftp upload of inputReportId 57deaf41-b847-4958-af68-9838d6fbe6d5 to SFTPTransportType(host=test.moveit.state.co.us, port=22, filePath=./, credentialName=null) (orgService = co-phd.elr-secondary), Exception: Connection timed out (Connection timed out), All retries failed. Manual Intervention Required. Send Error report for: 57deaf41-b847-4958-af68-9838d6fbe6d5 to co-phd.elr-secondary --- GOOD

02:29:40.963 [pool-2-thread-1305][42378] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED Sftp upload of inputReportId d9d5bf17-4860-48ad-86ea-ba3da9122e9a to SFTPTransportType(host=mft.nd.gov, port=22, filePath=/Home/dohdcmsg/nddoh_elr/hl7, credentialName=null) (orgService = nd-doh.elr), Exception: Connection timed out (Connection timed out), All retries failed. Manual Intervention Required. Send Error report for: d9d5bf17-4860-48ad-86ea-ba3da9122e9a to nd-doh.elr --- GOOD

02:38:12.526 [pool-2-thread-132][1110] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED Sftp upload of inputReportId 64fe583f-6b47-430b-8161-e81a22ff2571 to SFTPTransportType(host=204.58.124.41, port=22, filePath=./FAILED, credentialName=null) (orgService = la-doh.elr-secondary), Exception: Connection timed out (Connection timed out), All retries failed. Manual Intervention Required. Send Error report for: 64fe583f-6b47-430b-8161-e81a22ff2571 to la-doh.elr-secondary --- GOOD

oslynn commented 1 year ago

To summarize: all SFTP sites are up and accepting uploads. The REST API endpoints are still in question.
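For anyone triaging a similar batch of alerts, the SFTP-versus-REST split used in this thread can be reproduced with a small script. This is only a sketch and not part of ReportStream: the regular expression is inferred from the alert lines pasted in this issue, and the names `ALERT`, `group_by_transport`, and `SAMPLE_ALERTS` are made up for illustration.

```python
import re
from collections import defaultdict

# Pattern inferred from the FATAL-ALERT lines in this issue (an assumption,
# not an official log format). "POST" marks a REST receiver, "Sftp upload"
# marks an SFTP receiver.
ALERT = re.compile(
    r"FAILED (?P<transport>POST|Sftp upload) of inputReportId "
    r"(?P<report_id>[0-9a-f-]{36})"
    r".*?\(orgService = (?P<receiver>[^)]+)\)"
    r", Exception: (?P<error>.*?), All retries failed"
)


def group_by_transport(lines):
    """Return {"REST": [...], "SFTP": [...]} with sorted, de-duplicated receivers."""
    groups = defaultdict(set)
    for line in lines:
        match = ALERT.search(line)
        if match is None:
            continue  # not a send-failure alert; ignore it
        kind = "REST" if match["transport"] == "POST" else "SFTP"
        groups[kind].add(match["receiver"])
    return {kind: sorted(receivers) for kind, receivers in groups.items()}


# Two alert lines copied from this issue.
SAMPLE_ALERTS = [
    "22:36:16.222 [pool-2-thread-1138][39571] FATAL-ALERT "
    "gov.cdc.prime.router.azure.SendFunction - FAILED POST of inputReportId "
    "cc010d1b-cf4b-4d06-b56a-e82c82142a3b to "
    "apiUrl=https://prd-v2-onehealthport-api.axwaycloud.com/doh/phchub/PHC-Hub/elr "
    "(orgService = wa-phd.rest), Exception: Connect timeout has expired "
    "[url=https://prd-v2-onehealthport-api.axwaycloud.com/ohp/oauth/jwt/token, "
    "connect_timeout=unknown ms], All retries failed. Manual Intervention Required. "
    "Send Error report for: cc010d1b-cf4b-4d06-b56a-e82c82142a3b to wa-phd.rest",
    "02:29:40.963 [pool-2-thread-1305][42378] FATAL-ALERT "
    "gov.cdc.prime.router.azure.SendFunction - FAILED Sftp upload of inputReportId "
    "d9d5bf17-4860-48ad-86ea-ba3da9122e9a to SFTPTransportType(host=mft.nd.gov, "
    "port=22, filePath=/Home/dohdcmsg/nddoh_elr/hl7, credentialName=null) "
    "(orgService = nd-doh.elr), Exception: Connection timed out "
    "(Connection timed out), All retries failed. Manual Intervention Required. "
    "Send Error report for: d9d5bf17-4860-48ad-86ea-ba3da9122e9a to nd-doh.elr",
]

if __name__ == "__main__":
    print(group_by_transport(SAMPLE_ALERTS))
```

The `report_id` and `error` groups are captured as well, so the same pattern could feed a resend list if one is needed.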

sliu1000 commented 1 year ago

Another Production Failure for Colorado:

22:32:22.416 [pool-2-thread-285][4649] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED Sftp upload of inputReportId a2bd2897-8a1d-4134-a115-e66e7c4a03f8 to SFTPTransportType(host=prod.moveit.state.co.us, port=22, filePath=./, credentialName=null) (orgService = co-phd.elr-hl7), Exception: Connection timed out (Connection timed out), All retries failed. Manual Intervention Required. Send Error report for: a2bd2897-8a1d-4134-a115-e66e7c4a03f8 to co-phd.elr-hl7

oslynn commented 1 year ago

The co-phd.elr-hl7 endpoint is very unstable on their side and sometimes responds very slowly, so our connection attempts time out. It is good now. We may have to tweak the connection timeout.
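One way to think about tweaking the connection timeout is to give a slow receiver a longer connect window on each retry instead of a single fixed value. The sketch below illustrates that idea with Python's standard `socket` module only. It is not how SendFunction is implemented, and the function name `connect_with_backoff`, the `connector` parameter, and the timeout values (10, 30, 60 seconds) are all hypothetical.

```python
import socket
import time


def connect_with_backoff(host, port, timeouts=(10.0, 30.0, 60.0), pause=0.0,
                         connector=socket.create_connection):
    """Open a TCP connection, allowing a longer connect timeout on each attempt.

    Returns the connected socket, or raises ConnectionError once every
    attempt has failed. `connector` is injectable so the retry logic can be
    exercised without touching the network.
    """
    last_error = None
    for attempt, timeout in enumerate(timeouts, start=1):
        try:
            return connector((host, port), timeout=timeout)
        except OSError as exc:  # socket.timeout is a subclass of OSError
            last_error = exc
            if attempt < len(timeouts):
                time.sleep(pause)  # optional wait before the next attempt
    raise ConnectionError(
        f"all {len(timeouts)} attempts to {host}:{port} failed"
    ) from last_error


if __name__ == "__main__":
    # Probe the SFTP port of the receiver named in this thread.
    try:
        with connect_with_backoff("prod.moveit.state.co.us", 22) as sock:
            print("connected to", sock.getpeername())
    except ConnectionError as err:
        print("still unreachable:", err)
```

A probe like this only shows that the port accepts TCP connections within the window. It does not authenticate or upload anything, so it complements the receiver check rather than replacing it.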

oslynn commented 1 year ago

--- OK-PHD.ELR is good now --- 05:53:44.312 [pool-2-thread-104][414] FATAL-ALERT gov.cdc.prime.router.azure.SendFunction - FAILED POST of inputReportId 6f715684-16a0-43be-a773-2802b8e2da77 to apiUrl=https://labupload.health.ok.gov/api/document/hl7 (orgService = ok-phd.elr), Exception: Connect timeout has expired [url=https://labupload.health.ok.gov/api/auth/token, connect_timeout=unknown ms], All retries failed. Manual Intervention Required. Send Error report for: 6f715684-16a0-43be-a773-2802b8e2da77 to ok-phd.elr

Checked Prod and UAT endpoints: { "result" : "success", "message" : " Receiver Status: ACTIVE\nok-phd.elr: REST Transport\nAttempting to authenticate at: https://labupload.health.ok.gov/api/auth/token\nok-phd.elr: Success: received Authentication header\n ok-phd.elr: OK\n" }

oslynn commented 1 year ago

co-phd.elr-secondary is good to go now.

co-phd.elr-secondary: is a valid receiver
Receiver Status: ACTIVE
co-phd.elr-secondary: SFTP Transport: SFTPTransportType(host=test.moveit.state.co.us, port=22, filePath=./, credentialName=null)
co-phd.elr-secondary: Able to Connect to sftp site
co-phd.elr-secondary: Success: ls returned 0 rows of info from SFTP Transport
co-phd.elr-secondary: OK