Closed by sggerard 3 months ago
The server that hosted the NRIS-EPD service (the cocacola server) had been down since 25-05-24. As of 30-05-24, the server seems to be back up and the import is running, though a status of 'Failed' still appears and needs investigating. This could be related to the import taking a very long time: each individual record seems to take 7-8 seconds to process.
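To put the 7-8 seconds per record in perspective, here is a rough back-of-the-envelope estimate. It assumes records are processed one at a time; the record counts are illustrative assumptions, not figures measured from the importer.

```python
# Rough run-time estimate for a sequential import.
# The 7-8 seconds per record comes from the observation in this ticket;
# the record counts below are illustrative assumptions only.

def estimated_hours(record_count: int, seconds_per_record: float) -> float:
    """Return the estimated wall-clock hours to process record_count records one by one."""
    return record_count * seconds_per_record / 3600


for count in (1000, 5000):
    low = estimated_hours(count, 7.0)
    high = estimated_hours(count, 8.0)
    print(f"{count} records: {low:.1f} to {high:.1f} hours")
```

At that rate even a few thousand records would take most of a working day, so a run that long may overlap with the next scheduled run or hit another limit before it finishes. Whether that is what produces the 'Failed' status is not confirmed.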
For the EMLI import, during the period that the import was stopping with a timeout error ("General Error Error: timeout of 2000ms exceeded"), a manual query to the API returned the following message:
As of 29-06-24, the error had changed to "General Error Error: read ECONNRESET". Given that the errors around this import seemed to be mirrored by the EPD import, it is likely that this service was also running on Cocacola.
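Both errors seen so far (the 2000ms timeout and ECONNRESET) are the kind that can be transient. As a minimal sketch of how the investigation could tell transient failures from persistent ones, the example below retries a request with exponential backoff. It is written in Python for illustration only and is not the importer's actual code; `fetch`, `call_with_retry`, `is_transient`, and the retry parameters are all hypothetical names and values.

```python
import time

# Substrings taken from the two error messages reported in this ticket.
TRANSIENT_MARKERS = ("timeout of", "ECONNRESET")


def is_transient(error: Exception) -> bool:
    """Return True if the error message looks like a timeout or a connection reset."""
    message = str(error)
    return any(marker in message for marker in TRANSIENT_MARKERS)


def call_with_retry(fetch, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(), retrying transient failures with exponential backoff.

    fetch is a hypothetical zero-argument callable that performs one API request.
    Non-transient errors, and the last transient error, are re-raised unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception as error:
            if not is_transient(error) or attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, then 4x, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
```

If requests still fail after several spaced-out attempts, that points to a persistent problem on the server side rather than an occasional slow response.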
As of 30-06-24, both importers seem to be running, but both report a status of Failed. And while the EPD importer is still reporting thousands of records imported, the EMLI one is only reporting 2. Further investigation is needed to see why the Failed error is appearing.
Despite some of the import functionality returning, this page does not show the cocacola server issue as resolved. No further effort should be put into resolving these import issues until all issues with the NRISWS server have been resolved.
Updates to the outage page show that some of the affected apps have been restored, including NRIS Web Service (this was likely true yesterday, and I just missed it). As of now, the EPD importer is running and still processing records very slowly. I think it is still appropriate to wait for all issues with the cocacola server to be resolved before investigating further.
The integration needs to be looked into further before fixing. The move from the Cocacola server was the result of an unplanned failure. The new infrastructure is causing issues for our work. We need to look into its limitations and how to resolve the issue with this server.
Will need to reconfigure or rewrite this ticket.
Describe the Bug NRIS-EMLI & NRIS-EPD cron imports appear to be intermittently failing in production due to timeout errors: "General Error Error: timeout of 2000ms exceeded". The cause of the error is unknown; investigation required.
Expected Behaviour Importer completes successfully
Actual Behaviour Fails due to unknown error
Implications Records from EMLI and EPD are not being updated into NRPTI.