disk91 / watchium-issue

20220505 - ETL Delayed #20

Closed Hpar33 closed 2 years ago

Hpar33 commented 2 years ago

For about 24 hours, data has no longer been imported, so the hotspot (HS) updates are not happening.

disk91 commented 2 years ago

We have 2 ETL servers, 2 different upgrades, and 2 different ETL issues:
1) Helium required an upgrade to version 142 within a short period of time. This upgrade includes a migration script that the documentation says should take 8 minutes. In reality, for all of us, the migration has still not finished after 36 hours. It crashed the secondary ETL, which we use but do not control; a fix is in progress but it will take a long time. We eventually cancelled it on the primary ETL, but it is slowing down synchronization.
2) Our ETL server is close to running out of disk space (the ETL database is now 5 TB, plus 0.5 TB for the blockchain); we are 96% full. The storage upgrade is in progress and we need to sync the data from the old storage to the new one. This also impacts sync time and delays the data. (The secondary ETL was supposed to hide this step, but at 96% full and growing about 1% per day, we are forced to do it now.)
3) Yesterday we had really big blocks to load (most hotspots crashed because of this; our ETL spent over 1 hour on a single block yesterday, in the context of the upgrade).
4) Yesterday at 4pm we stopped the upgrade script to get back in sync (we were 20 blocks behind around 9pm), restarted it to make progress on the update, then stopped the upgrade at 10pm (we were 50 blocks behind).
5) Yesterday at 10:46pm the ETL stopped integrating new blocks, even though the process was still running this morning at 8am. This is apparently a bug in the ETL.
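To put the disk figures in perspective, the numbers quoted above (96% full, growing about 1% per day) can be turned into a rough time-to-full estimate. This is only a sketch that assumes growth stays linear; the function name `days_until_full` is hypothetical and not part of any ETL tooling.

```python
def days_until_full(used_pct: float, growth_pct_per_day: float) -> float:
    """Days left before the disk reaches 100%, assuming linear growth."""
    if growth_pct_per_day <= 0:
        raise ValueError("growth must be positive to ever fill the disk")
    return (100.0 - used_pct) / growth_pct_per_day


# Figures from the comment: 96% used, about +1% per day.
remaining_days = days_until_full(96.0, 1.0)
print(f"about {remaining_days:.0f} days of headroom")
```

With these inputs the headroom is about 4 days, which is why the storage migration could not wait for the secondary ETL to come back.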

The ETL and database were restarted this morning at 8:30am and are now syncing at nominal speed, about 1 block every 25 s on average. The ETL is currently 500 blocks behind and should be back in sync during the day.
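The "back in sync in the day" estimate can be checked with a small calculation: while the ETL processes its backlog, the chain keeps producing blocks, so only the difference between the two rates reduces the gap. The sketch below uses the 500-block backlog and the 25 s per block sync speed from this comment; the 60 s chain block time is an assumption (the nominal Helium target), not a figure from this thread, and the function name is hypothetical.

```python
def catch_up_seconds(blocks_behind: int,
                     sync_seconds_per_block: float,
                     chain_seconds_per_block: float) -> float:
    """Seconds until the ETL reaches the chain tip while new blocks keep arriving."""
    sync_rate = 1.0 / sync_seconds_per_block    # blocks ingested per second
    chain_rate = 1.0 / chain_seconds_per_block  # blocks produced per second
    net_rate = sync_rate - chain_rate           # backlog reduction per second
    if net_rate <= 0:
        raise ValueError("ETL is not faster than the chain; it never catches up")
    return blocks_behind / net_rate


# ASSUMPTION: nominal chain block time of 60 s (not stated in this thread).
ASSUMED_CHAIN_BLOCK_TIME_S = 60.0

hours = catch_up_seconds(500, 25.0, ASSUMED_CHAIN_BLOCK_TIME_S) / 3600
print(f"estimated catch-up time: {hours:.1f} h")
```

Under these assumptions the backlog clears in roughly 6 hours, provided the sync speed holds and no further oversized blocks arrive.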

The situation will not be stable until the secondary ETL is back online (no estimated date; it is not under our control) and the migration to the new storage is finished. We are doing our best to limit the impact.

disk91 commented 2 years ago

Current status at 2pm CEST primary ETL

Secondary ETL

disk91 commented 2 years ago

Status at 3:30pm CEST primary ETL

secondary ETL

Watchium will switch back to the primary ETL as soon as the primary is close to being in sync.

disk91 commented 2 years ago

Status at 10pm CEST primary ETL

secondary ETL

watchium

disk91 commented 2 years ago

Status May 6th at 8am CEST primary ETL

secondary ETL

watchium

disk91 commented 2 years ago

Status May 6th at 12pm CEST primary ETL

secondary ETL

watchium

disk91 commented 2 years ago

Status May 6th at 10pm CEST primary ETL

secondary ETL

watchium

disk91 commented 2 years ago

Status May 7th at 11am CEST primary ETL

secondary ETL

watchium