anhnongdan / bimax_ha_migration

all db backup, recovery and migration scenarios for bimax

[New] Monitor Server behavior when cron_backup is running #18

Open anhnongdan opened 6 years ago

anhnongdan commented 6 years ago

Continues #9. The system runs in a stable state when no backup is running.

anhnongdan commented 6 years ago

On PW1 (VTV), the instance with the heaviest traffic of all:

Backup run started at 13:00.

=> Web UI: 504, then 404.
=> DB: no requests imported to the tracker.
=> Log scanning runs very slowly.

Resources are all OK

[Screenshot: 2017-10-06 at 1:33:23 PM]

anhnongdan commented 6 years ago

Backup ran from 13:00 to 13:46:37 on 2017-10-06 (log message: "backup success for 1"). After the backup finished, pw1 came back online.

[Screenshots: 2017-10-06 at 2:02:25 PM, 2:14:37 PM, 2:14:43 PM, 2:14:59 PM, 2:15:19 PM, 2:16:19 PM]

anhnongdan commented 6 years ago

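1. Between 13:20 and 13:40, xtrabackup runs binary log scanning.

  1. The table flush (FLUSH NO_WRITE_TO_BINLOG TABLES) blocks the DB like a deadlock: the flush itself and the queries behind it sit in "Waiting for table flush", constantly from 13:18 to 13:40.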
    
    Fri Oct  6 13:18:02 ICT 2017 Total wait time is high 7123 - avg. 100
    13356362        admin   localhost:35966 NULL    Query   116     Waiting for table flush FLUSH NO_WRITE_TO_BINLOG TABLES 0.000
    13370790        admin   localhost:55594 pw1     Query   115     Waiting for table flush SELECT visit_last_action_time, visit_first_action_time, idvisitor, idvisit, user_id, visit_exit_idac    0.000

    Fri Oct  6 13:40:02 ICT 2017 Total wait time is high 106035 - avg. 1413
    13356362        admin   localhost:35966 NULL    Query   1436    Waiting for table flush FLUSH NO_WRITE_TO_BINLOG TABLES 0.000
    13370790        admin   localhost:55594 pw1     Query   1435    Waiting for table flush SELECT visit_last_action_time, visit_first_action_time, idvisitor, idvisit, user_id, visit_exit_idac    0.000
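
To watch this during the next backup run, the blocked threads can be counted from a saved process list. This is a minimal Python sketch, not the script that produced the lines above. It assumes whitespace-separated rows in the column order shown in the samples (Id, User, Host, db, Command, Time, then state and query text), and it assumes "Total wait time" and "avg." are the sum and mean of the Time column. The names `parse_row` and `flush_wait_summary` are hypothetical.

```python
WAIT_STATE = "Waiting for table flush"


def parse_row(line):
    """Return (thread_id, time_s, rest) for a process list row, or None.

    Assumes the Command column is a single word (e.g. "Query"); rows such as
    "Binlog Dump" would need tab-aware parsing instead.
    """
    parts = line.split(None, 6)
    if len(parts) < 7 or not parts[0].isdigit() or not parts[5].isdigit():
        return None  # header lines, timestamps, or rows without a state
    return int(parts[0]), int(parts[5]), parts[6]


def flush_wait_summary(lines):
    """Count threads stuck in WAIT_STATE and sum/average their Time column."""
    waiting = []
    for line in lines:
        row = parse_row(line)
        if row is not None and row[2].startswith(WAIT_STATE):
            waiting.append(row[1])
    total = sum(waiting)
    avg = total // len(waiting) if waiting else 0
    return len(waiting), total, avg


# The two rows captured at 13:18:02 in this thread.
sample = [
    "Fri Oct  6 13:18:02 ICT 2017 Total wait time is high 7123 - avg. 100",
    "13356362 admin localhost:35966 NULL Query 116 "
    "Waiting for table flush FLUSH NO_WRITE_TO_BINLOG TABLES 0.000",
    "13370790 admin localhost:55594 pw1 Query 115 "
    "Waiting for table flush SELECT visit_last_action_time 0.000",
]

print(flush_wait_summary(sample))  # (2, 231, 115)
```

Only two of the waiting rows were pasted here, so the totals differ from the 7123 and 106035 reported by the real monitor, which saw the full process list.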

anhnongdan commented 6 years ago

That could be the clue.

Do I need to stop the DB first?

anhnongdan commented 6 years ago

Verified on low-traffic instances (THVL and K+): the Web UI keeps working while the backup is running.