uselagoon / lagoon-sync

Apache License 2.0

Check if a sync is already running and alert or prevent an additional sync from running #97

Open Schnitzel opened 12 months ago

Schnitzel commented 12 months ago

We currently have a customer for whom we believe lagoon-sync is causing problems with the database cluster.

We believe this is what is happening:

  1. The customer does a mysql sync via lagoon-sync from their local environment: the dump is created, uploaded to the remote environment, and an import starts
  2. For some reason the connection between local and remote breaks (more about this below), but the actual mysql import keeps running in the remote environment
  3. The customer starts another mysql sync, lagoon-sync does everything a second time, and we end up with two mysql imports running in the remote environment at the same time

The problem is that mysql uses locks during the import, so we now have a race condition: two imports try to import the exact same tables and data, which causes all kinds of issues on the cluster.

Also, we don't really know why the connection breaks. It could be that the developer's internet connection is bad, or maybe there is an SSH timeout? Anyway, I don't think it really matters why it breaks; we need to assume it can break at any time.

Our idea is that we should maybe implement a system where lagoon-sync realizes that a sync from another lagoon-sync instance is already running in the background, and then stops and asks the user if:

Based on the files that lagoon-sync creates, we could just look at the process list and see whether there is, for example, a process running with lagoon_sync in it. A short check of running processes showed that they look like this:

mysql -h$MARIADB_HOST -u$MARIADB_USERNAME -p$MARIADB_PASSWORD -P$MARIADB_PORT $MARIADB_DATABASE < /tmp/lagoon_sync_mariadb_1695203108414956999.sql
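To make the process-list idea concrete, here is a rough sketch in Go, not how lagoon-sync actually does it. It assumes the listing comes from something like `ps -eo pid,args` run in the remote environment, and that the `/tmp/lagoon_sync_` file prefix seen above is a reliable marker. The function name `findRunningImports` and the sample output are made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// syncMarker is the dump-file prefix seen in the process listing above.
// Treating it as the identifying marker is an assumption of this sketch.
const syncMarker = "/tmp/lagoon_sync_"

// findRunningImports scans ps-style output (one process per line, PID in the
// first column) and returns the PIDs of lines that mention a lagoon-sync dump.
func findRunningImports(psOutput string) []string {
	var pids []string
	scanner := bufio.NewScanner(strings.NewReader(psOutput))
	for scanner.Scan() {
		line := scanner.Text()
		if !strings.Contains(line, syncMarker) {
			continue
		}
		// The first whitespace-separated field is taken to be the PID.
		if fields := strings.Fields(line); len(fields) > 0 {
			pids = append(pids, fields[0])
		}
	}
	return pids
}

func main() {
	// Hypothetical sample; on a real system this string would be the output
	// of a process listing collected in the remote environment.
	sample := `  PID COMMAND
    1 /sbin/init
  412 sh -c mysql -hdb -uuser -P3306 drupal < /tmp/lagoon_sync_mariadb_1695203108414956999.sql
  533 sleep 60`

	pids := findRunningImports(sample)
	if len(pids) > 0 {
		fmt.Println("a sync import seems to be running already, PIDs:", pids)
		return
	}
	fmt.Println("no running sync import found")
}
```

With a list of PIDs in hand, lagoon-sync could stop and ask the user what to do before starting a second import.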

Or maybe we also want to create a pid file somewhere that lagoon-sync can check, so that we know which pid to actually kill. Or maybe something completely different?
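The pid file variant could look roughly like the sketch below. This is only an illustration under assumptions: the file name `lagoon_sync.pid`, the helper names `acquireLock` and `runDemo`, and the choice of an exclusive create (`O_EXCL`) are all made up here and not taken from lagoon-sync.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// acquireLock creates lockPath exclusively and writes the current PID into it.
// If the file already exists, it returns the PID recorded there and ok=false.
func acquireLock(lockPath string) (holder int, ok bool, err error) {
	f, err := os.OpenFile(lockPath, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
	if errors.Is(err, os.ErrExist) {
		// Someone else holds the lock: report which PID wrote the file.
		data, readErr := os.ReadFile(lockPath)
		if readErr != nil {
			return 0, false, readErr
		}
		pid, convErr := strconv.Atoi(strings.TrimSpace(string(data)))
		if convErr != nil {
			return 0, false, convErr
		}
		return pid, false, nil
	}
	if err != nil {
		return 0, false, err
	}
	defer f.Close()
	if _, err := fmt.Fprintf(f, "%d\n", os.Getpid()); err != nil {
		return 0, false, err
	}
	return os.Getpid(), true, nil
}

// runDemo tries to take the same lock twice inside a fresh temp directory.
func runDemo() (first, second bool, holder int, err error) {
	dir, err := os.MkdirTemp("", "lagoon-sync-lock")
	if err != nil {
		return
	}
	defer os.RemoveAll(dir)
	lockPath := filepath.Join(dir, "lagoon_sync.pid") // hypothetical file name

	if _, first, err = acquireLock(lockPath); err != nil {
		return
	}
	holder, second, err = acquireLock(lockPath)
	return
}

func main() {
	first, second, holder, err := runDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("first attempt got the lock:", first)
	fmt.Println("second attempt got the lock:", second, "- held by PID", holder)
}
```

Because the connection can break at any time, the file can outlive the import it belongs to. A real version would therefore also need to check whether the recorded pid is still alive, and remove the file when the import finishes.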

bomoko commented 11 months ago

I think I may have seen this kind of thing before - I'm pretty sure that it's the ssh connection dropping.

bomoko commented 11 months ago

First step in this is done - we've moved away from execing ssh to actually using golang's ssh lib (#105) - this should give us better error handling/control to pick up when something has broken.