imtbl / hyve

Expose and consume your hydrus media via HTTP API
GNU Affero General Public License v3.0

Sync Issues (.sync-lock becomes permanent) #14

Closed: Ryonez closed this issue 4 years ago

Ryonez commented 4 years ago

For a while now, the sync process has occasionally been crashing. When that happens, the service becomes permanently locked, because .sync-lock is never removed.

A quick workaround would be to clear that file when the Docker container is restarted. That way, restarting the container would at least bring the service back into normal operation.

I don't know what the cause is, but here is the section of the logs that I believe is relevant (also on Pastebin for easier reading):


```
time="2020-05-31T13:05:36Z" level=info msg="job succeeded" iteration=46 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T14:00:00Z" level=info msg=starting iteration=47 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T14:00:01Z" level=info msg="6/1/2020, 2:00:01 AM: Running sync…" channel=stdout iteration=47 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T14:00:01Z" level=info channel=stdout iteration=47 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T14:00:01Z" level=info msg="Create initial tables (if necessary): 0.012s" channel=stdout iteration=47 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T14:00:01Z" level=info msg="Drop zombie tables: 0.000s" channel=stdout iteration=47 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T14:00:01Z" level=info msg="Attach hydrus databases: 0.031s" channel=stdout iteration=47 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T14:00:06Z" level=info msg="Get namespaces: 5.299s" channel=stdout iteration=47 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T14:00:06Z" level=info msg="Create new tables: 0.014s" channel=stdout iteration=47 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T14:00:06Z" level=info msg="Fill new namespaces table: 0.139s" channel=stdout iteration=47 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T14:00:13Z" level=info msg="Fill new tags table: 6.255s" channel=stdout iteration=47 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T14:01:21Z" level=info msg="received terminated, shutting down"
time="2020-05-31T14:01:21Z" level=info msg="waiting for jobs to finish"
Waiting 60 seconds before running initial sync…
Another sync seems to be running already, aborting. If you are certain that this is not the case, delete /data/hyve/.sync-lock and try again.
time="2020-05-31T15:03:28Z" level=info msg="read crontab: .crontab"
time="2020-05-31T16:00:00Z" level=info msg=starting iteration=0 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T16:00:00Z" level=info msg="Another sync seems to be running already, aborting. If you are certain that this is not the case, delete /data/hyve/.sync-lock and try again." channel=stdout iteration=0 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T16:00:00Z" level=info msg="job succeeded" iteration=0 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T17:00:00Z" level=info msg=starting iteration=1 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
time="2020-05-31T17:00:00Z" level=info msg="Another sync seems to be running already, aborting. If you are certain that this is not the case, delete /data/hyve/.sync-lock and try again." channel=stdout iteration=1 job.command="node /usr/src/app/services/sync-server/bin/sync" job.position=0 job.schedule="0 * * * *"
```
imtbl commented 4 years ago

It looks like the .sync-lock isn't being deleted correctly when the container is shut down. That should normally happen, but I might have missed something.

As a general rule, I agree that it would be best to remove any existing .sync-lock when starting the container as a cleanup measure, and I will add this.

I had actually considered this in the past but didn't end up implementing it, because I thought there might be use cases where the same data is shared between a hyve instance outside of Docker and one inside Docker. In that case, starting the Docker container while the outside instance was syncing would remove the lock file and eventually cause both syncs to error. But I guess that shouldn't happen very often (if at all), so I'll just put a note about it in the readme.

On a side note, your log doesn't show any errors or crashes; this looks like a normal shutdown:

```
time="2020-05-31T14:01:21Z" level=info msg="received terminated, shutting down"
```

Ryonez commented 4 years ago

Makes sense. That time is when the server brings down containers for backups. I didn't realize that when I looked, since the timestamps are in UTC (my containers get passed the correct local timezone; this one just didn't use it).

Thank you for the changes; I'll let you know if the issue occurs outside of a restart.