Repeat crashing after "directory not empty" error

snarlysodboxer commented 2 years ago

We're running about 10 Connect servers. Every few days they crash, and restarting them fails with errors about files left on the filesystem. Since we're running in Kubernetes, the pod goes into CrashLoopBackOff until we manually delete it, at which point the files are cleared and it works again for a few days. Here's an example log:

onepassword-connect-obscured-id-1 connect-sync {"log_message":"(E) Server: (txFunc returned error), deleting obsoleted file : remove /home/opuser/.op/data/files: directory not empty","timestamp":"2021-11-30T20:46:20.902654958Z","level":1}
onepassword-connect-obscured-id-1 connect-sync {"log_message":"(E) Server: (failed to vaultGrp.Wait), Wrapped: (failed to db.Do), Wrapped: (txFunc returned error), deleting obsoleted file : remove /home/opuser/.op/data/files: directory not empty","timestamp":"2021-11-30T20:46:20.902763096Z","level":1}
onepassword-connect-obscured-id-1 connect-sync {"log_message":"(E) Server: (txFunc returned error), deleting obsoleted file : remove /home/opuser/.op/data/files: directory not empty","timestamp":"2021-11-30T20:46:21.0361941Z","level":1}
onepassword-connect-obscured-id-1 connect-sync {"log_message":"(E) Server: (failed to vaultGrp.Wait), Wrapped: (failed to db.Do), Wrapped: (txFunc returned error), deleting obsoleted file : remove /home/opuser/.op/data/files: directory not empty","timestamp":"2021-11-30T20:46:21.03626576Z","level":1}
onepassword-connect-obscured-id-1 connect-sync {"log_message":"(E) failed to sync after re-authenticating, will await next notification","timestamp":"2021-11-30T20:46:21.036295745Z","level":1}

jpcoenen commented 2 years ago

Thank you for reporting this. I am pretty confident we have reproduced the problem and we are working on a fix. I'll let you know once it has been released.

jpcoenen commented 2 years ago

Hey @snarlysodboxer! We have just released v1.5.4 of Connect, which contains a fix for a similar issue. Could you let me know if that resolves your problem?

snarlysodboxer commented 2 years ago

Hi @jpcoenen, it looks like we'll also need to upgrade the connect-sdk-go version to be able to test against this new version. I hope to do that soon and will report back.

jpcoenen commented 2 years ago

That should not be necessary. We generally try to keep Connect compatible with older versions of the SDKs, so if an older version of the SDK does not work with this version of Connect, we'd really like to know. Could you tell me what issue you are running into, and with which version of the SDK?

snarlysodboxer commented 2 years ago

@jpcoenen It's version 1.4.0 of the SDK that produced an error message, something like "need version 1.3.0 or higher, detected 1.2.0." That's not verbatim; I don't have the error message in front of me right now.

EDIT: updated the version of the SDK producing the error.

snarlysodboxer commented 2 years ago

@jpcoenen Here's the exact error: "need at least version 1.3.0 of Connect for this function, detected version 1.2.0 (or earlier). Please update your Connect server". This is happening even with the updated 1.4.0 version of the connect-sdk-go client, and with the 1password/connect-api:1.5.4 and 1password/connect-sync:1.5.4 docker images.

snarlysodboxer commented 2 years ago

@jpcoenen I believe I've nailed down the exact issue, and since it's not related to this issue, I submitted it as a new one: https://github.com/1Password/connect/issues/40

snarlysodboxer commented 2 years ago

Still haven't been able to give this a good test, as there's now a third bug delineated here: https://github.com/1Password/connect/issues/42. (First two fixed though!)

snarlysodboxer commented 2 years ago

This appears to be fixed with version 1.5.6. Thanks!

1Password/connect, issue #36