CollaboraOnline / online

Collabora Online is a collaborative online office suite based on LibreOffice technology. This is also the source for the Collabora Office apps for iOS and Android.
https://collaboraonline.com
Other
1.85k stars 701 forks source link

Endless retry to refresh lock #5870

Closed juliusknorr closed 1 year ago

juliusknorr commented 1 year ago

We just had a heavy request spam from our Collabora to the Nextcloud server which seemed to have been caused by attempts to refresh the lock while the WOPI token was no longer valid.

Feb 17 11:41:07 office1 coolwsd[675]: wsd-00675-606110 2023-02-17 11:41:07.116094 +0000 [ docbroker_09e ] ERR  Un-successful WOPI::Lock with status 403 and response: []| wsd/Storage.cpp:984
Feb 17 11:41:07 office1 coolwsd[675]: wsd-00675-606110 2023-02-17 11:41:07.116410 +0000 [ docbroker_09e ] ERR  Failed to refresh lock| wsd/DocumentBroker.cpp:1877
Feb 17 11:41:07 office1 coolwsd[675]: wsd-00675-606110 2023-02-17 11:41:07.175313 +0000 [ docbroker_09e ] ERR  Un-successful WOPI::Lock with status 403 and response: []| wsd/Storage.cpp:984
Feb 17 11:41:07 office1 coolwsd[675]: wsd-00675-606110 2023-02-17 11:41:07.175671 +0000 [ docbroker_09e ] ERR  Failed to refresh lock| wsd/DocumentBroker.cpp:1877

The Nextcloud server responded with a 403 but it seems that Collabora just keeps retrying to refresh the lock.

From a quick look at the DocumentBroker::refreshLock it seems that the error there is logged but not further passed, so DocumentBroker::pollThread would just continue and on the next iteration trigger the lock refresh again.

https://github.com/CollaboraOnline/online/blob/1da8d6dd24bf747295c387aa1ca49f1ac18260d4/wsd/DocumentBroker.cpp#L351-L352

https://github.com/CollaboraOnline/online/blob/1da8d6dd24bf747295c387aa1ca49f1ac18260d4/wsd/DocumentBroker.cpp#L1811-L1815

Version in use:

COOLWSD version:
21.11.8.3 git hash: 914bfd2
LOKit version:
Collabora Office 21.06.35.2 git hash: 76e5d30
Served by: Ubuntu 22.04.1 LTS 7220169b

Copyright © 2022, Collabora Productivity Limited.

Ashod commented 1 year ago

Thanks for reporting this, @juliushaertl.

Is it safe to assume the 403 return means the token in question is expired?

We do not use expired tokens to refresh the lock or upload the document. We interpret 401, 403, and 404 from PutFile to mean the token is expired, and we stop using it from then on.

It seems the issue here is that we never expected/supported getting 401/3/4 from a lock request. If you confirm my assumption, then it's straightforward fix.

Thanks.

juliusknorr commented 1 year ago

Is it safe to assume the 403 return means the token in question is expired?

Currently we return 403 but looking at the WOPI spec it should rather be 401 from https://learn.microsoft.com/en-us/microsoft-365/cloud-storage-partner-program/rest/files/refreshlock, so best might be to handle both.

We do not use expired tokens to refresh the lock or upload the document. We interpret 401, 403, and 404 from PutFile to mean the token is expired, and we stop using it from then on.

I would assume it may have happened that on put file the token was still valid but once the RefreshLock request was made the token was then expired.

From Nextclouds perspective the 403/401 could happen on any WOPI request as authentication with the access token is handled globally by a middleware that would just always return those status codes if the token is invalid/expired.

Ashod commented 1 year ago

I would assume it may have happened that on put file the token was still valid but once the RefreshLock request was made the token was then expired.

Right, that's my assumption as well.

From Nextclouds perspective the 403/401 could happen on any WOPI request as authentication with the access token is handled globally by a middleware that would just always return those status codes if the token is invalid/expired.

We now handle 401/403/404 to mean 'token expired'.

Thank you.

juliusknorr commented 1 year ago

Thanks as well for the quick fix.