9.1 stable Federated share not longer available (storage not available)

mmattel commented 7 years ago

### Steps to reproduce No clue what happend, but from one minute to another my federated share (from external) was no longer accessible. I identified it by chance when I tried to acess it via the browser. Also a reboot did not help. The interresing thing is the GuzzleHttp error message (for details see log below). cURL error 6: Could not resolve host: `subdomin.domain.comocs` pls see the `ocs` at the end. Looking into table oc_share_external you have the right url `https://subdomin.domain.com/` present. After removing the federated share and accepting a new invite, the share is available again. ### Expected behaviour Federated share stays accessible ### Actual behaviour When accessing the federated share via the browser, a "Storage not available" message appears. The sync client (2.2.3) does not tell about a problem, shows full used capacity. It looks like things get synced (green sync symbol after a while) but does not sync it in reality. After pausing sync and reenabling it, the sync symbols do not appear. ### Server configuration **Operating system**: Ubuntu 14.04 **Web server:** nginx **Database:** mysql **PHP version:** 5.5.9 **ownCloud version:** (see ownCloud admin page) 9.1.0 stable **Updated from an older ownCloud or fresh install:** fresh **Where did you install ownCloud from:** tar **Signing status (ownCloud 9.0 and above):** no errors logged **List of activated apps:** ``` Enabled: - activity: 2.3.2 - comments: 0.3.0 - dav: 0.2.5 - federatedfilesharing: 0.3.0 - federation: 0.1.0 - files: 1.5.1 - files_external: 0.6.0 - files_pdfviewer: 0.8.1 - files_sharing: 0.10.0 - files_texteditor: 2.1 - files_trashbin: 0.9.0 - files_versions: 1.3.0 - files_videoplayer: 0.9.8 - firstrunwizard: 1.1 - gallery: 15.0.0 - logreader: 1.1.5 - notifications: 0.3.0 - provisioning_api: 0.5.0 - systemtags: 0.3.0 - templateeditor: 0.1 - updatenotification: 0.2.1 Disabled: - encryption - external - files_antivirus - user_external - user_ldap ``` **The content of config/config.php:** ``` { "system": { "instanceid": "ock7fv8dydb8", "passwordsalt": "***REMOVED SENSITIVE VALUE***", "secret": "***REMOVED SENSITIVE VALUE***", "trusted_domains": [ "xxx" ], "datadirectory": "\/var\/www\/owncloud\/data", "overwrite.cli.url": "https:\/\/xx", "dbtype": "mysql", "version": "9.1.0.15", "dbname": "owncloud", "dbhost": "localhost", "dbtableprefix": "oc_", "dbuser": "***REMOVED SENSITIVE VALUE***", "dbpassword": "***REMOVED SENSITIVE VALUE***", "logtimezone": "Europe\/Vienna", "installed": true, "filelocking.enabled": "true", "memcache.local": "\\OC\\Memcache\\Redis", "memcache.locking": "\\OC\\Memcache\\Redis", "memcache.distributed": "\\OC\\Memcache\\Redis", "redis": { "host": "localhost", "port": 6379, "timeout": 0, "dbindex": 0 }, "asset-pipeline.enabled": false, "filesystem_check_changes": 1, "preview_max_scale_factor": 1, "enabledPreviewProviders": [ "OC\\Preview\\PNG", "OC\\Preview\\JPEG", "OC\\Preview\\GIF", "OC\\Preview\\Illustrator", "OC\\Preview\\Postscript", "OC\\Preview\\Photoshop", "OC\\Preview\\TIFF", "OC\\Preview\\Raw" ], "mail_from_address": "xx", "mail_smtpmode": "smtp", "mail_domain": "xx", "mail_smtpsecure": "tls", "mail_smtpauthtype": "LOGIN", "mail_smtphost": "xxx", "mail_smtpport": "587", "loglevel": 2, "maintenance": false } } ``` **Are you using external storage, if yes which one:** local/smb/sftp/... yes, smb, federation **Are you using encryption:** yes/no no **Are you using an external user-backend, if yes which one:** LDAP/ActiveDirectory/Webdav/... no ### Client configuration **Browser:** Opera **Operating system:** W7x64 ### Logs #### Web server error log ``` No corresponding log entries ``` #### ownCloud log (data/owncloud.log) ``` {"reqId":"xmrqv3i5uPskq\/eq3wiT","remoteAddr":"ww.xx.yy.zz","app":"federation","message":"Exception: {\"Exception\":\"GuzzleHttp\\\\Exception\\\\ConnectException\",\"Message\":\"cURL error 6: Could not resolve host: subdomin.domain.comocs\",\"Code\":0,\"Trace\":\" #0 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/guzzlehttp\\\/guzzle\\\/src\\\/RequestFsm.php(104): GuzzleHttp\\\\Exception\\\\RequestException::wrapException(Object(GuzzleHttp\\\\Message\\\\Request), Object(GuzzleHttp\\\\Ring\\\\Exception\\\\ConnectException))\\ n#1 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/guzzlehttp\\\/guzzle\\\/src\\\/RequestFsm.php(132): GuzzleHttp\\\\RequestFsm->__invoke(Object(GuzzleHttp\\\\Transaction))\\ n#2 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/react\\\/promise\\\/src\\\/FulfilledPromise.php(25): GuzzleHttp\\\\RequestFsm->GuzzleHttp\\\\{closure}(Array)\\ n#3 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/guzzlehttp\\\/ringphp\\\/src\\\/Future\\\/CompletedFutureValue.php(55): React\\\\Promise\\\\FulfilledPromise->then(Object(Closure), NULL, NULL)\\ n#4 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/guzzlehttp\\\/guzzle\\\/src\\\/Message\\\/FutureResponse.php(43): GuzzleHttp\\\\Ring\\\\Future\\\\CompletedFutureValue->then(Object(Closure), NULL, NULL)\\ n#5 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/guzzlehttp\\\/guzzle\\\/src\\\/RequestFsm.php(135): GuzzleHttp\\\\Message\\\\FutureResponse::proxy(Object(GuzzleHttp\\\\Ring\\\\Future\\\\CompletedFutureArray), Object(Closure))\\ n#6 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/guzzlehttp\\\/guzzle\\\/src\\\/RequestFsm.php(132): GuzzleHttp\\\\RequestFsm->__invoke(Object(GuzzleHttp\\\\Transaction))\\ n#7 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/react\\\/promise\\\/src\\\/FulfilledPromise.php(25): GuzzleHttp\\\\RequestFsm->GuzzleHttp\\\\{closure}(Array)\\ n#8 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/guzzlehttp\\\/ringphp\\\/src\\\/Future\\\/CompletedFutureValue.php(55): React\\\\Promise\\\\FulfilledPromise->then(Object(Closure), NULL, NULL)\\ n#9 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/guzzlehttp\\\/guzzle\\\/src\\\/Message\\\/FutureResponse.php(43): GuzzleHttp\\\\Ring\\\\Future\\\\CompletedFutureValue->then(Object(Closure), NULL, NULL)\\ n#10 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/guzzlehttp\\\/guzzle\\\/src\\\/RequestFsm.php(135): GuzzleHttp\\\\Message\\\\FutureResponse::proxy(Object(GuzzleHttp\\\\Ring\\\\Future\\\\CompletedFutureArray), Object(Closure))\\ n#11 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/guzzlehttp\\\/guzzle\\\/src\\\/Client.php(165): GuzzleHttp\\\\RequestFsm->__invoke(Object(GuzzleHttp\\\\Transaction))\\ n#12 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/guzzlehttp\\\/guzzle\\\/src\\\/Client.php(125): GuzzleHttp\\\\Client->send(Object(GuzzleHttp\\\\Message\\\\Request))\\ n#13 \\\/var\\\/www\\\/owncloud\\\/lib\\\/private\\\/Http\\\/Client\\\/Client.php(136): GuzzleHttp\\\\Client->get('http:\\\/\\\/owncloud...', Array)\\ n#14 \\\/var\\\/www\\\/owncloud\\\/apps\\\/federation\\\/lib\\\/BackgroundJob\\\/GetSharedSecret.php(156): OC\\\\Http\\\\Client\\\\Client->get('http:\\\/\\\/owncloud...', Array)\\ n#15 \\\/var\\\/www\\\/owncloud\\\/lib\\\/private\\\/BackgroundJob\\\/Job.php(52): OCA\\\\Federation\\\\BackgroundJob\\\\GetSharedSecret->run(Array)\\ n#16 \\\/var\\\/www\\\/owncloud\\\/apps\\\/federation\\\/lib\\\/BackgroundJob\\\/GetSharedSecret.php(134): OC\\\\BackgroundJob\\\\Job->execute(Object(OC\\\\BackgroundJob\\\\JobList), Object(OC\\\\Log))\\ n#17 \\\/var\\\/www\\\/owncloud\\\/apps\\\/federation\\\/lib\\\/BackgroundJob\\\/GetSharedSecret.php(119): OCA\\\\Federation\\\\BackgroundJob\\\\GetSharedSecret->parentExecute(Object(OC\\\\BackgroundJob\\\\JobList),Object(OC\\\\Log))\\ n#18 \\\/var\\\/www\\\/owncloud\\\/cron.php(144): OCA\\\\Federation\\\\BackgroundJob\\\\GetSharedSecret->execute(Object(OC\\\\BackgroundJob\\\\JobList), Object(OC\\\\Log))\\ n#19 {main}\",\"File\":\"\\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/guzzlehttp\\\/guzzle\\\/src\\\/Exception\\\/RequestException.php\",\"Line\":49}","level":3,"time":"2016-08-18T15:26:44+02:00","method":"GET","url":"\/cron.php","user":"--"} ```

PVince81 commented 7 years ago

@DeepDiver1975 more discovermanager woes ?

PVince81 commented 7 years ago

Missing trailing slash apparently.

Setting to critical as it might break existing federated shares. @DeepDiver1975 can you have a look ?

DeepDiver1975 commented 7 years ago

Looking into table oc_share_external you have the right url https://subdomin.domain.com/ present.

@mmattel can you please have a look into the table oc_trusted_servers? THX

mmattel commented 7 years ago

Due to urgent business needs we removed yesterday the old federation share and re-invited / added the new one. oc_trusted_servers contains the correct address (at least with the re-added federation share). I will monitor this table in future too...

cURL error 6: Could not resolve host: subdomin.domain.comocs pls see the ocs at the end.

I have no clue where this bogus ocs came from

PVince81 commented 7 years ago

@DeepDiver1975 we should probably add some ltrim in the code just to be sure. There are too many places in the code that just assume slash existence.

DeepDiver1975 commented 7 years ago

I have no clue where this bogus ocs came from

this is from the url which is being called

    private $endPoint = '/ocs/v2.php/apps/federation/api/v1/shared-secret?format=json';

PVince81 commented 7 years ago

@mmattel does the real name of "subdomin.domain.com" also contain the string "ocs" in its legitimate part ? Just asking because at some point we had a weird routing issue when people deployed their ownCloud to "/ocs"

DeepDiver1975 commented 7 years ago

Due to urgent business needs we removed yesterday the old federation share and re-invited / added the new one. oc_trusted_servers contains the correct address (at least with the re-added federation share). I will monitor this table in future too...

can you recreate the table's content from a backup?

mmattel commented 7 years ago

@DeepDiver1975 sh.. there is no backup. I do not even have a snapshot of the filesystem where I could recover the database and take a look into (forgot to reenable snapshot scheduling). The incident appeared, I wrote the issue and we needed to continue business...

@PVince81 No, I can assure you that ocs is in no part of "subdomin.domain.com" We also use "subdomin.domain.com" and not "domain.com/instance" (I can not provide you domain or user name or IP address details for security reasons...)

mmattel commented 7 years ago

@DeepDiver1975 now I remember why I do not have snapshots enabled anymore. Because those create in each folder a .snapshot directory. As my DB and instance are on nfs and owncloud can not exclude defined folders from beein processed, those would immediately beeing scanned and processed... For you sure remember my attempt of excluding defined directory names :weary:

DeepDiver1975 commented 7 years ago

For you sure remember my attempt of excluding defined directory names :weary:

yes I do

mmattel commented 7 years ago

@DeepDiver1975 some good news, not going into the details how, but I there was a snapshot created I was able to access (but not visible...). I copied the database status from a night ago (not this night) and remounted it, pls see the result:

The url´s are fine and identical, but the token is missing in old... I can not say when the incident exactly occured, but it must have been between day -2 (where it worked) and day -1, as day 0 was when I recognized the incident

old: day -1 snapshot at 0h

new: now

mmattel commented 7 years ago

Update, looking into the owncloud log just now, I saw the same error occured 6h ago again. This means this new error occured AFTER the old fed share was removed and the new onw was accepted AND now. Testing acccess now, it worked...

DeepDiver1975 commented 7 years ago

so the token was removed .....

PVince81 commented 7 years ago

There must be a token-eating monster in the code then...

mmattel commented 7 years ago

Bad news: I have been on a business trip for some days. Coming back and logging in via browser, trying to access the fed_share I suddenly get Storage not available, happens also when accessing via a webdav client. The very strange thing is, that it had worked after removing the fed_share and readding it, see above...

To be noted, the fed_share_name looks like Owncloud Fedshare means a blank in the name. Wasn´t there a possible issue recently ? oc_trusted_servers looks fine.

So the thing is, before we do get a curl error, now a http 503 error but in all cases it is related to fed_shares.

If I do on my and on the fed_share providing system a browser call: https://subdomain.domain.com/status.php I get proper responses and both systems run identical versions. {"installed":true,"maintenance":false,"version":"9.1.0.15","versionstring":"9.1.0","edition":""}

I also cleared the browser cache and rebooted the notebook without any positive effect.

Here is the log that is generated when accessing the fed_share.

Exception: {"Message":"HTTP\/1.1 503 Storage not available","Exception":"Sabre\\DAV\\Exception\\ServiceUnavailable","Code":0,"Trace":"
#0 \/var\/www\/owncloud\/3rdparty\/sabre\/dav\/lib\/DAV\/Server.php(903): OCA\\DAV\\Connector\\Sabre\\ObjectTree->getNodeForPath('fed_share_name...')\
n#1 \/var\/www\/owncloud\/3rdparty\/sabre\/dav\/lib\/DAV\/CorePlugin.php(334): Sabre\\DAV\\Server->getPropertiesForPath(' fed_share_name...', Array, 1)\
n#2 [internal function]: Sabre\\DAV\\CorePlugin->httpPropFind(Object(Sabre\\HTTP\\Request), Object(Sabre\\HTTP\\Response))\
n#3 \/var\/www\/owncloud\/3rdparty\/sabre\/event\/lib\/EventEmitterTrait.php(105): call_user_func_array(Array, Array)\
n#4 \/var\/www\/owncloud\/3rdparty\/sabre\/dav\/lib\/DAV\/Server.php(459): Sabre\\Event\\EventEmitter->emit('method:PROPFIND', Array)\
n#5 \/var\/www\/owncloud\/3rdparty\/sabre\/dav\/lib\/DAV\/Server.php(248): Sabre\\DAV\\Server->invokeMethod(Object(Sabre\\HTTP\\Request), Object(Sabre\\HTTP\\Response))\
n#6 \/var\/www\/owncloud\/apps\/dav\/appinfo\/v1\/webdav.php(56): Sabre\\DAV\\Server->exec()\
n#7 \/var\/www\/owncloud\/remote.php(164): require_once('\/var\/w...')\
n#8 {main}","File":"\/var\/www\/owncloud\/apps\/dav\/lib\/Connector\/Sabre\/ObjectTree.php","Line":159,"User":"user_x"}

mmattel commented 7 years ago

oc_trusted_servers I am wondering, compared to the other two screenshots of this table above, why are there changes like in sync token populated with an url which has not been there before and sync status is now 1 when it was 2 before. Pls remember that the second screenshot above was made directly after the readding and successfully accessing the fed_share. Why are these things changing and could there be a relationship causing the issue? NOTHING has been changed in the setup of my instance...

mmattel commented 7 years ago

Just waiting and watching the logs...

Error federation
Exception: {"Exception":"GuzzleHttp\\Exception\\ConnectException","Message":"cURL error 28: Resolving timed out after 3515 milliseconds","Code":0,"Trace":"
#0 \/var\/www\/owncloud\/3rdparty\/guzzlehttp\/guzzle\/src\/RequestFsm.php(104): GuzzleHttp\\Exception\\RequestException::wrapException(Object(GuzzleHttp\\Message\\Request), Object(GuzzleHttp\\Ring\\Exception\\ConnectException))\
n#1 \/var\/www\/owncloud\/3rdparty\/guzzlehttp\/guzzle\/src\/RequestFsm.php(132): GuzzleHttp\\RequestFsm->__invoke(Object(GuzzleHttp\\Transaction))\
n#2 \/var\/www\/owncloud\/3rdparty\/react\/promise\/src\/FulfilledPromise.php(25): GuzzleHttp\\RequestFsm->GuzzleHttp\\{closure}(Array)\
n#3 \/var\/www\/owncloud\/3rdparty\/guzzlehttp\/ringphp\/src\/Future\/CompletedFutureValue.php(55): React\\Promise\\FulfilledPromise->then(Object(Closure), NULL, NULL)\
n#4 \/var\/www\/owncloud\/3rdparty\/guzzlehttp\/guzzle\/src\/Message\/FutureResponse.php(43): GuzzleHttp\\Ring\\Future\\CompletedFutureValue->then(Object(Closure), NULL, NULL)\
n#5 \/var\/www\/owncloud\/3rdparty\/guzzlehttp\/guzzle\/src\/RequestFsm.php(135): GuzzleHttp\\Message\\FutureResponse::proxy(Object(GuzzleHttp\\Ring\\Future\\CompletedFutureArray), Object(Closure))\
n#6 \/var\/www\/owncloud\/3rdparty\/guzzlehttp\/guzzle\/src\/Client.php(165): GuzzleHttp\\RequestFsm->__invoke(Object(GuzzleHttp\\Transaction))\
n#7 \/var\/www\/owncloud\/3rdparty\/guzzlehttp\/guzzle\/src\/Client.php(125): GuzzleHttp\\Client->send(Object(GuzzleHttp\\Message\\Request))\
n#8 \/var\/www\/owncloud\/lib\/private\/Http\/Client\/Client.php(136): GuzzleHttp\\Client->get('http:\/\/owncloud...', Array)\
n#9 \/var\/www\/owncloud\/apps\/federation\/lib\/BackgroundJob\/GetSharedSecret.php(156): OC\\Http\\Client\\Client->get('http:\/\/owncloud...', Array)\
n#10 \/var\/www\/owncloud\/lib\/private\/BackgroundJob\/Job.php(52): OCA\\Federation\\BackgroundJob\\GetSharedSecret->run(Array)\
n#11 \/var\/www\/owncloud\/apps\/federation\/lib\/BackgroundJob\/GetSharedSecret.php(134): OC\\BackgroundJob\\Job->execute(Object(OC\\BackgroundJob\\JobList), Object(OC\\Log))\
n#12 \/var\/www\/owncloud\/apps\/federation\/lib\/BackgroundJob\/GetSharedSecret.php(119): OCA\\Federation\\BackgroundJob\\GetSharedSecret->parentExecute(Object(OC\\BackgroundJob\\JobList), Object(OC\\Log))\
n#13 \/var\/www\/owncloud\/cron.php(144): OCA\\Federation\\BackgroundJob\\GetSharedSecret->execute(Object(OC\\BackgroundJob\\JobList), Object(OC\\Log))\n#14 {main}","File":"\/var\/www\/owncloud\/3rdparty\/guzzlehttp\/guzzle\/src\/Exception\/RequestException.php","Line":49}

This one was present a couple of times

Error files
Exception: {"Exception":"BadMethodCallException","Message":"Not in batch","Code":0,"Trace":"
#0 \/var\/www\/owncloud\/lib\/private\/Files\/Utils\/Scanner.php(143): OC\\Files\\Cache\\Propagator->commitBatch()\
n#1 \/var\/www\/owncloud\/apps\/files\/lib\/BackgroundJob\/ScanFiles.php(87): OC\\Files\\Utils\\Scanner->backgroundScan('')\
n#2 \/var\/www\/owncloud\/apps\/files\/lib\/BackgroundJob\/ScanFiles.php(111): OCA\\Files\\BackgroundJob\\ScanFiles->runScanner(Object(OC\\User\\User))\
n#3 \/var\/www\/owncloud\/lib\/private\/BackgroundJob\/Job.php(52): OCA\\Files\\BackgroundJob\\ScanFiles->run(NULL)\
n#4 \/var\/www\/owncloud\/lib\/private\/BackgroundJob\/TimedJob.php(53): OC\\BackgroundJob\\Job->execute(Object(OC\\BackgroundJob\\JobList), Object(OC\\Log))\
n#5 \/var\/www\/owncloud\/cron.php(144): OC\\BackgroundJob\\TimedJob->execute(Object(OC\\BackgroundJob\\JobList), Object(OC\\Log))\
n#6 {main}","File":"\/var\/www\/owncloud\/lib\/private\/Files\/Cache\/Propagator.php","Line":145}

PVince81 commented 7 years ago

BadMethodCallException

https://github.com/owncloud/core/issues/25506

PVince81 commented 7 years ago

Not sure why the tokens would magically change. There are some cron jobs dealing with system address book syncing or so but I don't think it would change any tokens.

mmattel commented 7 years ago

I have a real problem now... I can not remove / delete the federated share. I always get "Storage not available" Rebooting the server did not change anything... Any advices please :disappointed_relieved:

Exception: {"Message":"HTTP\/1.1 503 Storage not available"

Exception: {"Exception":"OCP\\Files\\StorageNotAvailableException","Message":"",

mmattel commented 7 years ago

@PVince81 @DeepDiver1975

CRITICAL :bomb: :rage: (not personal)

Because I was not able to remove the fed_share, I decided to kill the instance and reset it up from scratch, means deleting the database and removing all files. No leftovers. I also reconnected to the other owncloud and had a working fed_share. Adding users and groups, assigning groups to folders on the fed_share, things were working fine, sync client also doing fine.

This was 6 days ago.

Today I got a call that the fed_share is inaccessible, again, storage not available, local stuff and SMB works. Nothing has been changed in the setup.

Error   PHP OCP\Files\StorageNotAvailableException:
at /var/www/owncloud/lib/private/Files/Storage/Wrapper/Availability.php#75

Fatal   webdav  Sabre\DAV\Exception\ServiceUnavailable: HTTP/1.1 503 Storage not available

Please let me know where I can look into helping to find the root cause I do / did not change anything to keep the system as it is, but we need to fix that. I can´t play this kill-reset game all the time...

PVince81 commented 7 years ago

When seeing "ServiceUnavailable", the first thing to check is "oc_storages" and see if the "shared::" storage of the federated share has been marked unavailable. Usually OC won't recheck for availability until after 10 minutes. update oc_storages set available=1, last_checked=null would reset the timer.

Now if after that the fed share is still unavailable, you might see a more detailed exception from the retry. If it does work again, then maybe it was unavailable for a short time (network issue?) and was marked unavailable for 10 minutes.

Let's see if you can find more clues this way.

mmattel commented 7 years ago

Yes, in oc_storages, available was marked 0 on the fed_share shared::xxxxx

After update oc_storages set available=1, last_checked=null I tried to access the fed_share - and it was accassible again.

Hmmm, after 30min, last_checked did not change on none of the entries. Now, it just changed, but only on the SMB share. Shouldnt it change on all entries?

Both ownClouds use a fix IP address and stable servers. My firewall had last week a dying powersupply interrupting my access sporadically, I changed it immediately, access is since days fine again, but this should not break things in owncloud...

PVince81 commented 7 years ago

@mmattel hmmm, then maybe there is a bug that it doesn't properly retries federates shares.

mmattel commented 7 years ago

Possibly, just checked oc_storages again, beside SMB, last_checked has not changed on other items.

ID last_checked home NULL local NULL smb 1473156462 shared NULL

PVince81 commented 7 years ago

Confirmed.

Since this has no direct relationship with the original issue about missing token / weird URL, raised a new one here: https://github.com/owncloud/core/issues/26037

mmattel commented 7 years ago

Thx for taking care of!

Q: are home and local storage types not /never avalability checked (because SMB was)? Q: is ist guaranteed that other storage types are checked?

PVince81 commented 7 years ago

@mmattel I think all storages are covered except local shares.

The logic is as follows, using a storage wrapper:

Storage is used normally (no availability check)
Whenever the exception "StorageNotAvailableException" occurs in any of the storage's methods, the availability wrapper catches it and sets the "availability" flag to 0
Later on, for any storage method call: if TTL not expired, continue throwing exception without check and reset TTL by calling setAvailability
Later on, for any storage method call: if TTL expired, do an availability check by calling Storage::test(). If true, reset the flag back to 1.

The bug was that Storage::test() didn't return true for federated shares, so it couldn't reset the flag.

In a regular workflow, the method Storage::test() is never called. (exception: for ext storages in the configuration page)

PVince81 commented 7 years ago

The bug was that Storage::test() didn't return true for federated shares, so it couldn't reset the flag.

so basically it was checked, but its result wasn't propagated properly!

mmattel commented 7 years ago

Thx for explaining. Three things coming to my mind:

What about renaming Storage not available to Storage temorary not available in browser/log
Wyh do we need a Fatal 503 Storage not available not just an Error ...
What about to log the possible check time and/or interval in case it is not available?

PVince81 commented 7 years ago

Maybe, yes. The trouble might be to find all code paths where this exception is logged and add a catch block to log it as error instead, and also convert the message if UI elements are involved.

e-alfred commented 7 years ago

I am also getting this "service unavailable" in the logs. I have only one file on a federated share on another Owncloud 9.1 instance (both running 9.1), and this works well for some time (weeks, days). Then, suddenly, the sync clients show me that this shared file got updated every few minutes. After some time, the sync client can't download the shared file anymore. If I try to delete the file in the web UI I get this error in the logs:

{"reqId":"V9u@Gn8AAQEAAFNSRRkAAAAB","remoteAddr":"1.1.1.1","app":"webdav","message":"Exception: {\"Message\":\"HTTP\\\/1.1 503 Storage not available\",\"Exception\":\"Sabre\\\\DAV\\\\Exception\\\\ServiceUnavailable\",\"Code\":0,\"Trace\":\"#0 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/DAV\\\/Tree.php(178): OCA\\\\DAV\\\\Connector\\\\Sabre\\\\ObjectTree->getNodeForPath('file.name')\\n#1 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/DAV\\\/CorePlugin.php(285): Sabre\\\\DAV\\\\Tree->delete('file.name')\\n#2 [internal function]: Sabre\\\\DAV\\\\CorePlugin->httpDelete(Object(Sabre\\\\HTTP\\\\Request), Object(Sabre\\\\HTTP\\\\Response))\\n#3 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/sabre\\\/event\\\/lib\\\/EventEmitterTrait.php(105): call_user_func_array(Array, Array)\\n#4 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/DAV\\\/Server.php(459): Sabre\\\\Event\\\\EventEmitter->emit('method:DELETE', Array)\\n#5 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/DAV\\\/Server.php(248): Sabre\\\\DAV\\\\Server->invokeMethod(Object(Sabre\\\\HTTP\\\\Request), Object(Sabre\\\\HTTP\\\\Response))\\n#6 \\\/var\\\/www\\\/owncloud\\\/apps\\\/dav\\\/appinfo\\\/v1\\\/webdav.php(56): Sabre\\\\DAV\\\\Server->exec()\\n#7 \\\/var\\\/www\\\/owncloud\\\/remote.php(164): require_once('\\\/var\\\/www\\\/ownclo...')\\n#8 {main}\",\"File\":\"\\\/var\\\/www\\\/owncloud\\\/apps\\\/dav\\\/lib\\\/Connector\\\/Sabre\\\/ObjectTree.php\",\"Line\":159,\"User\":\"user\"}","level":4,"time":"2016-09-16T11:40:42+02:00","method":"DELETE","url":"\/owncloud\/remote.php\/webdav\file.name","user":"user"}

The server where this file is shared from is online, all the time.

To get it running again, I have to remove the offending entries from the oc_share_external table in the Mysql database. Then everything works again, and I can re-add the federated share as normal (until I run into the same problem again after some time).

PVince81 commented 7 years ago

@e-alfred did you try with 9.1.1RC2 which contains the fix from https://github.com/owncloud/core/issues/26037 ?

PVince81 commented 7 years ago

@mmattel I have improved the error message here https://github.com/owncloud/core/pull/26145

I'm closing this issue for now. Please reopen if it happens again with 9.1.1RC3 or final (which comes out in the coming days).

e-alfred commented 7 years ago

@PVince81 The bug still appears from time to time in 9.1.1. Until it happens, I get a message on every system with the desktop client installed that the offending file (for now it is only one) got downloaded every minute without an error (even though the file didn't change). If the error happens, the desktop clients show it as a failed sync.

{"reqId":"V-dIfn8AAQEAADRYU0oAAAAP","remoteAddr":"1.1.1.1","app":"webdav","message":"Exception: {\"Message\":\"HTTP\\\/1.1 503 Storage not available\",\"Exception\":\"Sabre\\\\DAV\\\\Exception\\\\ServiceUnavailable\",\"Code\":0,\"Trace\":\"#0 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/DAV\\\/CorePlugin.php(79): OCA\\\\DAV\\\\Connector\\\\Sabre\\\\ObjectTree->getNodeForPath('myfile.ext', 0)\\n#1 [internal function]: Sabre\\\\DAV\\\\CorePlugin->httpGet(Object(Sabre\\\\HTTP\\\\Request), Object(Sabre\\\\HTTP\\\\Response))\\n#2 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/sabre\\\/event\\\/lib\\\/EventEmitterTrait.php(105): call_user_func_array(Array, Array)\\n#3 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/DAV\\\/Server.php(459): Sabre\\\\Event\\\\EventEmitter->emit('method:GET', Array)\\n#4 \\\/var\\\/www\\\/owncloud\\\/3rdparty\\\/sabre\\\/dav\\\/lib\\\/DAV\\\/Server.php(248): Sabre\\\\DAV\\\\Server->invokeMethod(Object(Sabre\\\\HTTP\\\\Request), Object(Sabre\\\\HTTP\\\\Response))\\n#5 \\\/var\\\/www\\\/owncloud\\\/apps\\\/dav\\\/appinfo\\\/v1\\\/webdav.php(56): Sabre\\\\DAV\\\\Server->exec()\\n#6 \\\/var\\\/www\\\/owncloud\\\/remote.php(164): require_once('\\\/var\\\/www\\\/ownclo...')\\n#7 {main}\",\"File\":\"\\\/var\\\/www\\\/owncloud\\\/apps\\\/dav\\\/lib\\\/Connector\\\/Sabre\\\/ObjectTree.php\",\"Line\":159,\"User\":\"alfred\"}","level":4,"time":"2016-10-07T09:02:22+02:00","method":"GET","url":"\/owncloud\/remote.php\/webdav\/myfile.ext","user":"alfred"}

Is it really necessary to fill up the logs/notifications on every client that the shared file got updated (even though it wasn't even changed)? Also, if there is a minimal outage (seconds to minutes) showing a full blown error 503 on every client with a red "X" is a bit overblown as well.

lock[bot] commented 5 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

owncloud / core

9.1 stable Federated share not longer available (storage not available) #25854