Open pentix opened 4 years ago
Probably relevant to this:
A GET timeout should possibly only be considered a SoftError?
Would it be required to set propagator()->_anotherSyncNeeded = true
in such cases?
I'd say going to SoftError would be dangerous (it might be assumed down the line that the file is in a "better" state than it is). Maybe reclassifying as a NormalError would be doable but that'd need to be heavily tested.
Obviously I'd rather we find a proper fix to avoid those timeouts in the first place. They're supposed to be transient situation that's why the FatalError is a safe bet: when that happens it's likely everything else asked to the server will fail as well with further timeouts. Thus continuing the sync is generally not going to help get the server load down for instance.
If that's something happening often, it's better to fix the root cause which can be:
For the last one there's obviously something at play around E2EE right now, spotted in 3.0.2, tried to workaround it for 3.0.3. I'm in discussion with the E2EE app maintainer to try to move this forward.
Obviously I'd rather we find a proper fix to avoid those timeouts in the first place.
Absolutely! I know the connectivity is not the best and I'm not blaming NextCloud for it, there is actually a different bunch of factors to this.
Thus continuing the sync is generally not going to help get the server load down for instance.
I agree constant retrying might not be a good approach, however I think a retry every few minutes would not hurt. Going to the FatalError state and waiting for a person to manually restart sync is not really reducing the workload either :grimacing:
I assume if the status is set to NormalError, the normal sync process will still be triggered? (i.e. 30s later?)
I'm in discussion with the E2EE app maintainer to try to move this forward.
Thanks a lot!
I assume if the status is set to NormalError, the normal sync process will still be triggered? (i.e. 30s later?)
Got to admit I'm not 100% sure but that'd be my expectation though. It might ignore the file for a little while as well (it's generally not long though, for the next sync or so).
Thank you.
I'd love to try this (i.e. insert some debug logs for me to understand the error handling and classify such errors as NormalError) but I was struggling setting up the entire Windows Build Environment. In case that would work out eventually, I'll report my insights back to you.
I've been working on a check list for windows builds. Paint is still fresh that's why I didn't commit it yet. If you want to try that alternative route and let me know how this goes for you here it is (in your context the dokany step is optional):
We're assuming the following to be installed before this procedure starts:
Steps:
Thank you! I'll give this a shot as soon as I find some time and a clean Windows VM.
(When looking for the build process used I saw the appveyor.yml
files but I guess they are outdated now, at least non-functional at the moment?)
Yes... I'd love to see someone revive the appveyor so that we got Windows coverage. Didn't find time for it yet.
So far things went smooth, but what root are you referring to?
* Open a new admin cmd then: * setx PATH "%PATH%;path\to\root"
So far things went smooth, but what root are you referring to?
The root folder in which zlib got installed. See a few steps above. Not necessarily nicely named... I'll change it.
Ah whoops, I totally missed this when copying the other instructions.
I assume we add the bin
folder inside the root
folder to the PATH, not the root folder itself?
Edit: Wrong assumption, it's only used later on to find zlib development files, I figured
Any updates on this ? It's kinda bad if the client can't recover from connection failure which can happen through updates, outages etc and people may not notice or have enough experience to handle it. I've had this multiple times now.
Also the task icon stays on error, it's also displayed in the main app. But the menu you can get when clicking on the connection with recent file transfers says that everything is ok.
This bug report did not receive an update in the last 4 weeks. Please take a look again and update the issue with new details, otherwise the issue will be automatically closed in 2 weeks. Thank you!
The issue is the same as in 2020. Same problems. Can we mark this as non-resolved or do I need a bot now to counter this bot ?
Any updates on this ? It's kinda bad if the client can't recover from connection failure which can happen through updates, outages etc and people may not notice or have enough experience to handle it. I've had this multiple times now.
Same! In the meantime, I've identified the error in the network that leads to those sudden disconnects. As I stated in the beginning, NextCloud is not the cause. It seems to reconnect sometimes, for example if the network just goes out, it reconnects later on. If, however, the requests just time out, or there's a temporary certificate error (due to e.g. SSL reverse proxy being down) it would just stop and not retry at all. I modified the nextcloud-client, going the "most aggressive" way, i.e. setting the error to SoftError
and setting anotherSyncRequired = true
, whenever there's an HTTP error. Sadly, the above mentioned errors persist :(
Yeah the problem ist that if you get spurious network failures the client will just stop working, even though it may be only 1-2 minutes and working on the rest of the day. But people will then be disconnected until somebody tells them to restart their client.
@0xpr03 Exactly this!
Maybe @er-vin could check it out? Do you have a guess in what state the client might hang, once it received the QNetworkReply::OperationCanceledError
?
Maybe @er-vin could check it out? Do you have a guess in what state the client might hang, once it received the
QNetworkReply::OperationCanceledError
?
Unfortunately not... the problem with this kind of seemingly random events is that they're hard to reproduce at all.
Maybe @er-vin could check it out? Do you have a guess in what state the client might hang, once it received the
QNetworkReply::OperationCanceledError
?Unfortunately not... the problem with this kind of seemingly random events is that they're hard to reproduce at all.
Setup nextcloud, connect from the desktop client, kill the nextcloud VM, restart after ~5 minutes and watch it burn.
This bug report did not receive an update in the last 4 weeks. Please take a look again and update the issue with new details, otherwise the issue will be automatically closed in 2 weeks. Thank you!
Still active
This bug report did not receive an update in the last 4 weeks. Please take a look again and update the issue with new details, otherwise the issue will be automatically closed in 2 weeks. Thank you!
Still active
I can confirm the behavior. For me, the error occurs regularly and exactly when the router (Fritz!Box) closes and reopens the Internet connection before it is forcibly disconnected and re-established by the provider once a day. The Nextcloud client polls the server about every 30 seconds and if during this time the connection is being closed and re-established, the client does not get back on its feet by itself (usually with an SSL handshake error). The only solution is to terminate the client and restart it. The client should at least be fault-tolerant enough to try to reestablish the connection as if the client was restarted.
I have added a small part of the log. If desired I can provide more infos (log files, etc...).
2021-04-29 01:25:59:832 [ warning nextcloud.sync.networkjob ]: SslHandshakeFailedError: "Im Ablauf des SSL-Protokolls ist ein Fehler aufgetreten." : can be caused by a webserver wanting SSL client certificates 2021-04-29 01:25:59:832 [ warning nextcloud.sync.networkjob ]: QNetworkReply::SslHandshakeFailedError "Im Ablauf des SSL-Protokolls ist ein Fehler aufgetreten." QVariant(Invalid) 2021-04-29 01:25:59:832 [ warning nextcloud.sync.credentials.webflow ]: QNetworkReply::SslHandshakeFailedError 2021-04-29 01:25:59:832 [ warning nextcloud.sync.credentials.webflow ]: "Im Ablauf des SSL-Protokolls ist ein Fehler aufgetreten." 2021-04-29 01:25:59:832 [ info nextcloud.sync.networkjob.propfind ]: PROPFIND of QUrl("https://xxxxxxxxx.yyyyyyyyy.myfritz.net/remote.php/dav/files/XXXX/") FINISHED WITH STATUS "SslHandshakeFailedError Im Ablauf des SSL-Protokolls ist ein Fehler aufgetreten." 2021-04-29 01:25:59:832 [ warning nextcloud.sync.networkjob.propfind ]: not successful, http result code is 0 "" 2021-04-29 01:25:59:832 [ info nextcloud.gui.account.state ]: AccountState connection status change: OCC::ConnectionValidator::Connected -> OCC::ConnectionValidator::SslError 2021-04-29 01:25:59:832 [ info nextcloud.gui.account.state ]: AccountState state change: "Verbunden" -> "Abgemeldet" 2021-04-29 01:25:59:832 [ debug nextcloud.gui.socketapi ] [ OCC::SocketListener::sendMessage ]: Sending SocketAPI message --> "UNREGISTER_PATH:O:\Nextcloud" to QLocalSocket(0x1c55f17a5e0) 2021-04-29 01:25:59:895 [ info nextcloud.gui.folder.manager ]: Account "XXXX@xxxxxxxxx.yyyyyyyyy.myfritz.net" disconnected or paused, terminating or descheduling sync folders 2021-04-29 01:25:59:895 [ debug nextcloud.sync.networkjob ] [ OCC::AbstractNetworkJob::slotFinished ]: Network job OCC::PropfindJob finished for "/" 2021-04-29 01:25:59:895 [ info nextcloud.sync.credentials.webflow ]: request finished 2021-04-29 01:25:59:895 [ warning nextcloud.sync.networkjob ]: SslHandshakeFailedError: "Im Ablauf des SSL-Protokolls ist ein Fehler aufgetreten." : can be caused by a webserver wanting SSL client certificates 2021-04-29 01:25:59:895 [ warning nextcloud.sync.networkjob ]: QNetworkReply::SslHandshakeFailedError "Im Ablauf des SSL-Protokolls ist ein Fehler aufgetreten." QVariant(Invalid) 2021-04-29 01:25:59:895 [ warning nextcloud.sync.credentials.webflow ]: QNetworkReply::SslHandshakeFailedError 2021-04-29 01:25:59:895 [ warning nextcloud.sync.credentials.webflow ]: "Im Ablauf des SSL-Protokolls ist ein Fehler aufgetreten."
and the following never ends...
2021-04-29 01:26:05:151 [ info nextcloud.gui.folder.manager ]: Etag poll timer timeout 2021-04-29 01:26:05:151 [ info nextcloud.gui.folder.manager ]: Folders to sync: 1 2021-04-29 01:26:05:151 [ info nextcloud.gui.folder.manager ]: Number of folders that don't use push notifications: 1 2021-04-29 01:26:05:151 [ info nextcloud.gui.folder.manager ]: Run etag job on folder OCC::Folder(0x1c55ca95730) 2021-04-29 01:26:05:151 [ info nextcloud.gui.folder.manager ]: Can not run etag job: Folder is busy 2021-04-29 01:26:35:144 [ info nextcloud.gui.folder.manager ]: Etag poll timer timeout 2021-04-29 01:26:35:144 [ info nextcloud.gui.folder.manager ]: Folders to sync: 1 2021-04-29 01:26:35:144 [ info nextcloud.gui.folder.manager ]: Number of folders that don't use push notifications: 1 2021-04-29 01:26:35:145 [ info nextcloud.gui.folder.manager ]: Run etag job on folder OCC::Folder(0x1c55ca95730) 2021-04-29 01:26:35:145 [ info nextcloud.gui.folder.manager ]: Can not run etag job: Folder is busy
Same tith if NC server behind VPN and VPN not up some time. 3.2.4 (macOS)
I have a similar problem with the client not recovering after timeout. Here the cause is that my desktop is not connected to the network while logging into X. If the client autostarts before the machine is connected to the network, even if only by a few seconds, it never recovers. Client version is 3.3.4, but it also happened with the previous version. Obviously the actual nextcloud install has no influence.
I have been observing this behavior for some time now and have noticed different scenarios during a planned short interruption of the fritzbox: sometimes the error occurs and the client reconnects, sometimes it doesn't. In both cases it can happen that a window "untrusted certificate" from different certificate issuers (e.g. Apple, Amazon,...) appears, but mine is from letsencrypt. If there are other logging or trace options, this could be very helpful to find out the cause of the problem. At least I could easily provide the client logfiles for each scenario.
Dont want to create another very simular issue... but the problem are not in bad connection to the server (its happens, but its.. okay)..
Windows 10 -11 goes to sleep -> no network connection about couple of seconds -> client dont work until until click on "log in". this is very annoying to do every time the computer wakes up
Please excuse me if this does not provide additional information. Facts I have right now:
The probably naive person in me thinks: Why is it not automatically retrying after like 60 seconds? If there is no technical reason, please point me to the file and I'll take a look as soon as I have time. It shouldn't be so hard to do even though I have no experience with the sync client and it's probably written in a language I haven't used in a while like Java.
This is important to me because my employees are not seeing the files and they are used to dropbox where "everything just works". I don't want them to think that dropbox is better just because of this bug.
@johannesE https://github.com/nextcloud/desktop/issues/2610#issuecomment-722333417 https://github.com/nextcloud/desktop/issues/2610#issuecomment-722231019
Qt is a little bit more to learn.
same problem here, on macos 10.13, tested different client versions from 2.4 to 3.4 that all end up on hold with error messages on the client logs. any idea how to overcome this major blockage?
Same issue on openSUSE Tumbleweed on GNOME and NetworkManager. Same issue occurs when I login and NetworkManager is not yet connected to the net. This last case can be solved using a delay on Nextcloud startup. But the sleep/suspend issue is truely breaking. I use Nextcloud for convenience but also for security by syncing to a regularly back upped location. Now the process requires me to intervene multiple times a day.
This appears to be the same issue as (or rather, the root cause of) issue #4084.
I have 4 systems, 2 Linux Mint 20.4, 2 Windows 10. All of them have issues connecting after resuming from standby. To initiate a reconnect you must change the proxy settings, restart the client completely, or wait a while. It does eventually reconnect for me (most of the time), but others have reported that it never reconnects.
You can see in the logs, it appears to fail for about 3 minutes before it finally connects. In that time, I am able to use the network, including browsing to the Nextcloud web UI and logging in, so the server seems to be functioning. Other NC clients on my other systems are able to sync with the server during this time.
Same here! Wake from standby, client is offline until I restart the client. At least a retry button would be nice.
not about standby, but also when using proxy. for example, when using ssh socks or other to connect to home server, if proxy not available on client start, nextcloud client seems like not trying it again, only restart helps...
Situation
The NextCloud sync client is installed on a Windows client, running mostly 24/7 connecting to a server over an unstable network. Sometimes the connection drops, however NextCloud enters a failed state which requires manual intervention, which is very unhandy, considering the fix is only to close and reopen the sync client.
Logs
The (anonymized) log shows the following timeout error:
Steps to reproduce
This occurs even there are ridiculously high connection timeout values set.
Client configuration
Client version: Client version 3.0.2stable-Win64 (build 20200924) Qt 5.12.9 SSL OpenSSL 1.1.1g 21 Apr 2020
Operating system: Windows Server 2008
OS language: German
Installation path of client: Default path