nextcloud / android

📱 Nextcloud Android app
https://play.google.com/store/apps/details?id=com.nextcloud.client
GNU General Public License v2.0

Synced files deleted if proxy returns an unexpected response (e.g. 404) when Nextcloud service is offline #12522

Open NeXX451 opened 8 months ago

NeXX451 commented 8 months ago

Steps to reproduce

  1. Set up a VPN connection to the Nextcloud server.
  2. Sync some files.
  3. Kill the VPN connection so that the Nextcloud client loses its connection.
  4. Wait a few hours until programs such as DAVx⁵ start complaining about the lost connection.
  5. The synced files are now deleted.

Expected behaviour

The synced files should not be deleted under any circumstances unless the user explicitly deletes them. The client losing its connection to the server should not result in synced files being deleted.

Actual behaviour

I have asked this question on the help.nextcloud.com forum, so I will just copy and paste.

For whatever reason, when my phone (Samsung S23, Nextcloud Android app, version 3.27.0) loses its connection to my Nextcloud server, it decides that all the synced files are to be deleted. This has happened three times already, and each time I have to re-download (resync) everything, so you can imagine this is supremely frustrating.

I access my Nextcloud server via a VPN connection, so if the VPN server dies → the Nextcloud connection dies. I woke up today to a bunch of DAVx⁵ notifications that all read along the lines of “cannot sync”, “no connection”, etc. I promptly restarted the server. If there are any app logs I can share with you, just tell me where they are and I shall share them.

The app does not start syncing the files again on its own, which is strange in its own way. The files are completely gone from the phone, and I have to go into the folder and click on the three dots → sync.

I have also noticed that this happens on my tablet.

Android version

14

Device brand and model

Samsung S23, Samsung Galaxy Tab S4

Stock or custom OS?

Stock

Nextcloud android app version

3.27.0

Nextcloud server version

28.0.1

Using a reverse proxy?

Yes

Android logs

No response

Server error logs

No response

Additional information

I have not found any logs on my phone unfortunately. The folder /data/ does not exist on my phone.

joshtrichards commented 7 months ago

I assume green check marks on the file(s) remain and it's purely the on-disk files disappearing?

Can you trigger the same condition just by putting your device in offline/airplane mode?

What path are you using? The one without tmp in it? Or this is through the documents provider maybe?

If the green checkmark is disappearing, it would be interesting to see logs from the client app to see why it is "unsyncing" mysteriously.

There are a couple ways of getting logs...

The easiest may be to install the Dev edition of the app since it can be installed alongside your regular edition. It'll use its own storage folder so as long as you don't configure an overlapping auto-upload folder, there should be no conflict. In the Dev app you can access the logs directly within the app via Settings->Logs. The Dev app can be installed via F-Droid. You can even use it with a separate test account on your Nextcloud instance if you wish to further isolate any testing.

Alternatively you can get logs from your existing installation by using logcat.

EDIT: Clarified my initial response.

NeXX451 commented 7 months ago

No, even the green check marks disappear.

I'll see what happens in offline/airplane mode and let you know. I suspect in offline-mode nothing will happen, but in airplane-mode it will block the vpn connection and result in the ominous desync.

I haven't messed with the path at all, it's the default Internal storage/Android/media/com.nextcloud.client/nextcloud/user@domain/

I'll also look into the logs with the dev-version.

NeXX451 commented 7 months ago

OK, I have reproduced it. It doesn't happen in either offline or airplane mode, as I suspect the app takes notice and doesn't do anything. So what I have done is actually stop the Nextcloud server altogether, which should be equivalent to the VPN losing its connection. I also have the logs.

What I have noticed is that, the moment the server goes offline, the client completely forgets that there ever were some files, and instead of showing me the folders (from cache or something) all it shows is "No files here", which then results in deletion.

Edit: I uploaded logs from main account.. I will upload test account logs in a bit.

NeXX451 commented 7 months ago

The logs: logs.txt

joshtrichards commented 7 months ago

Thanks for the logs. I'll take a look. What type of VPN connection is this, btw?

NeXX451 commented 7 months ago

No problem. As I have said it's not related to the vpn connection, since the deletion occurs when I turn the nextcloud server off (connected from local network to the server, so vpn is not needed). It's OpenVPN

NeXX451 commented 7 months ago

OK, I can narrow it down further. Today I had to turn off the whole server and nothing was deleted. So if a connection to the server exists but the Nextcloud instance is turned off, it desyncs, since the client connects to the server but can't contact the instance.

borisdigital commented 7 months ago

Looks like I'm having the same issue with the Android app. I'm not using a VPN, but from time to time the Nextcloud server goes down (mostly because of DIY layer 8 Docker auto-update problems). This always seems to delete all synced files on the Android device. I normally notice the absence of the files when I try to play one of my m3u lists in VLC...

joshtrichards commented 7 months ago

OK, I can narrow it down further. Today I had to turn off the whole server and nothing was deleted. So if a connection to the server exists but the Nextcloud instance is turned off, it desyncs, since the client connects to the server but can't contact the instance.

How are you "turning off Nextcloud" when you trigger this condition? And what reverse proxy are you using?

Based on your logs, it appears that whatever is still answering HTTPS on the other end of the connection starts happily returning 404s (file not found) to our checks for the existence of files, etc. In other words, whatever is still responding server-side behaves as if it were the actual server we're trying to talk to, instead of timing out or answering with a 502 Bad Gateway or similar, as I'd expect a proxy to do. A proxy typically won't return 404s if the target service is offline, so something weird is going on here.

Can you provide the output of curl -I https://<your_domain> while Nextcloud is offline (when things are not working)?
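
For anyone repeating this check, a minimal sketch of a helper that prints only the status code may be handy. The function name probe_status is made up for this example; the curl options used (--silent, --head, --output, --write-out) are standard curl options. Extra options such as -k for a self-signed certificate can be passed after the URL.

```shell
# Print only the HTTP status code returned for a HEAD request.
# Usage: probe_status URL [extra curl options...]
# probe_status is a hypothetical helper name, not part of any tool.
probe_status() {
  url=$1
  shift
  # --head sends a HEAD request, --output /dev/null discards the headers,
  # --write-out prints the numeric status code (000 if no response arrived).
  curl --silent --head --output /dev/null --write-out '%{http_code}\n' "$@" "$url"
}
```

Called as probe_status https://<your_domain> -k while the Nextcloud container is stopped, this shows at a glance whether the proxy answers with a 404 or with a 5xx code.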

NeXX451 commented 7 months ago

Everything is running in Docker. I'm using Traefik as the reverse proxy. I also have a Pi-hole instance, where I have a custom DNS record pointing to the Nextcloud server. Since I can't be bothered to actually expose anything, I run everything locally, but I still decided to use self-signed certificates, so Traefik uses a self-signed certificate for the custom DNS record.

Turning off Nextcloud boils down to shutting down the container, but since Traefik is still up, it will return 404 (my guess). At this point I am not sure if that's intended behaviour or not... I mean, if it isn't, then that means I screwed up my config somewhere along the line, but would that even matter?

Since the certificate for https is self-signed I ran curl with -k

curl -Ik https://nc.vpn
HTTP/2 404 
content-type: text/plain; charset=utf-8
x-content-type-options: nosniff
content-length: 19
date: Tue, 05 Mar 2024 17:36:34 GMT

NeXX451 commented 7 months ago

Actually I can also post the config, maybe that helps..

Traefik:

version: '3'

services:
  traefik:
    image: traefik
    command:
      - "--log.level=DEBUG"
      - "--entrypoints.websecure.address=:443"
      - "--providers.docker=true"
      - "--providers.file.directory=/certs/config/"
      - "--providers.file.watch=true"
    container_name: traefik
    restart: always
    security_opt:
      - no-new-privileges:true
    networks:
      - proxy
    ports:
      - 80:80
      - 443:443
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro

      - /mnt/NC/self_sign/compose/certs/:/certs/
      - /mnt/NC/self_sign/compose/certs/config/:/certs/config/

      - ./data/logs:/var/log/traefik
    labels:
      - "traefik.http.middlewares.sslheader.headers.customrequestheaders.X-Forwarded-Proto=https"

networks:
  proxy:
    external: true

and in the nextcloud docker-compose:

    labels:

      - "traefik.enable=true"
      - "traefik.http.routers.nextcloud.rule=Host(`nc.vpn`)"
      - "traefik.http.routers.nextcloud.entrypoints=websecure"
      - "traefik.http.routers.nextcloud.tls=true"

      - "traefik.docker.network=proxy"

borisdigital commented 7 months ago

I'm running exactly the same webserver/proxy config (Traefik + Nextcloud in Docker, but without a VPN). When the actual backend (the Nextcloud container) goes down, Traefik reports a 404 (this is correct behaviour on the Traefik side). But whatever makes the Nextcloud app eat all synced data on the client when a 404 is returned is a huge PITA and cannot be correct.

joshtrichards commented 7 months ago

Turning off Nextcloud boils down to shutting down the container, but since Traefik is still up, it will return 404 (my guess). At this point I am not sure if that's intended behaviour or not... I mean, if it isn't, then that means I screwed up my config somewhere along the line, but would that even matter?

I'm running exactly the same webserver-proxy-config (traefik + nextcloud in docker, but w/o vpn). When the actual backend (nextcloud-docker) goes down, traefik reports a 404 (this is correct behaviour on traefik side).

A 404 is definitely not correct nor standard behavior for a reverse proxy when the back-end is merely unreachable/offline. It's too permanent of a response code.

Even if we weren't talking about Nextcloud, handling traffic this way on a typical web site would result in its search appearances getting destroyed, because a 404 is a permanent failure. More typical is a 502/503/504.

Perhaps this is only happening due to the dynamic configuration done via Compose. I'm not particularly familiar with Traefik, but I can see how it might not see this as an issue with the back-end, but with the front-end. What's really going on here is that the proxy config for Nextcloud (i.e. the front-end config) completely disappears from the environment when you take that container offline. In that case the 404 is understandable: the proxy rightfully considers that URL (the front-end) not simply offline, but non-existent.

However, that's still not reasonable behavior in a production environment and it's only happening as a byproduct of this particular approach to configuring Traefik. Traefik needs to be reconfigured to not do this.

https://doc.traefik.io/traefik/getting-started/faq/

But whatever makes the Nextcloud app eat all synced data on the client when a 404 is returned is a huge PITA and cannot be correct.

If we send along our auth info to the URL (one we've been configured to trust is your Nextcloud Server) and send a GET or PROPFIND to a WebDAV URL on it - and the remote accepts our credentials and sends us a 404 - that's literally the definition of how WebDAV works (and in this case it indicates the remote file doesn't exist).
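
The distinction being drawn here, between a 404 that WebDAV treats as "the resource does not exist" and a 502/503/504 that signals an outage, can be sketched as a small classifier. This is only an illustration of the argument, not the Android app's actual logic; the function name classify_status and the category labels are invented for this example.

```shell
# Map an HTTP status code to what a sync client could reasonably infer from it.
# classify_status and its labels are hypothetical, for illustration only.
classify_status() {
  case "$1" in
    2??)         echo "ok" ;;           # request succeeded
    404|410)     echo "not_found" ;;    # server says the resource is not there
    502|503|504) echo "unavailable" ;;  # gateway/back-end problem, likely temporary
    000)         echo "unreachable" ;;  # curl's code when no HTTP response arrived
    *)           echo "other" ;;
  esac
}
```

With the Traefik setup described in this thread, a stopped Nextcloud container lands in the not_found branch rather than the unavailable one, which is why the client concludes the remote files are gone.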

NeXX451 commented 7 months ago

I see, so in your opinion I should reconfigure Traefik such that it returns 503 instead of 404 when the server's down?

borisdigital commented 7 months ago

Well, you have a point here, but I don't think a 404 is a permanent error (see https://datatracker.ietf.org/doc/html/rfc7231#section-6.5.4, referenced by the Traefik docs here: https://doc.traefik.io/traefik/getting-started/faq/#404-not-found).

Btw, it seems like Traefik can be told to send another HTTP status code (see https://doc.traefik.io/traefik/getting-started/faq/#xxx-instead-of-404), but I still do not like the Android app deleting my files (the desktop app seems to keep all files). Maybe this could be a way around this problem…

borisdigital commented 7 months ago

I see, so in your opinion I should reconfigure Traefik such that it returns 503 instead of 404 when the server's down?

Did you try out https://doc.traefik.io/traefik/getting-started/faq/#xxx-instead-of-404?

NeXX451 commented 7 months ago

I see, so in your opinion I should reconfigure Traefik such that it returns 503 instead of 404 when the server's down?

Did you try out https://doc.traefik.io/traefik/getting-started/faq/#xxx-instead-of-404?

No, not yet. I don't really have the time right now to muck about with Traefik. Plus, I feel that the Android app deleting files should not be considered intended behaviour, since, as you have mentioned, the desktop app doesn't seem to be affected at all.

joshtrichards commented 7 months ago

Did you try out https://doc.traefik.io/traefik/getting-started/faq/#xxx-instead-of-404?

It may not be necessary to make such a drastic change globally.

The following sounds a bit like what may be needed more directly:

https://doc.traefik.io/traefik/providers/docker/#allowemptyservices

joshtrichards commented 7 months ago

i still do not like the android app deleting my files (the desktop app seems to keep all files). Maybe this could be a way around this problem…

A quick glance suggests the desktop client has made various incremental adjustments to try to accommodate a wider variety of bogus server-side responses in some spots. Some of the earlier ways of dealing with it look like just spitting out an error (and that's still the case at times).

I'm not saying we can't be more robust in what we accept. I just don't want to say we're going to do anything without looking closer at it. This requires a lot of delicate case-by-case logic analysis to assess the potential side effects.

That introduces more complexity... and its own risks. So I'd still suggest adjusting the proxy behavior for the time being. :-)
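
To make the trade-off concrete, here is one hypothetical shape such extra robustness could take: only believe a 404 for a file if the server's own status.php endpoint (which a running Nextcloud instance serves) answers with 200 at the same time. This is purely a sketch for discussion and not something the app is known to do; the function name and the two-argument interface are assumptions.

```shell
# Return success (0) only if a 404 for a file should be believed.
# $1 = HTTP status of the file request
# $2 = HTTP status of a request to <base_url>/status.php on the same server
# should_treat_as_deleted is a hypothetical name used for illustration.
should_treat_as_deleted() {
  file_status=$1
  server_status=$2
  # A 404 only counts as "deleted remotely" when Nextcloud itself is answering.
  [ "$file_status" = "404" ] && [ "$server_status" = "200" ]
}
```

In the Traefik scenario from this thread, status.php would presumably return 404 as well while the container is stopped, so the check would fail and nothing would be deleted. The cost is an extra request and one more code path whose side effects need the case-by-case review mentioned above.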

NeXX451 commented 7 months ago

I see, so in your opinion I should reconfigure Traefik such that it returns 503 instead of 404 when the server's down?

Did you try out https://doc.traefik.io/traefik/getting-started/faq/#xxx-instead-of-404?

Can you try this out https://doc.traefik.io/traefik/providers/docker/#allowemptyservices ? If you manage to get it to work let me know please. I cursorily tried it out, but curl still returns 404 instead of 503..

borisdigital commented 7 months ago

I see, so in your opinion I should reconfigure Traefik such that it returns 503 instead of 404 when the server's down?

Did you try out https://doc.traefik.io/traefik/getting-started/faq/#xxx-instead-of-404?

Can you try this out https://doc.traefik.io/traefik/providers/docker/#allowemptyservices ? If you manage to get it to work let me know please. I cursorily tried it out, but curl still returns 404 instead of 503..

Same here, it doesn't work. When I stop the Nextcloud container, the router is removed, Traefik forgets about it and reports a 404. I'm not sure if I'm using the option correctly (I added it to traefik.yaml).

OTOH, this workaround seems to be OK; at least it returns an HTTP status code 503 instead of 404 when the container goes down (see https://doc.traefik.io/traefik/getting-started/faq/#xxx-instead-of-404). Since Nextcloud is the only app on this VM, it's OK for me to change this globally.
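
For reference, a rough sketch of that FAQ workaround, written as a shell helper that drops a dynamic-configuration file into a directory watched by Traefik's file provider. The helper name write_catchall_503 and the file name catchall-503.yml are made up for this example. The websecure entry point is taken from the compose file posted earlier, and the tls: {} line is an assumption so that the catch-all also matches HTTPS requests. The router and service layout follows the catch-all pattern described in the Traefik FAQ, so check it against the docs for your Traefik version.

```shell
# Write a catch-all router with the lowest priority, backed by a service that
# has no servers, so requests no other router matches get a 503, not a 404.
# $1 = directory watched by Traefik's file provider (assumed to exist).
# write_catchall_503 and catchall-503.yml are hypothetical names.
write_catchall_503() {
  config_dir=$1
  cat > "$config_dir/catchall-503.yml" <<'EOF'
http:
  routers:
    catchall:
      entryPoints:
        - websecure        # entry point name from the compose file in this thread
      rule: "PathPrefix(`/`)"
      service: unavailable
      priority: 1          # lowest priority: used only when nothing else matches
      tls: {}              # assumption: lets the router match HTTPS requests
  services:
    unavailable:
      loadBalancer:
        servers: {}        # no servers, so Traefik answers 503 Service Unavailable
EOF
}
```

With the compose file above, the target would be the host directory mounted at /certs/config/. Note that this changes the response for every unmatched host on that entry point, which is the global effect mentioned in this comment.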