nextcloud / desktop

💻 Desktop sync client for Nextcloud
https://nextcloud.com/install/#install-clients
GNU General Public License v2.0
2.9k stars 772 forks source link

Syncing symbolic links as reference #250

Open EorlBruder opened 6 years ago

EorlBruder commented 6 years ago

Currently when I add a symlink inside my Nextcloud folder it doesn't sync because "Symbolic links are not supported in syncing". I am trying to use Nextcloud as a home for my data, to have it synced between my PCs. Now I'd also like to be able to access my git-repositories (mostly) code via the directory-structure in Nextcloud but I don't want to sync the git-repositories via Nextcloud. This is why it would be really nice for me to be able syncing only the link to the repository which is on a fixed position on any system. eg: ~/Nextcloud/current_projects/fizzbuzz would point to ~/repositories/fizzbuzz. If I happen to be on a system where this git-repository isn't present I can just clone the repository there. This option could be optional (like the syncing of hidden files).

There's also a issue about this for the owncloud-client here: https://github.com/owncloud/client/issues/1440

taminob commented 8 months ago

Good news, I am preparing a developement environemnt to test your changes @taminob .

Thanks, although currently there isn't too much to test yet. :) But will definitely be helpful if there is a first "working" version.

Regarding the link's representation in the server, it could be a real link, or maybe a virtual file (only metatada) in the db.

So far, I only considered "normal" files and actual symlinks - using virtual files with additional meta data could be a pretty good idea to solve a couple of issues. Thanks for the idea. :)

In my opinion, the MVP should already be secure, although usability doesn't have to be perfect. Thus, either preventing symlinks leading to the outside of the user's data home or ignoring them in the Web UI could be a valid approach for the MVP.

f1d094 commented 8 months ago

Good news, I am preparing a developement environemnt to test your changes @taminob .

Hooray!

Yes this issue #250 is for the client to sync symbolic links as a reference. (And I agree that the client dereferencing should not be implemented (see comments in the relevant issue)).

Hooray!

Regarding the sanitization of link's target. There might be usability issues (and we should carefully consider security related ones, remember that nextcloud supports sharing, federation, external storages and is in general a complex system). Yes for the MVP we could skip this. But we should vote for a secure and user friendly final implementation after all. The issues:

* links in folders that do not exist in different clients
* absolute links that do not make sense in different oses or do not exist in different clients

It seems to me that the simplest way to handle these would be to sync all to server and sync none to clients that do not support them. So, Windows .lnk files only go to/from windows client, Unix symlinks to/from posix clients (OSX, Linux, etc) and so on. This avoids complexity. It goes under the category of "not your problem". It isn't the responsibility of the Nextcloud team to solve longstanding multi-platform integration issues that have plagued the world since Windows first connected to a unix system. ;)

* We don't break all possible nextcloud deplyments

Agreed 100%. Stability should always be job #1. That said, the current configuration does interfere with the standard operation of *nix environments by dereferencing symlinks...I am very surprised that more users aren't up in arms about this.

* Any security issues (either in client or in server side)

I would argue that client-side, this isn't really the responsibility of the Nextcloud team to address beyond the function of the Nextcloud application. If I want to shoot my own foot off, please let me...but as someone who does systems penetration testing professionally, I honestly can't think of any malicious use case on the client side as long as the nextcloud client doesn't dereference/follow the symlinks.

Server-side, I would recommend a minor change which I've made: Instead of recommending "+FollowSymLinks" for apache2.conf, change to "+SymLinksIfOwnerMatch". Correctly deployed environments should have a discrete user assigned for the webserver process and this small change should prevent any links being able to reach outside the webroot. This doesn't prevent malicious users from pointing to nextcloud's operational files and other edge cases however, which may be a consideration.

As I am thinking it, if we put and opt-in (default off) switch for the desktop client (per client instance), to sync links (even with out sanitizing the link's target), this could be acceptable as a power user feature, which will not affect the majority of users. Also this would not affect existing configurations during the client upgrade for the version supporting symlinks.

I would be so happy I would do kartwheels. A great way to roll out for MVP

Regarding the server, and more specifically for the web UI a secure representation is to present the link as a special file of type link (and not let the server's OS try to dereference the link and present it to the WEB UI) (say I upload a link afile->/etc/passwd, when I open the file in the webUI I should not see the file's content). Regarding the link's representation in the server, it could be a real link, or maybe a virtual file (only metatada) in the db. (I am curious what did the the test with the PUT request that @taminob made)

See my note above. SymLinksIfOwnerMatch in the web server config should handle this nicely. The links can then safely point to whatever users want. Maybe I want a link to /etc/passwd as part of some dev project? That can be legit...I would be more wary of /etc/shadow of course...but again: should not be Nextcloud's problem to solve.

As a general note for symbolic links in *X systems (and Junctions (aka mklink) if Windows NTFS filesystems (and not .lnk files) ) are indeed special files from the OS side. This means they are not regular files (like pdf with html links inside, or windows .lnk files). This means that there are special system calls and C's std library functions to deal with them. Actually. the default for a system call open("/path/to/symlink) in a Linux system is to open the target flle (dereference the link). Witch means that we need another call (e.g. readlink) to read the link's target.

Indeed, the system call is different. When opening the file with open(), you may be able to use the flag O_NOFOLLOW. This will tell open() not to follow symlink if there is one. However, you will still need to know if the file you opened was a symlink (but not followed) or not. To do this, use the file descriptor returned from open(). Look into lstat() and check the st_mode field of the struct stat for S_IFLNK.

Many thanks to @taminob and @basos9 for working to finally resolve this languishing issue!

f1d094 commented 8 months ago

Good news, I am preparing a developement environemnt to test your changes @taminob .

Thanks, although currently there isn't too much to test yet. :) But will definitely be helpful if there is a first "working" version.

Regarding the link's representation in the server, it could be a real link, or maybe a virtual file (only metatada) in the db.

So far, I only considered "normal" files and actual symlinks - using virtual files with additional meta data could be a pretty good idea to solve a couple of issues. Thanks for the idea. :)

In my opinion, the MVP should already be secure, although usability doesn't have to be perfect. Thus, either preventing symlinks leading to the outside of the user's data home or ignoring them in the Web UI could be a valid approach for the MVP.

Please don't try and parse symlinks and make any logical determinations from them...a symlink should be allowed to point to anything and be sync'd intact. Protecting resources should be left to the webserver config and the nextcloud system application function. On any given client, there are a million ways to wind up with symlinks pointing to any number of things that 'USER' should not or does not have access to. It is not the responsibility of any software to police that...that is left to ACLs, permissions, etc.

Nextcloud, as an application, should have built-in funcationality to prevent access to any of its internal components and should not be relying on any individual sub-module, plugin, app, etc to be policing this...as I mentioned in the other post I would recommend that the default Nextcloud install use +SymLinksIfOwnerMatch instead of "+FollowSymLinks" for example...this puts enforcement where it belongs.

taminob commented 8 months ago

See my note above. SymLinksIfOwnerMatch in the web server config should handle this nicely. The links can then safely point to whatever users want. Maybe I want a link to /etc/passwd as part of some dev project? That can be legit...I would be more wary of /etc/shadow of course...but again: should not be Nextcloud's problem to solve.

While I agree that it will be very helpful to be able to have symlinks to outside of the data directory, I don't think it'll be that easy. For the MVP, we can hide behind the 'localstorage.allowsymlinks' => false option - which already states that it is a security risk to enable it. However, for a production multi-user system that will be an issue because you can peek into other user's home directories and e.g. into config/config.php which contains database passwords etc. (all the same OS user). I don't really think, dereferencing makes sense on the server, but I have to find every place where this might happen or it introduces a security bug. That's the reason why the proposed solution via virtual files sounds kind of neat.

Indeed, the system call is different. When opening the file with open(), you may be able to use the flag O_NOFOLLOW. This will tell open() not to follow symlink if there is one. However, you will still need to know if the file you opened was a symlink (but not followed) or not. To do this, use the file descriptor returned from open(). Look into lstat() and check the st_mode field of the struct stat for S_IFLNK.

I don't really have any trouble with the desktop client - >80% of the time spent on this issue so far have been on the server side because these are basically the first PHP lines I've ever been in touch with. :)

Please don't try and parse symlinks and make any logical determinations from them...a symlink should be allowed to point to anything and be sync'd intact. Protecting resources should be left to the webserver config and the nextcloud system application function. On any given client, there are a million ways to wind up with symlinks pointing to any number of things that 'USER' should not or does not have access to. It is not the responsibility of any software to police that...that is left to ACLs, permissions, etc.

Nextcloud, as an application, should have built-in funcationality to prevent access to any of its internal components and should not be relying on any individual sub-module, plugin, app, etc to be policing this...as I mentioned in the other post I would recommend that the default Nextcloud install use +SymLinksIfOwnerMatch instead of "+FollowSymLinks" for example...this puts enforcement where it belongs.

No worries, I don't try to limit anything - as long as it won't hurt security (although we might be able to hide for the initial implementation behind that option mentioned above). If we just never dereference them on the server, there will also no issue. I already synced many symlinks with newlines, quotes and other weird stuff in them in my testing. :)

basos9 commented 7 months ago

Hello,

I would argue that client-side, this isn't really the responsibility of the Nextcloud team to address beyond the function of the Nextcloud application. If I want to shoot my own foot off, please let me...but as someone who does systems penetration testing professionally, I honestly can't think of any malicious use case on the client side as long as the nextcloud client doesn't dereference/follow the symlinks.

This seems logical. Again with opt in (default off) behavior. Since the security issues might be from bad user interaction or bad software interaction. (somebody I think had said something relavant) But here is an imaginary one. I have a nextcloud synced folder that has music files, and then I have a web application for a media server service that same directory. Then I accept user shares. And Somebody shares with me. a link with /etc/passwd (or shadow or whatever). That means that the shared file appears on my sync root. Then the media server software has a bug that allows anybody to visit www.domain.com/player/getfile/passwd and serve that file. And such

So again thinking out loud, even if we finally do not sanitize symlink targets to be on the sync root, It might make sense for Sharing symlinks to be disabled, or introduce an option to enable them. Anyway.

That's the reason why the proposed solution via virtual files sounds kind of neat.

Yes, actually there is no need and could indeed introduce bugs, for symbolic links to be stored as such server side. Remember that each file served had a db record for it's metadata. The requirement is to sync sym links between clients. In the server we do not want this per se.

. If we just never dereference them on the server, there will also no issue. I already synced many symlinks with newlines, quotes and other weird stuff in them in my testing. :)

Yes, If !

But not ! localstorage.allowsymlinks should not be involved here. As it seems this is an advanced admin setting for the server. It enabled a server admin to put parts of a directory of the data dir point to other places. BUT the server code should not be able to create (real) symlinks. Only the server admin (e.g. via cmd, ln -s). This leads to storing client symlinks as virtual files (or special files with appropriate meta data).

I managed to test the above scenario with the master branch with the docker development env with modifications

localstorage.allowsymlinks' => true

Then I created two links inside admin's shared folder

$ ls -l workspace/server-data/admin/files/
total 8620
-rw-r--r-- 1 user user 8822513 Νοε   9 13:02  Nextcloud_Server_Administration_Manual.pdf
lrwxrwxrwx 1 user user      11 Νοε   9 13:08  pas -> /etc/passwd
lrwxrwxrwx 1 user user      33 Νοε   9 13:08  testlink -> /nonexistent

as you see pas is linking to /etc/passwd ON the server and testlink to a non existent file.

Then

runned

occ files:scan --all

and then from the web file pas appeared

and on the local file system running client

cat  workspace/desktop-test/data/pas 
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin

Not good.

taminob commented 7 months ago

But not ! localstorage.allowsymlinks should not be involved here. As it seems this is an advanced admin setting for the server. It enabled a server admin to put parts of a directory of the data dir point to other places. BUT the server code should not be able to create (real) symlinks. Only the server admin (e.g. via cmd, ln -s). This leads to storing client symlinks as virtual files (or special files with appropriate meta data).

Fair point, could make sense to add another setting if necessary and not re-use that one.

During development, I was also able to get the content of system files via symlinks in Nextcloud. However, that issue would be already resolved as soon as the server will not actually follow symlinks anymore (although e.g. 3rd party apps could still dereference them and thus compromise the file). Currently, I try to continue with actual symlinks on the server and get all the necessary steps to work and find the correct files that need to be modified (upload, download, server query).

Once that (at least kind of) works, I'll look into improving it by using metadata-only files ("virtual files" means something different in Nextcloud context I think: https://nextcloud.com/blog/nextcloud-desktop-client-3-2-with-status-feature-and-virtual-files-available-now/).

basos9 commented 7 months ago

Hello, I am commenting for the final solution's specs, please don't stop your good work at what you're in. I've checked your commits and it seems that you've taken it seriously!.

Fair point, could make sense to add another setting if necessary and not re-use that one.

I think this is not a good approach. Since there is functionality that relies upon server's data folder having symlinks (via the localstorage.allowsymlinks) we should not allow server code to create sym links.

And in general, there is not a functional requirement to actually store a symlink inside the server's data dir.

Once that (at least kind of) works, I'll look into improving it by using metadata-only files ("virtual files" means something different in Nextcloud context I think: https://nextcloud.com/blog/nextcloud-desktop-client-3-2-with-status-feature-and-virtual-files-available-now/).

Yes, let's then do not use this term. What I mean is to create meta files that would be interpreted as symlinks. The best would be for these files to not exist at all in server's data dir, rather they are somehow stored in the database. Or if this breaks the logic where every db entry (I think in oc_filecache) should have a real file (for .e.g occ:scan to work ) we could allow for empty files to be created in server's sync root for every client sym link.

taminob commented 7 months ago

Just wanted to give you all a short update because I made some progress after experimenting with different server representations. In the end, I settled on a regular file in the filesystem containing the symlink target. Additionally, to determine if a file is a symlink, I added a new table to the database (oc_symlinks). Using the existing oc_filecache to store the type as e.g. mimetype has a couple of disadvantages like that occ files:scan --all will overwrite any manually set mimetype. Also, I would have a pretty bad feeling with storing essential data in something called "cache".

Using this, I successfully synchronized basic symlinks to and from the server (PROPFIND, PUT, POST, GET and DELETE). However, there are still lots of bugs left which I'll have to fix one by one. Like that the symlinks are getting synchronized every time or that the symlinks aren't getting deleted from the database. Writing tests, cleaning up the code and PRs and maybe indicating symlinks in the Web UI are also still on my TODO list.

taminob commented 7 months ago

Next update - I resolved the issues mentioned before (and lots of others as well). Using the modified server and client, uploading/downloading/deleting/renaming of symlinks seems to work.

I mentioned the currently known limitations in the PR for the server (they all only affect the server side), namely:

Additionally, I tested nothing on Windows so far (might not even compile) - most of the implementation would actually also work for .lnk files (so "shortcuts") since Qt handles shortcuts as Windows' symlinks. If that or native NTFS symlinks should be used, might require further discussion since it will be hard to change once it is released with either option.

@basos9 if you're still interested in trying out the changes, I'd appreciate any feedback - I am sure that I missed a lot of bugs since my testing did only cover the most basic cases so far.

f1d094 commented 7 months ago

@taminob: First - thank you for your hard work!

Second: "symlink target as their content" sounds a lot like dereferencing the symlinks. I'm gathering that you mean a file that is a special on the serverside, and has the client-side link name as the filename and the contents server-side are just the text of the path of the actual symlink, or something similar? You may want to clarify in your comments, "symlink target" has specific the connotation of being the actual file that the symlink points to.

Also a quick thought: When saving the symlinks, be sure to use relative vs full-path symlinks as they were created on the source system.

Forgive me if this was clarified previously. I've not had the bandwidth to follow the developments and conversation over the past few weeks.

taminob commented 7 months ago

Second: "symlink target as their content" sounds a lot like dereferencing the symlinks.

@f1d094 thank you for pointing out the ambiguity. "symlink target path" is maybe more accurate, the symlinks are not dereferenced - so e.g. broken symlinks can be synchronized as well. Also, the "raw" symlink is uploaded, so if it is relative, it remains relative and if it's absolute, it remains absolute. The value returned by "readlink" is used.

AkechiShiro commented 2 months ago

What's currently stucking this feature from rolling out ? There are two PRs right now #6205 (client) nextcloud/server#41321 (server side)

AkechiShiro commented 2 months ago

Given this was discussed since 2018, such a feature missing is a bit sad.

AkechiShiro commented 2 months ago

I feel like the client Nextcloud has lacks a lot of polished feature, this does not feel like there is a company behind.

Testing of feature such as file syncing seems to not even be implemented or properly handled. As files do not sync with the latest client from upstream.