haiwen / seafile

High performance file syncing and sharing, with also Markdown WYSIWYG editing, Wiki, file label and other knowledge management features.
http://seafile.com/

symlink support for Linux #288

Open makomi opened 11 years ago

makomi commented 11 years ago
seafile server v1.7.0.1
client v1.7
found no log entries related to the issue on my system

Issue: Creating a symlink with "ln -s" creates a symlink on my local machine, but everybody I am sharing the link with gets a copy of the actual file instead of a symlink. I think this is related to issue #130.

Is following symlinks the wanted behavior, i.e. is this a feature and not a bug?

Why I need this feature: In many cases I find it difficult to structure my data sufficiently just by sorting it into folders, because, e.g., it would be easier to browse my data if pictures in one folder were also accessible in another folder.

xarinatan commented 5 years ago

@michaelcadilhac OOF you're making a good point there, I was probably compiling master. rip :skull: Trying again right away... Yep git status shows it's origin/master. D'OH :man_facepalming: .. I should've confirmed the files were actually modified from the start. So I switched branch/checked it out, I recompiled it aaaaand.... it seems to be working! :tada: :cake: In the webinterface/cloud file browser they show up as small files without extension with the name of the target file inside them as text, which is exactly what I needed! I do think this will break symlinks when I'd generate a zip file from the webinterface, but that's such an edge case for me that it's easy enough to keep it in mind to just use the seafile client with a patched daemon for syncing the files.

For anyone else reading along, here's the full version:

Thanks again @michaelcadilhac ! I really hope the seafile crew considers merging this patch really, I think it's pretty solid the way you've solved it with what would seem to be the proper routes to do so, the only drawback is that it doesn't work on Windows in its current state, which might need some additional work, but for now this is a good start IMHO, would probably please a lot of system admins to be able to use Seafile as a true backup system for servers and appliances and what not else runs *nix these days. Oh and that 50 euro bounty is also still open ;) perhaps it'd be fair to give some of that to @michaelcadilhac .. otherwise you could just put a paypal address here for other people to donate to, I'm sure this pleases a lot of sysadmins and linux users besides myself. :bowing_man:

MurzNN commented 4 years ago

@xarinatan can you check and describe how those symlinks are represented in Webdav, web interface and SeaDive mounted folders?

shoeper commented 4 years ago

As files without extension containing the target path.
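
To make that representation concrete, here is a minimal Python sketch of what such a placeholder could contain. It is not the patch's code: the function name `symlink_placeholder` is hypothetical, and it assumes the placeholder holds nothing but the link's stored target path.

```python
import os
import tempfile


def symlink_placeholder(path: str) -> bytes:
    """Return the bytes a placeholder file for the symlink `path` would hold.

    Assumption: the placeholder is just the stored target path, unmodified.
    The actual patch may encode it differently.
    """
    if not os.path.islink(path):
        raise ValueError(f"{path!r} is not a symlink")
    # readlink returns the stored target without resolving or following it,
    # so a relative target stays relative and a dangling link still works.
    return os.fsencode(os.readlink(path))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "photo.jpg"), "wb") as fh:
            fh.write(b"x" * 1024)
        link = os.path.join(root, "photo-link")
        os.symlink("photo.jpg", link)  # same as: ln -s photo.jpg photo-link
        print(symlink_placeholder(link))
```

The placeholder is only as long as the target path, which matches the "small files" described above; the 1 KiB target itself is never read.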

ygramoel commented 4 years ago

FYI: Dropbox announced support for symlinks today. https://help.dropbox.com/installs-integrations/sync-uploads/symlinks

shoeper commented 4 years ago

The post says it is from mid 2019, so not new.

andifunke commented 4 years ago

Just started with Seafile and I'm a bit surprised to learn about this now 7 years old issue. Seafile still seems to follow symlinks in 2020. Is there a TL;DR for the state of symlinks in Seafile 7.1.x with respect to development plans and/or workarounds?

KarlZeilhofer commented 4 years ago

I use this seafile-ignore.txt file in the root of our synced directory:

# see https://www.seafile.com/en/help/ignore/

# Seafile syncs symlinks by syncing all files in
# the target directory.
# This behavior is often undesirable.
# In our case, symlinks to the git repos are often helpful.

# All files and folders whose names begin or end with "dontsync" or "symlink"
# are excluded by the following rules:

*/dontsync*
*/*dontsync
*/dontsync*/
*/*dontsync/

*/symlink*
*/*symlink
*/symlink*/
*/*symlink/

# Git repos should live in their own path anyway, but sometimes they don't.
*/.git

So files starting or ending with symlink will not be synced.

andifunke commented 4 years ago

I don't think I will be able to re-arrange my paths so that all symlinks can follow a common naming scheme. Guess I'll have to add them all manually to the seafile-ignore.txt, which is obviously quite cumbersome.

# Git repos should live in their own path anyway, but sometimes they don't.
*/.git

As a side note: I don't think this line will be effective as folders need to end with a /. Just tested this and as far as I can tell, only */.git/ would work as a pattern for git folders.
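
For anyone who would rather not maintain that list by hand, the following Python sketch generates one ignore line per symlink in a library. It is only a sketch: `symlink_ignore_patterns` is a hypothetical helper, and it assumes that patterns are paths relative to the library root and that directory patterns need the trailing `/` described above. It has not been checked against Seafile's actual pattern matching.

```python
import os
import tempfile
from typing import List


def symlink_ignore_patterns(library_root: str) -> List[str]:
    """Return one ignore pattern per symlink found under `library_root`.

    Assumptions: patterns are relative to the library root, use "/" as the
    separator, and must end in "/" when they refer to a directory.
    """
    patterns = []
    # os.walk does not descend into symlinked directories by default,
    # but it still lists them in `dirnames`.
    for dirpath, dirnames, filenames in os.walk(library_root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                continue
            rel = os.path.relpath(full, library_root).replace(os.sep, "/")
            # isdir follows the link, so links to folders get the slash.
            patterns.append(rel + "/" if os.path.isdir(full) else rel)
    return sorted(patterns)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "repos"))
        os.symlink("repos", os.path.join(root, "current"))
        # Lines to paste into seafile-ignore.txt:
        print("\n".join(symlink_ignore_patterns(root)))
```

The output would have to be regenerated whenever symlinks are added or removed, so this eases the manual work rather than removing it.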

KarlZeilhofer commented 4 years ago

Thanks for that hint!

gmanley commented 4 years ago

I, for one, use symlinks to link folders outside the Seafile folder so that their contents are synced. It lets me sync items while still having them exist outside the Seafile folder. How would I do that without this current symlink behavior? Would hard links work?

Fixing this isn't just a matter of changing the behavior. There needs to be some sort of way to roll this out without just silently breaking the existing behavior as some people do use symlinks for this specific use case. While it may seem like consensus in this thread that everyone just wants the behavior changed, there's probably some selection bias going on.

Dropbox has been around a lot longer than Seafile, and they just made the change about a year ago. I think that goes to show it's not just a no-brainer change and requires some thought on how to migrate to the new behavior. Their document explains how they make a backup copy of the last contents of the symlink and then deselect it from sync.

dirdi commented 4 years ago

@gmanley A file is a pointer to a location on disk. A symlink is a pointer to a file. A hardlink is a pointer to a location on disk, i.e. a file! So yes, hardlinks would work and could most probably replace symlinks for your use case. However, hardlinks are unable to cross filesystem boundaries. In any case, I think using bindfs would be a much more elegant and clean solution for you.

gmanley commented 4 years ago

@dirdi Right, I understand the distinction. But from my understanding, hard links generally don't work with folders, correct? I'll take a look at bindfs, thanks.

dirdi commented 4 years ago

@gmanley you can not create a hardlink to a folder, that is true. But I doubt there is reason for you to hurry finding an alternative solution. This bug is over 7 years old and was the very reason I replaced seafile years ago.
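
The distinction discussed in the last few comments can be checked with a short Python sketch. The helper name `link_facts` is made up for illustration, and the directory result is what Linux is expected to report; other platforms may behave differently.

```python
import os
import tempfile


def link_facts(root: str) -> dict:
    """Create a file, a hard link and a symlink in `root` and compare them."""
    original = os.path.join(root, "original.txt")
    hard = os.path.join(root, "hard.txt")
    soft = os.path.join(root, "soft.txt")
    with open(original, "w") as fh:
        fh.write("data")
    os.link(original, hard)            # a second name for the same inode
    os.symlink("original.txt", soft)   # a separate entry that stores a path

    facts = {
        "hard_shares_inode": os.stat(original).st_ino == os.stat(hard).st_ino,
        "link_count": os.stat(original).st_nlink,
        "hard_is_symlink": os.path.islink(hard),
        "soft_is_symlink": os.path.islink(soft),
    }
    # Try the folder case raised in the thread; the OS is expected to refuse.
    folder = os.path.join(root, "folder")
    os.mkdir(folder)
    try:
        os.link(folder, os.path.join(root, "folder-hard"))
        facts["dir_hardlink_allowed"] = True
    except OSError:
        facts["dir_hardlink_allowed"] = False
    return facts


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for key, value in link_facts(root).items():
            print(f"{key}: {value}")
```

Because a hard link is indistinguishable from the original name, a sync client sees it as an ordinary file, which is why it would keep syncing even if symlinks stopped being followed.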

dreua commented 4 years ago

@gmanley You can just use symlinks in the reverse direction. Put the files in Seafile and link to files/folders from wherever you need them.
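
A minimal Python sketch of that reverse-direction approach follows. `relocate_into_library` is a hypothetical helper rather than anything Seafile provides, and it assumes both paths are on a filesystem that supports symlinks.

```python
import os
import shutil
import tempfile


def relocate_into_library(source: str, library_dir: str) -> str:
    """Move `source` into `library_dir` and leave a symlink at the old path."""
    source = os.path.abspath(source)
    destination = os.path.join(os.path.abspath(library_dir),
                               os.path.basename(source))
    if os.path.lexists(destination):
        raise FileExistsError(destination)
    shutil.move(source, destination)
    # The link sits outside the library, so the sync client never sees it;
    # the library itself contains only real files and folders.
    os.symlink(destination, source)
    return destination


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        library = os.path.join(root, "Seafile", "MyLibrary")
        project = os.path.join(root, "work", "project")
        os.makedirs(library)
        os.makedirs(project)
        print(relocate_into_library(project, library))
        print(os.path.islink(project))
```

Programs that open the old path keep working through the link, while the data that gets synced is the real copy inside the library.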

xarinatan commented 4 years ago

@gmanley multiple solutions actually, for one inside seafile you could make separate repositories in one account, and if you really want everything to be in a single repo, you could instead make a repository and create folders in that repository, which you then individually sync with the folders you want (sub-repositories).

You could also use hardlinks, although I did realize that folder hardlinks do have issues because the filesystem would get into the same kind of loops that Seafile gets into when it's trying to parse recursive symlinks the wrong way. That's why you use symlinks as symlinks and not hardlinks....... Still there should be ways to do it if you really have to, you can definitely force it (Linux: ln -F, Windows has 'directory junctions' which you could use), but beware of the recursion issues that could cause your filesystem to hang the same way Seafile does when it meets a recursive symlink (should only be a problem if you refer to a folder higher up in the same tree).

The entire reason Seafile has multiple repositories (and sub repositories) is so you don't NEED to use backwards hacks like this to aggregate data into a single account.. I much love that part of Seafile. And that's why I really don't get why they still had to use symlinks this way, when they're not supposed to be used this way, as they already solved the issue of not using it that way, by having multiple repositories per account..

Finally I don't see why we can't just have both versions by having a checkbox in the settings that decides how symlinks are handled. I am 100% fine if the DEFAULT behavior is broken for compatibility reasons, but being completely unable to fix it without having to resort to a patch that was written by a good hearted samaritan that simply brought out a feature that seems essentially already in place, is IMHO not a healthy way to deal with it, and I still consider this to be a large deficit/bug in an otherwise amazing product.
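
As a rough illustration of what such a setting could control, here is a Python sketch of a directory scanner with a selectable symlink policy. Everything in it is hypothetical: Seafile has no `SymlinkPolicy` option that I know of, and the names are made up for this example.

```python
import enum
import os
import tempfile


class SymlinkPolicy(enum.Enum):
    FOLLOW = "follow"      # treat a link as its target (behavior in this thread)
    PRESERVE = "preserve"  # record the link itself
    IGNORE = "ignore"      # skip links entirely


def scan(root, policy):
    """Yield (relative_path, kind) pairs; kind is "dir", "file" or "symlink".

    Note: FOLLOW has no loop protection here, so a recursive link would
    reproduce the runaway behavior reported in this thread.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as it:
            entries = list(it)
        for entry in entries:
            rel = os.path.relpath(entry.path, root)
            if entry.is_symlink():
                if policy is SymlinkPolicy.IGNORE:
                    continue
                if policy is SymlinkPolicy.PRESERVE:
                    yield rel, "symlink"
                    continue
                # FOLLOW: fall through and classify the link by its target.
            if entry.is_dir():       # follows symlinks by default
                yield rel, "dir"
                stack.append(entry.path)
            elif entry.is_file():    # follows symlinks by default
                yield rel, "file"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "real"))
        open(os.path.join(root, "real", "f.txt"), "w").close()
        os.symlink("real", os.path.join(root, "link"))
        for policy in SymlinkPolicy:
            print(policy.value, sorted(scan(root, policy)))
```

Keeping FOLLOW as the default would preserve today's behavior for people who rely on it, while PRESERVE and IGNORE cover the two alternatives requested here.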

monotok commented 3 years ago

Would be great to have the option to ignore symlinks!

ikcalB commented 3 years ago

@gmanley you can not create a hardlink to a folder, that is true. But I doubt there is reason for you to hurry finding an alternative solution. This bug is over 7 years old and was the very reason I replaced seafile years ago.

What did you replace seafile with?

dirdi commented 3 years ago

@ikcalB: Since I only need fast and reliable file sync (including symlinks) and no Wiki, Office, Markdown, other crap nobody really asked for ... I use syncthing. Just install it on two (or more) devices, scan a QR code (or type a code) and it just works. You do not even have to set up a server.

The funny thing about this is that there was a bug report https://github.com/syncthing/syncthing/issues/1776 over at the syncthing repo where somebody asked to follow symlinks (like Seafile does), but they pretty soon figured out that this is a bad idea and that anyone who really needs this feature can still achieve this behavior by leveraging a bind mount.

MurzNN commented 3 years ago

Syncthing is cool, but it doesn't support the "sync on demand" feature that is implemented in the SeaDrive client, so this is the killer feature of Seafile and I can't find any alternative to it... :-( The second missing feature in Syncthing is web access and sharing of a single file or directory, even for anonymous access.

xarinatan commented 3 years ago

It feels funny to talk about alternatives here but, NextCloud is a pretty widely accepted alternative, including the web access and sync client, plus a whole heap of other features, I've been considering switching back to it (I used it back when it was still called OwnCloud), but I really like the lean and mean long term git-like history that seafile has, combined with really sweet performance at a scale, I just wish they made the live garbage collector available in the opensource version, and fixed that incredibly annoying symlink bug that still makes Seafile look silly. I mean come on how is hanging and using 100% CPU at all a desirable state, and the symlink support is almost in there? It seemed to be really easy to patch the feature on the client end strangely enough, but perhaps they run into some issue that prevents them from rolling it out? I can only think of reasons which imply using software in general the wrong way though, but maybe I'm mistaken. I'm just salty to have this wart of a bug on an otherwise basically flawless product.

edit: for the seadrive feature, look into WebDAV, NextCloud and Seafile can both support that, it's somewhat similar to a network drive :)

monotok commented 3 years ago

@xarinatan Going off topic a bit but wanted to highlight. I run both Nextcloud 19 and Seafile on my server in lxd containers with attached storage over ISCSI from Freenas. Both solutions have their advantages and disadvantages as you probably know.

However I wouldn't recommend Nextcloud for syncing a lot of small files or even thousands of photos as its sync is incredibly slow compared to seafile. I am talking extreme differences here for some scenarios; a test I did was something like 4000 odd files (cherrytree notes git project and 1000 photos) and seafile was something like 12 seconds and nextcloud was over 7 mins. I commented my test on this issue https://github.com/nextcloud/server/issues/16726 regarding owncloud switching to Go but nextcloud seems more interested in adding stuff like dashboards, talk clients etc than fixing their product's primary purpose of syncing files.

Also I always got loads of sync conflicts, probably because it doesn't do delta sync.

Not meant to be a nextcloud bashing as they both have their strengths so I use both; seafile for syncing laptops together while nextcloud I use for mainly webdav apps on android and auto uploading photos.

xarinatan commented 3 years ago

@monotok yea exactly, Seafile's performance has always been my favorite part about it, you can run a fully functional server of it on a raspberry pi and expect it to perform well enough to keep a family's photo albums and projects and schoolwork etc backed up and shareable, and I didn't specifically care about the extra features that Nextcloud offered except maybe their well developed calendar.. So far Seafile has been my favorite, but it can be a real pain in the ass with those specific quirks around symlinks (and slow offline garbage collection if you use the community edition).

ikcalB commented 3 years ago

what about a reward at bountysource? I for one'd be willing to contribute (hmm)?

michaelcadilhac commented 3 years ago

For what it's worth, my temporary solution in this thread (https://github.com/haiwen/seafile/issues/288#issuecomment-509011949) is still working. It's a handful of commits behind master, though.

dirdi commented 3 years ago

@ikcalB since no consensus has been reached on how the current behavior should be altered, it is unlikely that somebody will be able to provide a patch that gets merged, no matter if there is a bounty or not.

BTW I have heard about some portals that have altered their TOS, and bounties that have not been awarded within X months will now expire without a refund to the sponsor.

kong13661 commented 3 years ago

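Bro, will this new feature still be added? It's been 7 years = =. It's really frustrating: adding a single link syncs more than ten GB of files.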

ikcalB commented 3 years ago

@michaelcadilhac thanks for reminding me of your fork!

if that is still working for you, I might as well try it out.

liam-k commented 3 years ago

There really should be an option where you can choose whether to follow symlinks or not, ideally at client level. It can create problems with other software and recursion. An example:

Final Cut Pro X (video editing suite) uses "databases" with an .fcpbundle extension. They are presented as files on macOS, but they are basically folders and are treated as such on other OSes. Inside those folders, there’s an .fcpcache file which is a symlink to the .fcpbundle folder above it – so FCPX knows it’s supposed to save caches inside the main folder and not elsewhere, which is possible too (in that case, the symlink would point to another location).

Now what happens if you try to sync a Final Cut Pro Database via Seafile? This:

[image]

Seafile is now caught in a recursive loop. This goes on forever, hogs a lot of CPU, and, more importantly, uploads an endless number of copies of that library onto the server. Now, even if you set the cache to be outside the main database, it would still create a syncing problem, because Seafile would now, again, not sync the symlink to the location but the content of it, creating larger database folders than necessary and potentially breaking things when you open it in FCPX again. There’s no real solution to this except just not syncing the cache symlink – which sucks if you do want to use an external cache location.

This might be a special case, but FCPX is not the only software to use symlinks in its process, and, as this thread illustrates, there’s personal use cases for symlinks too. I haven’t encountered this problem with any other cloud software, and I don’t understand the decision behind it (I mean, this issue has been open since 2013, not giving that option must be a conscious decision by now). At least there should be a fix that handles recursion elegantly and either throws an error or doesn’t follow those links indefinitely till the last circle of hell...
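
One common way to handle this kind of recursion, sketched below in Python, is to remember the identity of every real directory already visited and to refuse to enter it a second time. This is an illustration, not Seafile's code: `walk_following_links` is a hypothetical name, and the sketch assumes a POSIX filesystem where the `(st_dev, st_ino)` pair identifies a directory.

```python
import os
import tempfile


def walk_following_links(root):
    """Yield directories under `root`, following symlinks, each at most once."""
    seen = set()
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            st = os.stat(current)   # follows symlinks
        except OSError:
            continue                # broken link or unreadable directory
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue                # reached again via a link: stop here
        seen.add(key)
        yield current
        with os.scandir(current) as it:
            for entry in it:
                try:
                    is_dir = entry.is_dir()   # True for links to directories
                except OSError:
                    is_dir = False            # e.g. a link that points at itself
                if is_dir:
                    stack.append(entry.path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        bundle = os.path.join(root, "Project.fcpbundle")
        os.mkdir(bundle)
        os.mkdir(os.path.join(bundle, "Media"))
        # A link back to the containing folder, as in the example above.
        os.symlink(".", os.path.join(bundle, "Cache.fcpcache"))
        for path in walk_following_links(bundle):
            print(os.path.relpath(path, root))
```

With this check the walk terminates after visiting the bundle and its `Media` folder, instead of descending through the link again and again.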

linuxturtle commented 2 years ago

How can it be that this heinous bug still exists in the client after being reported/discussed for 9 years? I just ran into it because I rsync'd a directory which had a recursive symlink in it "archive -> ./" to a seafile-synced directory. The current behavior of following symlinks and treating them as regular files/directories is utterly insane! Even more insane is the behavior incurred when deleting a recursive symlink, as seafile deletes everything inside the directory (i.e. the current directory) too! Insanity!
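
For comparison, deleting a symlink safely only requires checking for the link before anything else. The Python sketch below shows the idea with a hypothetical helper `remove_entry`; it is not taken from the Seafile client.

```python
import os
import shutil
import tempfile


def remove_entry(path):
    """Delete `path` without touching whatever a symlink points at."""
    if os.path.islink(path):
        os.unlink(path)       # removes the link itself, never its target
    elif os.path.isdir(path):
        shutil.rmtree(path)   # unlinks symlinks inside the tree, does not follow them
    else:
        os.unlink(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "keep.txt"), "w") as fh:
            fh.write("important")
        link = os.path.join(root, "archive")
        os.symlink("./", link)  # the "archive -> ./" case described above
        remove_entry(link)
        print(sorted(os.listdir(root)))
```

The order of the checks matters: `os.path.isdir` follows links, so testing it first would classify `archive` as a directory and delete the contents of the current directory, which is the failure described above.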

liam-k commented 2 years ago

How can it be that this heinous bug still exists in the client after being reported/discussed for 9 years?

Seafile’s support is not bearable anymore at this point. I used to pay for the pro version but I now switched back to Nextcloud. It has its own set of problems (especially with large amounts of data) but at least it’s actually actively maintained and improved every day. It’s sad because I really liked Seafile and I find it a lot more usable than Nextcloud, but above all a cloud service needs to be reliable. Seafile has multiple completely no-go bugs which can make you lose data, corrupt or freeze libraries or simply make your life much harder like this symlink thing. It’s partly a community product so I’ve been patient with it, but there’s a limit.

linuxturtle commented 2 years ago

I used to pay for the pro version but I now switched back to Nextcloud.

I also have a Nextcloud instance, and Nextcloud just ignores symlinks. That's not as useful as doing something intelligent with them, or just syncing them verbatim to clients where they're supported, and ignoring them on clients where they're not supported (hey, just like rsync does!), but it's a LOT better than turning them into some insane kind of hard-link/symlink hybrid like seafile does. I honestly can't figure out any scenario where the current behavior is sane, especially with absolute-path symlinks. I saw one guy who uses the current behavior as a hack to sync stuff without actually putting it in a library, but that's the kind of hack that ought to be ruthlessly broken, not encouraged.