syncthing / syncthing

Open Source Continuous File Synchronization
https://syncthing.net/
Mozilla Public License 2.0

Ignore some files by default? #837

Open gunwald opened 9 years ago

gunwald commented 9 years ago

I would suggest introducing a global ignore list that has some default entries. For example:

.git/ # ignore git repos
.sparkleshare/ # ignore git sparkleshare
.~lock. # ignore LibreOffice lock files
.owncloudsync # ignore owncloud

It is annoying to have to add every file you want ignored to each folder on each node, every time.

calmh commented 9 years ago

I'm not convinced. I see how that's useful in your case, but what about the hundreds of other kinds of files people might want excluded (temp and lock files for every other office package, .svn if you include .git, etc). Also, I do want my .git folders backed up for sure. I think this is a case where there isn't a list that will make everyone happy, and just leaving it up to the user in all cases is simpler.

Goobles commented 9 years ago

I don't think he meant THAT globally. OP means having a global, configurable ignore list that applies to all folders of the specified node.

gunwald commented 9 years ago

OK, you may be right. But then it would be useful to have a global ignore list anyway, maybe without default entries.

AudriusButkevicius commented 9 years ago

ln -s from your other folders.

gunwald commented 9 years ago

OK, but you have to do this for all folders and on all nodes, and that is a lot of work.

kozec commented 9 years ago

I would like this as well. The prospect of going around 6 machines * 8 repos just to ignore .pyc, .class and bin/ makes the current method of ignoring files pretty much unusable for me.

AudriusButkevicius commented 9 years ago

There already is a #include directive, which means as long as all of your repos/folders include the same file, it will be global.

You can also keep the file synced between machines by having it in one of your folders.

I do not see ignores shared between devices in any other way.

I do perhaps see a global ignore list which gets added on top of each folder's ignores as a useful feature, but the case is so niche and specific to a certain group of people who can already solve/automate it by other means, just because they can.

There will only be the initial cost of setting the include on all your devices and all your folders, but I am sure you can automate this with a bash one-liner.
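
A minimal sketch of that one-time automation, in Python rather than bash so it also runs on Windows. It assumes a shared pattern file is reachable from every folder under the name `.stglobalignore` (a hypothetical name borrowed from later in this thread), and the helper `ensure_include` is illustrative only, not part of Syncthing.

```python
from pathlib import Path

# Hypothetical include line; the shared file name is an assumption of this sketch.
INCLUDE_LINE = "#include .stglobalignore"


def ensure_include(folder: Path, include_line: str = INCLUDE_LINE) -> bool:
    """Add include_line to folder/.stignore unless it is already present.

    Returns True if the file was written, False if nothing had to change.
    """
    stignore = folder / ".stignore"
    existing = (
        stignore.read_text(encoding="utf-8").splitlines() if stignore.exists() else []
    )
    if include_line in (line.strip() for line in existing):
        return False
    # Placing the include first is a choice made for this sketch; local
    # patterns that were already in the file are kept below it unchanged.
    stignore.write_text("\n".join([include_line, *existing]) + "\n", encoding="utf-8")
    return True
```

Calling `ensure_include(Path(p))` for each synced folder path is idempotent, so the script can be re-run safely whenever a folder is added.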

kozec commented 9 years ago

But #include practically means that I have to go around 6 machines * 8 repos to #include some synchronized file :)

And automating that is harder than it looks - I have Windows machines (no bash there), Android devices (no ssh there)... Having something like .stglobalignore #included by default would solve this easily, and it doesn't sound hard to implement (to someone who has never coded Go, ofc).

AudriusButkevicius commented 9 years ago

You can write a small go script, cross compile it to all platforms, spread it using syncthing, and then invoke it on each machine ;)

kozec commented 9 years ago

Yeah, and then I can go around 6 machines and run that script :D

By the way, as I already said, I don't speak Go, but based on what I caught from the sources, wouldn't this two-liner be enough to add a new, auto-included ignore file?

http://pastebin.com/xDApEVL3 (model.go)

calmh commented 9 years ago

Yeah, so I responded to the "default entries" and not really to the "global ignore list" part. A global (as in it applies to every folder on a given device) ignore list I would be OK with. It would need to be stored in the config (just as an implementation note).

calmh commented 9 years ago

@kozec Theoretically you could automate it through syncthing though, today, using a POST request to the admin interface to set the ignore list per folder. @AudriusButkevicius that should probably be documented in https://discourse.syncthing.net/t/the-rest-interface/85.
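
As a rough illustration of that idea, the sketch below only builds the HTTP request and does not send it. The endpoint path `/rest/db/ignores`, the `folder` query parameter, the `{"ignore": [...]}` body, and the `X-API-Key` header are assumptions based on how the REST interface is documented today; they are not taken from this thread, and the Syncthing version under discussion may have used different names. `build_ignores_request` is a hypothetical helper.

```python
import json
import urllib.parse
import urllib.request


def build_ignores_request(
    base_url: str, api_key: str, folder_id: str, patterns: list[str]
) -> urllib.request.Request:
    """Build (but do not send) a POST that would set one folder's ignore list."""
    # Assumed endpoint and body shape; check the REST docs for your version.
    url = f"{base_url}/rest/db/ignores?folder={urllib.parse.quote(folder_id)}"
    body = json.dumps({"ignore": patterns}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; looping over every folder ID on a device gives the per-folder automation described above.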

pepa65 commented 9 years ago

I like the fact that Syncthing doesn't litter the directories that are to be synced. Therefore I don't like it that .stignore files have to be put into the directories. Of course I am free not to use that feature, but could it be possible to have an ignore-file somewhere in the .config/syncthing? It could also serve as a global-ignore file as per the topic of this thread.

Zillode commented 9 years ago

I'm also not convinced about the .git and .sparkleshare ignores as I also sync my .git folders. However, OS-specific temporary files should be filtered imo as they generate conflicts outside the user's actions (as proposed in https://github.com/syncthing/syncthing/issues/1055).

pepa65 commented 9 years ago

A .stglobalignore or somesuch outside of the actual folders-to-be-synced would be better for my use case. I really don't like having to litter my folders with .stignore files.

Zillode commented 9 years ago

I think we should reconsider this. We should ignore operating system files that are known to cause conflicts as soon as users start using Syncthing. My current list of such files consists of:

".DS_Store", // OSX (https://github.com/syncthing/syncthing/issues/826)
".Spotlight-V100", // OSX
"Thumbs.db", // Windows
"lost+found", // Linux (https://discourse.syncthing.net/t/un-ignore-subfolders-files-of-ignored-folder/1460/10 , https://github.com/syncthing/syncthing/issues/1090)

I suggest to list these as the default .stignore content.

AudriusButkevicius commented 9 years ago

"lost+found",
".DS_Store",
"Thumbs.db",
"._*",
".Spotlight-V100",
".Trashes",
"desktop.ini",

@Zillode's list makes sense. I think we should have a default set of OS-specific things (rather than .svn et al.). @calmh?

facastagnini commented 9 years ago

I build my exclusion list using the expressions relevant to my OS from here:

http://support.code42.com/CrashPlan/Latest/Troubleshooting/What_Is_Not_Backing_Up

I think it is a comprehensive list.

On Sat, Feb 28, 2015, 09:42 Audrius Butkevicius notifications@github.com wrote:

Reopened #837 https://github.com/syncthing/syncthing/issues/837.

generalmanager commented 9 years ago

I was going to propose my own list, but the CrashPlan one is way more comprehensive. Nice find @facastagnini !

facastagnini commented 9 years ago

You are welcome!

I started using Syncthing and CrashPlan at pretty much the same time, so when I was building a common ignore list for both tools I stumbled upon that.

Since we are linking to that list, I feel it is appropriate to thank them for sharing such a comprehensive ignore list, and to give them some credit: CrashPlan is an awesome backup tool that complements my Syncthing setup very well.

flungo commented 9 years ago

I am looking to implement this, if it hasn't already been started. I'm not sure whether it would be best to hard-code this into the binary, or to create the suggested global ignore list, populated by default with the system-generated files. The global ignore list would probably be best located in the user config directory (alongside the config.xml).

I think as a minimum the list should include:

.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
.Trash-*
Icon?
ehthumbs.db
desktop.ini
lost+found
Thumbs.db

I feel that the "comprehensive list" is a little too comprehensive, in the sense that it was written for full system backups.

generalmanager commented 9 years ago

@flungo It would be great if you found the time to do this. There are plans for global ignores in https://github.com/syncthing/syncthing/issues/1392. But as was already discussed over there those global ignores would not be synced to a master device for security reasons.

I would suggest a list of files in the main config file. The files in this list are added to the automatically created .stignore file in every folder at creation.

This would make it clear to everybody wondering why some files don't sync AND, if someone really wants to change this, it can either be done for all future folders on a device (by changing the config) or on a per-folder basis.

Seems like the best solution to make it obvious and configurable.

Edit: regarding your list:

I'm not sure about "Icon?". Which OS creates those? And I think it might be a little too broad, not sure how often people create files matching that.

Zillode commented 9 years ago

I agree with @flungo (wrt "comprehensive list") and @generalmanager for the implementation. Feel free to initiate a PR @flungo

flungo commented 9 years ago

@generalmanager that definitely seems like the best idea actually. You're right that if it was global it would cause problems and potential confusion. I will see what I can come up with - doesn't sound too complicated a task though.

To be honest, with regard to the global exclude lists, I myself am quite happy to use symbolic/hard links (depending on whether I want the global list to sync or not) and then use a #include statement in the folder's .stignore.

If anyone sees any problems with the list I posted above then feel free to let me know what you think should be changed/added.

generalmanager commented 9 years ago

@flungo

To be honest, with regards to the global exclude lists, I myself am quite happy to use symbolic/hard links (depending on if I want the global list to sync or not) and then using a #include statement in the folders .stignore.

Yep, me too. But most users don't know what soft- or hardlinks are. And it doesn't prevent malicious nodes from introducing unwanted changes.

With regard to your list: I think it's a good base to work with, but where does "Icon?" come from? It's not in the CrashPlan list and might be broader than necessary.

flungo commented 9 years ago

@generalmanager

So the Icon? is a file sometimes found on OSX machines. It is the file created if the user has set a custom folder icon in finder. The ? is actually matching a \r. Could cause an issue if someone made a file called "Icon" or "Icons" I suppose...

generalmanager commented 9 years ago

@flungo Yeah, I read that \r was used instead of \n before OS X. We should definitely test this. Would suck if Syncthing accidentally ignored Icon or Icon*.

calmh commented 9 years ago

Icon\? should match precisely that filename and not Icons etc.

flungo commented 9 years ago

@calmh Would that not match a literal \ as opposed to \r (as in the carriage return character)?

For config how does this sound:

<configuration version="8">
    ...
    <defaults>
        <ignores>
            <ignore>.DS_Store</ignore>
            <ignore>.DS_Store?</ignore>
            <ignore>._*</ignore>
            <ignore>.Spotlight-V100</ignore>
            <ignore>.Trashes</ignore>
            ...
        </ignores>
    </defaults>
</configuration>

The <defaults> section could be gotten rid of but I feel this would provide a nice expansion to being able to set other defaults when adding a new folder (such as a default versioning scheme). Realistically the <ignores> could be dropped too, but I feel that would start getting messy.
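
To make the proposal concrete, here is a small sketch of how such a section could be read and turned into the initial `.stignore` content for a newly added folder. The XML layout follows the proposal above, which is a suggestion in this thread and not a released config schema; `default_ignores` and `initial_stignore` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Sample config following the layout proposed above (not a released schema).
CONFIG = """\
<configuration version="8">
    <defaults>
        <ignores>
            <ignore>.DS_Store</ignore>
            <ignore>._*</ignore>
            <ignore>Thumbs.db</ignore>
        </ignores>
    </defaults>
</configuration>
"""


def default_ignores(config_xml: str) -> list[str]:
    """Return the patterns listed under <defaults><ignores>, in document order."""
    root = ET.fromstring(config_xml)
    return [
        el.text.strip()
        for el in root.findall("./defaults/ignores/ignore")
        if el.text and el.text.strip()
    ]


def initial_stignore(patterns: list[str]) -> str:
    """Render the patterns as .stignore content, one pattern per line."""
    return "".join(f"{pattern}\n" for pattern in patterns)
```

A config without a `<defaults>` section simply yields an empty list, so a new folder would start with an empty ignore file.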

@generalmanager?

calmh commented 9 years ago

No, that matches a file called Icon?, with the question mark at the end. The backslash makes sure the question mark is interpreted as a literal question mark and not as a one character wildcard. (I'm not sure what the \r part is about?)

flungo commented 9 years ago

@calmh I was using it as a one-character wildcard. Mac OS X uses an actual \r carriage return at the end of the file name. I believe it is displayed as a ? in the terminal, but the filename is Icon\r.

I will leave this one out until it can be confirmed.

calmh commented 9 years ago

Ah, yes, you're right, the file is actually called Icon\r. That file is the WTF that just keeps on giving... :) That's probably not possible to match with the current ignore language other than as Icon? which as you say is too wide.
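
The breadth of the wildcard can be checked with a quick experiment. The sketch below uses Python's `fnmatch` as a stand-in glob matcher; it is not Syncthing's ignore matcher, and `[?]` is Python's way of spelling a literal question mark (the role the backslash escape plays in the discussion above). `matching` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# "Icon\r" is the custom-folder-icon file name discussed above.
NAMES = ["Icon\r", "Icons", "Icon", "Icon?"]


def matching(pattern: str, names: list[str]) -> list[str]:
    """Return the names that the glob pattern matches, case-sensitively."""
    return [name for name in names if fnmatchcase(name, pattern)]


# The one-character wildcard catches the carriage-return name,
# but also any other five-character name starting with "Icon".
wildcard_hits = matching("Icon?", NAMES)

# A literal question mark matches only a file really named "Icon?".
literal_hits = matching("Icon[?]", NAMES)
```

Here `wildcard_hits` contains `"Icon\r"`, `"Icons"` and `"Icon?"`, while `literal_hits` contains only `"Icon?"`, which is why neither form targets the macOS file precisely.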

flungo commented 9 years ago

.DS_Store? is actually trying to match the same thing (with the ? standing for the \r character), although I think it's unlikely that someone would create a file whose name matches .DS_Store?. The version without the question mark is for Windows (possibly Unix too): after you extract a zip that contained a .DS_Store\r folder, it becomes just .DS_Store. Apple clearly thought it would be a great idea to distinguish their generated files by appending a seemingly invisible character to the end of the file names. sigh

flungo commented 9 years ago

For reference/review, my current development branch: https://github.com/flungo/syncthing/tree/feature/issue/837

I will submit a PR when the feature is complete. If you see any problems though let me know so I can fix before making the PR.

calmh commented 9 years ago

The .DS_Store file is just that on Mac, no \r or anything. That particular madness is reserved for the Icon file, for historical reasons probably.

flungo commented 9 years ago

Okay so this feature is mostly complete: just needs testing.

One other thing to decide is whether, during the upgrade, the new default ".stignore" should be added to existing folders that do not already have an ignore file.

generalmanager commented 9 years ago

@flungo Wow, great work! I would personally prefer it if those were appended on upgrade, but I think we should leave it to @calmh to decide this.

AudriusButkevicius commented 9 years ago

This is great stuff. The only idea I have is that perhaps these should be (pre/app)ended to the list after we've read from .stignore. This way they would always be there, and you wouldn't need to go and change them everywhere after you changed them in the cfg.

flungo commented 9 years ago

I was thinking about the pre/post append. If that were to be done, how would it distinguish between purposefully modified ignore files and . It would be fine in the upgrade process to prepend the new defaults to the top and if a person modified it

But then yes, if you wanted to modify it slightly, it would drive you crazy going through every folder you had synced and changing this. It could be done similarly to how WordPress manages the .htaccess file, by putting markers (such as \\---BEGIN SYNCTHING DEFAULTS---\\ and \\---END SYNCTHING DEFAULTS---\\) around the generated section, but then if these are removed after creation/upgrade it will no longer update the section during the folder checks.
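
The marker approach could look roughly like the sketch below. The exact marker strings are assumptions (written here as `//` comment lines so the ignore parser would skip them), and `refresh_defaults` is a hypothetical helper, not code from the development branch.

```python
# Assumed marker lines; the exact text is an illustration only.
BEGIN = "// ---BEGIN SYNCTHING DEFAULTS---"
END = "// ---END SYNCTHING DEFAULTS---"


def refresh_defaults(stignore_text: str, defaults: list[str]) -> str:
    """Replace the patterns between the markers, leaving everything else alone.

    If either marker is missing, the text is returned unchanged: the user
    removed the managed section, so it is no longer updated.
    """
    lines = stignore_text.splitlines()
    try:
        start = lines.index(BEGIN)
        end = lines.index(END, start + 1)
    except ValueError:
        return stignore_text
    updated = lines[: start + 1] + defaults + lines[end:]
    return "\n".join(updated) + "\n"
```

Only the lines between the two markers are regenerated, so patterns the user added above or below the managed section survive every refresh.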

One other potential idea I had was for the config file to store the ignore list (possibly several of them, identified by an attribute), which could be referenced from the Syncthing .stignore with some new syntax (such as @include defaults). In one way this would be great, as it could centralise the management of these ignore lists and allow a user to define custom ones (like a git one, or an ownCloud one, giving the OP a solution that would work for their needs) that can be added to any .stignore. So, to reiterate for clarity: #include would include from the local directory, whereas @include (or similar) would do a lookup in the config for an ignore list.
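
A sketch of how the proposed lookup might behave is given below. The `@include` syntax is only a suggestion in this thread, and the dictionary of named lists stands in for whatever the config file would hold; `expand_named_includes` is a hypothetical helper.

```python
def expand_named_includes(
    lines: list[str], named_lists: dict[str, list[str]]
) -> list[str]:
    """Replace each '@include NAME' line with the patterns of the named list.

    Every other line, including file-based '#include' lines, is passed
    through untouched.
    """
    expanded: list[str] = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("@include "):
            name = stripped[len("@include "):].strip()
            if name not in named_lists:
                raise KeyError(f"no ignore list named {name!r} in the config")
            expanded.extend(named_lists[name])
        else:
            expanded.append(line)
    return expanded
```

With named lists such as `{"defaults": [".DS_Store", "Thumbs.db"], "git": [".git"]}`, a folder opts in to a list by adding a single `@include git` line, and editing the list in the config changes every folder that references it.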

Let me know what you guys are thinking. I will try and do some testing in the meanwhile.

As a side note: What have you other developers/collaborators found is the best way to test? I will see if there are any relevant unit tests that need to be added, then probably set up a docker image that I can create and dispose of quickly to test various start conditions and check that an upgrade works, etc.

AudriusButkevicius commented 9 years ago

There are a bunch of tests which test ignore files, as well as integration tests to make sure stuff still works. The whole #include vs @include is more or less redundant I think, because you can already achieve what you are talking about with file includes.

The only benefit of @include over #include is that you can reuse it in multiple folders, but it still involves a manual step of adding @include defaults, which I guess is what this ticket was trying to avoid. (As well as having a good basic set of defaults).

I don't mind having @include and #include, but I'd prefer to avoid having people add @include defaults.

Perhaps we can have @include and #include, but then always silently include @include defaults?

Also, the next feature I see people asking for is synced @includes, which is #1408 and #1392.

flungo commented 9 years ago

With regards to the @include suggested - I would suggest automatically and silently adding it to the start of every file (during upgrade and when creating a new folder) with a good comment above about its purpose and where it can be edited and a reference to the documentation. Similar to what I have already implemented here

However, if the user chooses to remove the @include defaults line, then we assume they know what they are doing and it should stay removed.

AudriusButkevicius commented 9 years ago

Makes sense.

generalmanager commented 9 years ago

@flungo I agree, seems like the cleanest and most logical way to me.

flungo commented 9 years ago

Just going to post these links here for reference:

https://github.com/github/gitignore/blob/master/Global/Linux.gitignore
https://github.com/github/gitignore/blob/master/Global/Windows.gitignore
https://github.com/github/gitignore/blob/master/Global/OSX.gitignore

Not all are relevant, but it's pretty obvious which ones we do and don't want from each list.

flungo commented 9 years ago

Hectic week with work, hoping to have this finished by next weekend at the latest. Sooner hopefully though.

counterbeing commented 9 years ago

Right on! This seems very much needed. I think a ~/.stignore would be perfect. Just a file containing patterns, one per line. Simple and to the point, just like .gitignore.

Thanks! Just got into SyncThing, and it is awesome!

sysfu commented 9 years ago

Any status update on implementation of the global ignore feature? I just started syncing between OS X and Windows and the hidden file litter is driving me crazy.

AudriusButkevicius commented 9 years ago

Read the manual/docs, there is already an include directive which allows doing that.

sysfu commented 9 years ago

Look, I'm all too familiar with the tedious and error prone manual method of using include statements.

The issue the original poster raised was that of shipping a global ignore list "that has some default entries".

The list suggested by @flungo below is a fine one. So how about it? How about shipping this global list and enabling it by default with new installations? And also exposing it in the web interface for editing?

The people with text editing fetishes will still be quite free to edit/include/link all the myriad of .stignore files they want and religiously check them into their version control systems.

.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
.Trash-*
Icon?
ehthumbs.db
desktop.ini
lost+found
Thumbs.db

AudriusButkevicius commented 9 years ago

It's already editable from the web ui? I am not against a default list, feel free to make a pull request.