Closed corford closed 7 years ago
When working on a repository located on a samba originated network share, GfW wrongly converts symlinks to real files and directories causing git to incorrectly mark them as changed/modified and making it impossible to work with the repo.
This is typically triggered by the absence of symlink support. I.e. when core.symlinks = false
.
Thanks @dscho but I'm not sure that's the case here since core.symlinks
is set to true
.
Symlinks work without issue if the repo is checked out to a local path on the Windows machine (e.g. C:\gitrepos\myrepo). They only confuse GfW when the repo is checked out on a linux guest and that location is made available to GfW via a samba share + mapped drive combo e.g. Z:\gitrepos\myrepo)
Z:\gitrepos\myrepo> git config --list
core.symlinks=true
core.autocrlf=false
core.fscache=true
color.diff=auto
color.status=auto
color.branch=auto
color.interactive=true
help.format=html
diff.astextplain.textconv=astextplain
rebase.autosquash=true
http.sslcainfo=C:/Program Files/Git/mingw64/ssl/certs/ca-bundle.crt
diff.astextplain.textconv=astextplain
filter.lfs.clean=git-lfs clean -- %f
filter.lfs.smudge=git-lfs smudge -- %f
filter.lfs.required=true
filter.lfs.process=git-lfs filter-process
credential.helper=manager
core.symlinks=true
core.filemode=false
user.name=Charlie
user.email=charlie@localhost
alias.co=checkout
alias.br=branch
alias.ci=commit
alias.st=status
They only confuse GfW when the repo is checked out on a linux guest and that location is made available to GfW via a samba share + mapped drive combo e.g. Z:\gitrepos\myrepo)
Maybe it is a permission issue, where somehow the permission to create symlinks does not permeate through the Samba connection? Are other symlink-creating commands working?
Yes, other symlink commands performed outside of GfW work fine. For example, on the Windows host I can successfully:
1. Open Windows Explorer
2. Navigate to Z:\gitrepos\myrepo
3. Drill down further to a child directory that contains a symlink to a file
4. Right click on that symlink and open it with a text editor (e.g. Sublime)
5. Change a line within the file and save the changes
6. Return to Windows Explorer, locate the target file the symlink was pointing to in the same repo
7. Right click and open the target file with a text editor and see the changes I made in step 5 above
So, outside of GfW, symlinks are working and behaving as they should, even across a samba share. I suspect the bug is with how GfW internally handles symlinks for resources on mapped network drives.
EDIT: I just tried creating a symlink using git-bash.exe (bundled with GfW) and hit the same problem. Rather than creating a symlink, it creats a new file (containing the contents of the target). There seems to be a bug with the type of resource GfW (or the bash shell bundled with it) creates when asked to make a symbolic link to a file on a network share. Instead of creating a link, it creates a copy of the file.
cd /z/gitrepos/myrepo
ln -s README.md test.md
ls -l
-rw-r--r-- 1 Charlie 197121 24 Jun 3 23:54 README.md
-rw-r--r-- 1 Charlie 197121 25 Jun 13 23:51 test.md
@dscho I finally had some time last night to try and debug a little further. Instead of samba, I setup an NFS server on a vagrant Ubuntu guest and exported a gitrepo back to the Windows 10 host. On the windows host, I mounted the NFS share and assigned it a drive letter. As with samba, everything worked fine (opening, editing, deleting, creating files). The only issue was symlinks. GfW exhibited the same bug as described above (i.e. it clobbered them in to real files).
So it looks like GfW (or the bash shell bundled with it) has a bug with how it handles symlinks on UNC paths. Basically, it doesn't and just treats them as a copy of whatever the original symlink target was.
It would be awesome if this bug could be tracked down and fixed as it's the only thing stopping (imho) a seamless Win/Linux dev setup. By keeping the source code on a linux guest vm and exporting it back to windows via a network share, you completely bypass the usual issues of file permission mismatches and lack of inode notifications for hot reload/compilation while at the same time not needing to access your editor (e.g. vscode) over a slow, flakey X tunnel. You literally get the best of both worlds.
I just tried creating a symlink using git-bash.exe (bundled with GfW) and hit the same problem. Rather than creating a symlink, it creats a new file (containing the contents of the target)
Please note that the Bash and Git have very different code paths to handle symlinks.
The best way forward may be for you to install the Git for Windows SDK and instrument the code in /usr/src/git/compat/mingw.c
to see what flags are obtained by mingw_lstat()
(and how the file_attr_to_st_mode()
function handles them).
Of course, it is altogether possible that the has_symlinks
variable is somehow forced to zero.
The last time I touched C code was 17 years ago at university so instrumenting it myself is not going to be easy :)
If you can give me a hacked up version of mingw_lstat() to copy and paste over the original in /usr/src/git/compat/mingw.c I'm happy to compile, test and report back for you (I've just grabbed and setup the SDK and I should be comfortable following https://github.com/git-for-windows/git/wiki/Technical-overview to make a new build).
I guess all that might be needed is a few "print to log file (or stderr?)" lines somewhere in the function so that we can get a readout of the flags and what file_attr_to_st_mode() returns?
The last time I touched C code was 17 years ago at university
So it would not take much to brush up your C fu.
I guess all that might be needed is a few "print to log file (or stderr?)" lines somewhere in the function so that we can get a readout of the flags and what file_attr_to_st_mode() returns?
Yep, exactly. something like
error("this is my flag: %x", (int)flag);
(The error()
function is defined by Git and it is shorter to type than fprintf(stderr, ...)
, plus: error()
appends a newline itself.)
Then just call make -j5 install
(using up to 5 parallel processes) and after that you can call git ...
.
Alternatively, you can build without installing (make -j5
) and use in-place: /usr/src/git/git --exec-path=/usr/src/git ...
.
Thanks for the hints 👍 I've successfully re-built without installing and can get debug lines printed (here's the hacked version of mingw_lstat() I used: https://gist.github.com/corford/7a9c078fff8a8e130215f2d0568c185f ).
Now I'm not sure what git command to use so it will iterate over the files and dirs in a repo and print the attributes? I tried this /usr/src/git/git --exec-path=/usr/src/git ls-files /z/myrepo/
but it only printed debug lines for .git/HEAD and nothing else...
error: getting attribs of: Z:/myrepo/.git/HEAD
error: this is my flag: 8180
error: getting attribs of: Z:/myrepo/.git/commondir
error: getting attribs of: .git/commondir
error: getting attribs of: .git/commondir
.gitconfig
.gitignore
.vscode/settings.json
README.md
...
Ok got it. I used /usr/src/git/git --exec-path=/usr/src/git log myrepo/path/to/a/symlink
to get the print outs.
When issued against a repo on a local disk, the attribute for a symlink that points to another directory in the repo is: a180
and the attribute for a symlink that points to another file within the repo is: a180
When issued against the same repo (but this time located on a mapped UNC path), the attribute for a symlink that points to a directory is: 4180
and the attribute for a symlink that points to a file is: 8180
So it looks like either file_attr_to_st_mode() or fdata.dwFileAttributes is wrongly reporting symlinks as real files and directories when the symlink (and its target?) is located on a mapped UNC path.
When issued against a repo on a local disk, the attribute for a symlink that points to another directory in the repo is: a180 and the attribute for a symlink that points to another file within the repo is: a180
Well, 0xa180 = 0x8000 + 0x2000 + 0x100 + 0x80. Whereas 0x4180 = 0x4000 + 0x100 + 0x80, and 0x8180 = 0x8000 + 0x100 + 0x80.
So the difference is only in 0x8000, 0x4000 and 0x8000 + 0x2000.
That is funny. According to https://msdn.microsoft.com/en-us/library/windows/desktop/gg258117(v=vs.85).aspx, 0x8000 is a flag used only on ReFS, and 0x2000 means that the file is not to be indexed. The 0x4000 flag says it is encrypted...
However, the flag that is expected is 0x400 (FILE_ATTRIBUTE_REPARSE_POINT) which seems not to be present in your examples.
error("this is my flag: %x", (int)file_attr_to_st_mode(fdata.dwFileAttributes, findbuf.dwReserved0));
Ah, I see. Your output shows the already translated attributes. Maybe you could show the untranslated attributes, i.e. the raw fdata.dwFileAttributes
and findbuf.dwReserved0
.
Ok, here you go.
This is the output when run against the repo on a local disk (both paths are symlinks, first is pointing to a file and second is pointing to a dir):
error: getting attribs for: devops/provisioning/terraform/test/argent/common.tf
error: dwFileAttributes: 1056
error: dwReserved0: 2684354572
error: translated attribs: a180
error: getting attribs for: devops/provisioning/ansible/envs/infra/prod/group_vars/app_dbc0/00_vars
error: dwFileAttributes: 1040
error: dwReserved0: 2684354572
error: translated attribs: a180
And this is the output when run against the same symlinks (but this time the repo is on a UNC mapped drive rather than local disk):
error: getting attribs for: devops/provisioning/terraform/test/argent/common.tf
error: dwFileAttributes: 128
error: dwReserved0: 0
error: translated attribs: 8180
error: getting attribs for: devops/provisioning/ansible/envs/infra/prod/group_vars/app_dbc0/00_vars
error: dwFileAttributes: 16
error: dwReserved0: 0
error: translated attribs: 4180
Gist of updated hacked mingw.c I used to produce the above: https://gist.github.com/corford/5d67e808ec11e1fab44dd0796e12eb60 (not sure if %lu was the correct formatting choice for printing dwFileAttributes and dwReserved0?)
error: getting attribs for: devops/provisioning/terraform/test/argent/common.tf error: dwFileAttributes: 1056
As you can see for yourself by comparing the numbers to the bits listed in https://msdn.microsoft.com/en-us/library/windows/desktop/gg258117(v=vs.85).aspx, 1056 (=0x420) corresponds to FILE_ATTRIBUTE_REPARSE_POINT | FILE_ATTRIBUTE_ARCHIVE
.
error: getting attribs for: devops/provisioning/ansible/envs/infra/prod/group_vars/app_dbc0/00_vars error: dwFileAttributes: 1040
And 1040 (= 0x410) corresponds to FILE_ATTRIBUTE_REPARSE_POINT | FILE_ATTRIBUTE_DIRECTORY
.
So far, so good.
error: getting attribs for: devops/provisioning/terraform/test/argent/common.tf error: dwFileAttributes: 128 [...] error: getting attribs for: devops/provisioning/ansible/envs/infra/prod/group_vars/app_dbc0/00_vars error: dwFileAttributes: 16
And here the trouble starts: 128 (=0x80) is FILE_ATTRIBUTE_NORMAL
and 16 (=0x10) is FILE_ATTRIBUTE_DIRECTORY
.
That means that when Git asks the Win32 API to get the file attributes, it does not receive any information at all that the reported file/directory is, in fact, a symlink!
Obviously, there is nothing Git can do to second-guess those answers.
I could imagine that Samba mistakes your "follow symlinks = yes" to imply that you want symbolic links to be transparently resolved on the server side already.
But then, even studying Samba resources for over one hour (which I hope you did, too, as it is your desire to export symlinks via Samba to Windows clients) did not offer any insights how Samba wants to be configured to represent symbolic links as reparse points.
However, when i look at the source code of Samba with git grep
(or using GitHub's search functionality: https://github.com/samba-team/samba/search?utf8=%E2%9C%93&q=file_attribute_reparse_point&type=), I see no hint at any such support.
A web search turned up this link where somebody offered a patch to implement the support to represent symlinks as reparse points: http://samba.2283325.n4.nabble.com/PATCH-smbd-add-reparse-points-at-symlinks-td4695178.html. But nothing seems to have come of it.
Please note that I used open resources to figure all of this out, no magic required (not even house-elf magic).
Also, at this point I had to realize that this ticket is not about Git at all, but about Samba. Much as I like to help Open Source, there is a lot more that I have to help the Open Source project Git, so I cannot afford to spend more time on this ticket. But then, it is not even my responsibility: you're a grown-up, too :smile:
Thanks for spending some time on this (and yes, believe me I have spent countless hours researching it myself too).
My goal with the issue was to try and determine where the bug lay (i.e. with git for windows or somewhere else). Sadly I'm not sure we have conclusively figured this out.
As mentioned before, GfW exhibits the same issue with NFS shares i.e. it is nothing to do with samba.
Also, in both cases (samba share or NFS share), Windows itself has no issue understanding the symlinks. If I navigate to the repo (via a UNC path) using Windows File Explorer, open a symlink, edit the contents and save it, the target of the symlink is correctly updated. This simply does not happen with G4W (since it treats the symlink as a real file and so any changes made do not propagate to the target of the symlink).
It's very hard to figure out where the problem lies. If it was the Win32 API, one would imagine Windows itself would also have the same problem with symlinks on UNC paths as GfW does. Yet this is not the case.
I will continue looking for an answer and if I find anything relevant will come back to this ticket and share it. Your help with debugging mingw.c and eliminating that as the cause has been much appreciated :)
Ok, borrowing some inspiration from https://blogs.windows.com/buildingapps/2016/12/02/symlinks-windows-10/ (and factoring in what has been confirmed with the mingw.c testing) it seems the issue is definitely not with GfW.
Opening an unprivileged windows command prompt and issuing dir
in a directory of the repo containing a symlink correctly identifies it as such when the repo is on a local drive.
Doing the same (but this time in the repo located on a UNC path) wrongly shows the symlink as a normal file.
So the attribs returned by the Win32 API are consistent with what we saw with GfW.
Despite this, opening and editing a symlink on a UNC path (e.g. with Notepad) correctly propagates changes to the target. Clearly Windows isn't doing this (as it thinks the symlink is a real file) so I'm back to looking at samba and NFS. If I've understood http://samba.2283325.n4.nabble.com/PATCH-smbd-add-reparse-points-at-symlinks-td4695178.html it seems samba (and I guess NFS) currently present symlinks to Windows as normal files/dirs and then handle change propagation to the targets themselves. Shame that patch never got any take up from the Samba team :(
Closing this issue. Thanks for the help (and patience!)
yes, believe me I have spent countless hours researching it myself too
Good, thanks for correcting my wrong assessment.
If I've understood http://samba.2283325.n4.nabble.com/PATCH-smbd-add-reparse-points-at-symlinks-td4695178.html it seems samba (and I guess NFS) currently present symlinks to Windows as normal files/dirs and then handle change propagation to the targets themselves.
Have you seen any information beyond that patch? I only did a quick web search (on the hunch that "reparse point symlink samba" would make for good search terms in this case)...
Shame that patch never got any take up from the Samba team :(
It looked to me as if they just lost track of it, probably due to lack of time on the contributor's side. It would seem as if Jeremy's hints how to improve the patch so it can be included are the last thing in the thread.
I could imagine that it would help Samba enormously if an enthusiastic developer with the need to see this fixed (such as yourself :smile:) would reply to Jeremy's mail, asking for assistance to improve that patch.
It can be quite intimidating to work on an unfamiliar code base, on a patch that is not your own, and my experience shows that typically the active maintainers/developers of Open Source projects are quite understanding and patient. If I were you, I'd give it a try. Who knows, maybe you can get what you want (symlinks via Samba) and get to know a few Samba experts in an exciting adventure?
Have you seen any information beyond that patch?
I've done a google deep dive but haven't turned anything up beyond that mailing list discussion. It seems the issue is too esoteric to attract much interest (although I suspect there are many people out there that would benefit from real symlink support being added to Samba).
I think you're right that Jeremy lost track of it despite being enthusiastic. Samba is no different from most OS projects. They need to be ruthless on where they place their priorities and I don't blame him at all for not focusing on this.
I could imagine that it would help Samba enormously if an enthusiastic developer with the need to see this fixed
I'd love to have a crack at getting a patch accepted but the sad truth is my C skills are simply not up to the task and I haven't got enough spare time (at the moment at least) to remedy this. I would feel guilty posting to the samba list asking for a feature and not being able to meaningfully contribute to make it happen.
However, if I find enough time to improve my C fu or are am able to sponsor the dev work, I'll definitely pick this up again and see if I can get a patch accepted!
Setup
Windows is running with Developer mode enabled, user account is in the Administrator group and UAC is disabled.
User account's global .gitconfig (C:\Users\username\.gitconfig) contains 'symlinks = true'
Details
Issue manifests regardless of terminal/shell used (I see it using bash.exe directly, using the git integration of vscode and using the git integration of Webstorm)
Symlinks located on UNC paths should be treated in the same way as symlinks on local media
When working on a repository located on a UNC path (e.g. samba network share), GfW treats symlinks as copies of their target, essentially converting them in to real files.
This is the smb.conf file used on the linux guest (a Vagarant Ubuntu xenial box with private network enabled on eth1):
Samba version:
Create samba user by running:
Why?
Using samba to export the repo from a guest vagrant machine is a nice way to ensure 'hot reload' works when working on frontend code (since virtualbox shared folders don't support mtime). It also makes it easy to keep file permissions correct since you can set filemode=false on the Windows host and then use git on the vagrant guest whenever you need to tweak a file's perms (e,g. make it executable). Vagrant shared folders don't let you change permissions on files or dirs after the folder is mounted in the guest.