comet-ml / issue-tracking

Questions, Help, and Issues for Comet ML
https://www.comet.ml

Git patch contains the whole repository because of line endings on Windows #133

Closed 7kbird closed 11 months ago

7kbird commented 6 years ago

The uploaded (large) patch contains every file in my repository just because of line endings. The patch looks like this for every file:

diff --git a/.gitignore b/.gitignore
index 1c1641c..932ce34 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,5 +1,5 @@
-*.pyc
-*.csv
-.idea/
-data/
-*.hdf5
+*.pyc
+*.csv
+.idea/
+data/
+*.hdf5

Every line is identical except for its line ending. On Windows, when the core.autocrlf option is true (the default), Git checks files out with Windows-style line endings (CRLF) and commits them with Unix-style line endings (LF). comet_ml does not handle this and treats every line as changed, which creates a large patch for every experiment.
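To illustrate the effect described above, here is a minimal sketch that uses only the Python standard library. It is not comet_ml or dulwich code, and the helper names `normalize_eol` and `changed_lines` are hypothetical. It compares LF content (as stored in a commit) with a CRLF copy of the same file (as checked out on Windows): a plain diff reports every line as changed, while normalizing CRLF to LF first reports no changes.

```python
import difflib


def normalize_eol(data: bytes) -> bytes:
    """Convert CRLF line endings to LF; lines already ending in LF are unchanged."""
    return data.replace(b"\r\n", b"\n")


def changed_lines(old: bytes, new: bytes) -> int:
    """Count the added and removed lines in a unified diff of two file contents."""
    old_lines = old.decode("utf-8").splitlines(keepends=True)
    new_lines = new.decode("utf-8").splitlines(keepends=True)
    diff = list(difflib.unified_diff(old_lines, new_lines, "a/.gitignore", "b/.gitignore"))
    body = diff[2:]  # skip the '---' and '+++' file header lines
    return sum(1 for line in body if line.startswith(("+", "-")))


committed = b"*.pyc\n*.csv\n.idea/\ndata/\n*.hdf5\n"  # LF, as stored in the commit
working = committed.replace(b"\n", b"\r\n")            # CRLF, as checked out on Windows

print(changed_lines(committed, working))                 # 10: five removed, five added
print(changed_lines(committed, normalize_eol(working)))  # 0: nothing really changed
```

Normalizing before comparing is only one possible approach; Git itself applies its own conversion rules based on core.autocrlf and .gitattributes.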

I found that _get_unstaged_changes() in comet_ml/__init__.py calls dulwich.index.get_unstaged_changes(), which treats every file as 'unstaged'. I guess it's a dulwich problem like this.

By the way, while reviewing the code, I found that the log_git_patch parameter of Experiment is never passed to its superclass, so it may never take effect. It can be easily fixed.
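The kind of bug described here can be sketched as follows. This is not comet_ml's actual code: the classes `BaseExperiment`, `BuggyExperiment`, and `FixedExperiment` and their signatures are hypothetical, and only the parameter name `log_git_patch` comes from the report. The sketch shows how a keyword argument that a subclass accepts but never forwards is silently replaced by the base class default.

```python
class BaseExperiment:
    """Hypothetical base class that stores the option."""

    def __init__(self, api_key=None, log_git_patch=True):
        self.api_key = api_key
        self.log_git_patch = log_git_patch


class BuggyExperiment(BaseExperiment):
    def __init__(self, api_key=None, log_git_patch=True):
        # Bug: log_git_patch is accepted but not forwarded,
        # so the base class default (True) always wins.
        super().__init__(api_key=api_key)


class FixedExperiment(BaseExperiment):
    def __init__(self, api_key=None, log_git_patch=True):
        # Fix: forward the argument to the base class.
        super().__init__(api_key=api_key, log_git_patch=log_git_patch)


print(BuggyExperiment(log_git_patch=False).log_git_patch)  # True (option ignored)
print(FixedExperiment(log_git_patch=False).log_git_patch)  # False
```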

Nimrod007 commented 6 years ago

Hey @7kbird, thanks for the detailed explanation. We are looking into this. It might take some time since it's not a trivial or quick fix. We will update you once a beta is available so you can start using it.

Nimrod007 commented 5 years ago

Hi @7kbird, do you have any .gitattributes files in the repo?

7kbird commented 5 years ago

@Nimrod007 There's no .gitattributes in my repo. I guess you are asking about the text attribute, but I haven't changed any git attributes in my repo. It's the global core.autocrlf option that changes the line endings.

Here's my git config --list output

core.symlinks=false
core.autocrlf=true
core.fscache=true
color.diff=auto
color.status=auto
color.branch=auto
color.interactive=true
help.format=html
http.sslcainfo=E:/c/Program Files/Git/mingw64/ssl/certs/ca-bundle.crt
diff.astextplain.textconv=astextplain
rebase.autosquash=true
credential.helper=manager
user.name=xx
user.email=xx@xx.com
difftool.sourcetree.cmd='D:/program/beyondcompare4/BComp.exe' "$LOCAL" "$REMOTE"
sendpack.sideband=false
pack.packsizelimit=2047m
pack.windowmemory=50m
pack.deltacachesize=2047m
core.compression=9
core.packedgitlimit=512m
core.packedgitwindowsize=512m
credential.helper=manager

and my git config --list --show-origin output

file:"C:\\ProgramData/Git/config"       core.symlinks=false
file:"C:\\ProgramData/Git/config"       core.autocrlf=true
file:"C:\\ProgramData/Git/config"       core.fscache=true
file:"C:\\ProgramData/Git/config"       color.diff=auto
file:"C:\\ProgramData/Git/config"       color.status=auto
file:"C:\\ProgramData/Git/config"       color.branch=auto
file:"C:\\ProgramData/Git/config"       color.interactive=true
file:"C:\\ProgramData/Git/config"       help.format=html
file:"C:\\ProgramData/Git/config"       http.sslcainfo=E:/c/Program Files/Git/mingw64/ssl/certs/ca-bundle.crt
file:"C:\\ProgramData/Git/config"       diff.astextplain.textconv=astextplain
file:"C:\\ProgramData/Git/config"       rebase.autosquash=true
file:"E:\\c\\Program Files\\Git\\mingw64/etc/gitconfig" credential.helper=manager
file:C:/Users/7kbird/.gitconfig user.name=xx
file:C:/Users/7kbird/.gitconfig user.email=xx@xx.com
file:C:/Users/7kbird/.gitconfig difftool.sourcetree.cmd='D:/program/beyondcompare4/BComp.exe' "$LOCAL" "$REMOTE"
file:C:/Users/7kbird/.gitconfig sendpack.sideband=false
file:C:/Users/7kbird/.gitconfig pack.packsizelimit=2047m
file:C:/Users/7kbird/.gitconfig pack.windowmemory=50m
file:C:/Users/7kbird/.gitconfig pack.deltacachesize=2047m
file:C:/Users/7kbird/.gitconfig core.compression=9
file:C:/Users/7kbird/.gitconfig core.packedgitlimit=512m
file:C:/Users/7kbird/.gitconfig core.packedgitwindowsize=512m
file:C:/Users/7kbird/.gitconfig credential.helper=manager

So you can see that the core.autocrlf=true option is set in the global config, which is the default setting when installing Git on Windows.
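Finding which file sets an option can also be done programmatically. The following is a minimal sketch that parses text in the `git config --list --show-origin` format shown above. The function name `find_setting` and the regular expression are assumptions made for illustration. Git separates the origin from the `key=value` pair with a tab, while the pasted output shows spaces, so the pattern accepts any whitespace. Keys with unusual subsection names are not handled.

```python
import re
from typing import List, Tuple

# Origin, then whitespace, then key=value. The origin may itself contain spaces.
_LINE = re.compile(r"^(?P<origin>\S.*?)\s+(?P<key>[A-Za-z0-9.\-]+)=(?P<value>.*)$")


def find_setting(show_origin_output: str, key: str) -> List[Tuple[str, str]]:
    """Return (origin, value) pairs for every config file that sets `key`."""
    hits = []
    for line in show_origin_output.splitlines():
        match = _LINE.match(line.strip())
        if match and match.group("key").lower() == key.lower():
            hits.append((match.group("origin"), match.group("value")))
    return hits


sample = "\n".join([
    r'file:"C:\\ProgramData/Git/config"       core.symlinks=false',
    r'file:"C:\\ProgramData/Git/config"       core.autocrlf=true',
    r'file:C:/Users/7kbird/.gitconfig user.name=xx',
])

for origin, value in find_setting(sample, "core.autocrlf"):
    print(origin, "->", value)
```

For a single key, running `git config --show-origin --get core.autocrlf` on the command line gives the same information without any parsing.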

Nimrod007 commented 5 years ago

Thanks for the help. We are working on a fix for this.

Lothiraldan commented 5 years ago

@7kbird We released a new version of the SDK (https://pypi.org/project/comet-ml/1.0.46/) that includes a fix for your bug. Please let us know if there is any issue with the new release

Lothiraldan commented 5 years ago

@7kbird Did you have the opportunity to test the new version of the SDK and see if the fix works in your setup?

7kbird commented 5 years ago

Sorry for the late response.

I've upgraded comet-ml to 1.0.50 but the issue is still the same. I dug into the code and comet_ml.git_logging.get_git_patch() still returns a large patch that just turns every \n into \r\n.
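One way to confirm that a patch consists only of line-ending changes is to compare its removed and added lines after stripping trailing carriage returns. The sketch below is a heuristic and is not part of comet_ml. The function name `is_eol_only_patch` is hypothetical, and it assumes the patch is a git-style unified diff passed in as bytes.

```python
def is_eol_only_patch(patch: bytes) -> bool:
    """Return True if the removed and added lines match once trailing CRs are ignored."""
    removed, added = [], []
    in_hunk = False
    for line in patch.split(b"\n"):
        if line.startswith(b"diff --git"):
            in_hunk = False  # the '---' and '+++' file headers follow; skip them
        elif line.startswith(b"@@"):
            in_hunk = True   # hunk body starts after this header
        elif in_hunk and line.startswith(b"-"):
            removed.append(line[1:].rstrip(b"\r"))
        elif in_hunk and line.startswith(b"+"):
            added.append(line[1:].rstrip(b"\r"))
    # An empty patch is not considered a line-ending-only patch.
    return bool(removed) and removed == added


patch = (
    b"diff --git a/.gitignore b/.gitignore\n"
    b"--- a/.gitignore\n"
    b"+++ b/.gitignore\n"
    b"@@ -1,2 +1,2 @@\n"
    b"-*.pyc\n"
    b"-*.csv\n"
    b"+*.pyc\r\n"
    b"+*.csv\r\n"
)
print(is_eol_only_patch(patch))  # True
```

If this returns True for the patch produced in an experiment, the patch carries no real content changes.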

Lothiraldan commented 5 years ago

@7kbird Do you have dulwich installed in your environment? Did you install it yourself, or is it a dependency of your application?

Could you give us the version of comet-git-pure and try updating comet-git-pure while uninstalling dulwich?
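The installed versions can be read with the standard library's `importlib.metadata` module (Python 3.8 or later). This is a minimal sketch; the helper name `installed_versions` is hypothetical, and the package names are the ones mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if it is missing."""
    result: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


for name, ver in installed_versions(["comet-ml", "comet-git-pure", "dulwich"]).items():
    print(name, ver if ver is not None else "not installed")
```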

7kbird commented 5 years ago

I uninstalled dulwich and reinstalled comet-git-pure with pip install -I -U comet-git-pure. It says Successfully installed certifi-2019.3.9 comet-git-pure-0.19.11 urllib3-1.24.1, and dulwich appeared in my site-packages.

But after reinstalling comet-git-pure, get_git_patch(repo) still gives a large patch with the same problem.

Lothiraldan commented 5 years ago

@7kbird That's normal: comet-git-pure installs dulwich in your site-packages. I will continue investigating Git's behavior to see how it differs from Dulwich's, and keep you posted.

Lothiraldan commented 5 years ago

@7kbird Were the files in your repository committed with windows style line-ending (CRLF) or unix style line-ending (LF)?

Also, are you creating new commits on Windows, or just checking out the repository to launch something?

7kbird commented 5 years ago

> Were the files in your repository committed with windows style line-ending (CRLF) or unix style line-ending (LF)?

In the issue above, I wrote:

> On Windows, when the core.autocrlf option is true (the default), Git checks files out with Windows-style line endings (CRLF) and commits them with Unix-style line endings (LF).

So I write with Windows-style line endings and commit with Unix-style ones.

> Also, are you creating new commits on Windows, or just checking out the repository to launch something?

I make commits, but a simple checkout still produces the same problem. Here's an example:

The patch file: patch.zip

Env:

github-actions[bot] commented 11 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 11 months ago

This issue was closed because it has been stalled for 5 days with no activity.