jedbrown / git-fat

Simple way to handle fat files without committing them to git, supports synchronization using rsync
BSD 2-Clause "Simplified" License

git fat pull hangs on linux #46

Open duanem opened 10 years ago

duanem commented 10 years ago

I'm using git-fat on linux with git version 1.8.5.5 and python version 2.6.6.

git fat pull hangs.

My preliminary debugging indicates that filter_gitfat_candidates never completes.

All three data processing threads start (cut_sha1hash, filter_gitfat_candidates, and metadata processing). cut_sha1hash ends, but filter_gitfat_candidates never completes, hanging the program.

I am using the latest git-fat code (is git-fat versioned yet?).

I wanted to report this as soon as possible. I will continue debugging.

Hints to a possible problem would be great.

jedbrown commented 10 years ago

Can you give instructions to reproduce?

duanem commented 10 years ago

> Can you give instructions to reproduce?

Sure.

  • Set up git-fat with some binary files
  • git fat pull
  • Hang

I have a large repository. git rev-list returns 363110 objects.

I'm not sure if the size is the limitation or not.

The same code does not hang on Mac OS, git 1.8.5.2, python 2.7.5.

jedbrown commented 10 years ago

Does ./test.sh run for you? What protocol are you using? Is the remote host reachable?

duanem commented 10 years ago

> Does ./test.sh run for you?

Nope. It hangs in exactly the same spot. I'm using git fat pull; the test hangs at git fat push.

> What protocol are you using?
> Is the remote host reachable?

It has nothing to do with the remote host. The hang is in referenced_objects, before the host is contacted. The pipeline there is git rev-list --objects | cut_sha1hash | git cat-file --batch-check | filter_gitfat_candidates | git cat-file --batch. cut_sha1hash finishes, but filter_gitfat_candidates does not.

duanem commented 10 years ago

A ps helps to identify more state:

duane     3862 27132  0 12:33 pts/0    00:00:00 /bin/bash -ex ./test.sh
duane     3948  3862  0 12:33 pts/0    00:00:00 git fat push
duane     3949  3948  0 12:33 pts/0    00:00:00 python /home/duane/bin/git-fat push
duane     3953  3949  0 12:33 pts/0    00:00:00 [git] <defunct>
duane     3954  3949  0 12:33 pts/0    00:00:00 git cat-file --batch-check
duane     3955  3949  0 12:33 pts/0    00:00:00 git cat-file --batch

[git] <defunct> is the leftover git rev-list command, which has completed. git cat-file --batch-check is still running even though cut_sha1hash has closed its output to that process.

Why would close not be enough? Hmm...

duanem commented 10 years ago

I upgraded python to 2.7.8. This seems to have fixed the problem.

I'd sure like to know how to work around this problem in python 2.6.

I guess I'll close this issue?

jedbrown commented 10 years ago

The problem is http://bugs.python.org/issue12786 (fixed in 2.7 and later), but the most convenient fixes have Windows portability problems. I'll fix it.
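
A minimal sketch of the mechanism that usually sits behind this kind of hang, assuming the cause is a stray copy of a pipe's write end held by another process (which would match the ps output above, where git cat-file --batch-check keeps running after its input was closed). The reader of a pipe only sees end-of-file once every copy of the write end has been closed. Here os.dup stands in for the copy that a sibling subprocess would inherit; the function name is made up for the illustration, and os.set_blocking needs Python 3.5+ on a POSIX system.

```python
import os


def eof_needs_every_writer_closed():
    """Show that closing one copy of a pipe's write end does not signal EOF."""
    read_fd, write_fd = os.pipe()
    # Stand-in for the copy of the write end that a sibling child inherits.
    extra_fd = os.dup(write_fd)

    os.write(write_fd, b"data")
    os.close(write_fd)  # the "obvious" close performed by the parent

    # Non-blocking reads let us probe the pipe without hanging the demo.
    os.set_blocking(read_fd, False)
    first = os.read(read_fd, 4096)

    try:
        # An empty result here would mean EOF was delivered.
        saw_eof_early = os.read(read_fd, 4096) == b""
    except BlockingIOError:
        # No data and no EOF: a blocking reader would wait here forever.
        saw_eof_early = False

    os.close(extra_fd)  # only now is the last writer gone
    saw_eof_late = os.read(read_fd, 4096) == b""
    os.close(read_fd)
    return first, saw_eof_early, saw_eof_late
```

In the hanging case the reader is a blocking one, so instead of BlockingIOError it simply never returns.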

duanem commented 10 years ago

Thank you for finding this problem. Your Google-fu is better than mine.

I look forward to seeing the fix. I did the obvious thing and added close_fds=True to all of the subprocesses involved in the threads. That fixes the problem on Linux, but as the bug indicates, there is a problem on Windows. Occasionally, Windows will now hang during this process [sigh].
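
The workaround described here might look roughly like the following sketch. The child command is a stand-in for the git processes that git-fat starts (an assumption, to keep the example self-contained), and close_then_wait is a hypothetical name. With close_fds=True, the second child does not inherit the write end of the first child's stdin pipe, so closing that pipe in the parent delivers end-of-file and wait() returns.

```python
import subprocess
import sys

# Stand-in for a git child process: read stdin until EOF, then exit.
READ_UNTIL_EOF = [sys.executable, "-c", "import sys; sys.stdin.read()"]


def close_then_wait():
    """Start two children with piped stdin, then close and wait on each in turn."""
    # close_fds=True stops each child from inheriting unrelated descriptors,
    # including the write end of its sibling's stdin pipe.
    first = subprocess.Popen(READ_UNTIL_EOF, stdin=subprocess.PIPE, close_fds=True)
    second = subprocess.Popen(READ_UNTIL_EOF, stdin=subprocess.PIPE, close_fds=True)

    first.stdin.close()
    # If `second` still held a copy of this pipe, this wait would hang.
    first_status = first.wait()

    second.stdin.close()
    second_status = second.wait()
    return first_status, second_status
```

Python 3.2 and later already default to close_fds=True on POSIX, so the explicit argument mainly matters for older interpreters such as the 2.6 discussed in this thread.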

It seems like the threading is not really providing any benefit, as Python is inherently single-threaded. I'm considering a rewrite of that section using generators.

jedbrown commented 10 years ago

issue12786 is really a catastrophe for portability. I'm surprised there is no documented best practice. The threads are not for performance, but rather to construct a pipeline containing some non-Python parts and some Python parts. Deadlock is caused by subprocess, not threading.
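
To illustrate the point about threads, here is a rough sketch of a pipeline with a Python stage running in a thread between two external processes. It is not the git-fat code: the producer and consumer commands, the "keep" prefix, and the function names are all invented for the example. The thread lets the Python stage keep copying data while the main thread drains the final output, so neither side stalls on a full pipe buffer.

```python
import subprocess
import sys
import threading

# Hypothetical external stages; in git-fat these would be git commands.
PRODUCER = [sys.executable, "-c", "print('keep a'); print('drop b'); print('keep c')"]
CONSUMER = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]


def filter_stage(source, sink):
    """Python stage: forward only the wanted lines, then signal EOF downstream."""
    for line in source:
        if line.startswith(b"keep"):
            sink.write(line)
    sink.close()  # without this, the consumer would wait for more input


def run_pipeline():
    producer = subprocess.Popen(PRODUCER, stdout=subprocess.PIPE, close_fds=True)
    consumer = subprocess.Popen(
        CONSUMER, stdin=subprocess.PIPE, stdout=subprocess.PIPE, close_fds=True
    )

    # The Python stage runs in its own thread so the main thread is free
    # to read the consumer's output at the same time.
    worker = threading.Thread(
        target=filter_stage, args=(producer.stdout, consumer.stdin)
    )
    worker.start()

    output = consumer.stdout.read()
    worker.join()
    producer.wait()
    consumer.wait()
    return output.decode()
```

A generator-based rewrite could replace the thread only if the Python stage and the final read never need to make progress at the same time, which is hard to guarantee once the data exceeds the pipe buffer size.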

fishuyo commented 8 years ago

I'm having what appears to be a similar issue.

git fat pull is hanging on OS X 10.10.5 with the system Python, version 2.7.10.

However, it works fine if I install Python using brew, which right now is also version 2.7.10.