Open kevinhendricks opened 4 years ago
Sigil (a crossplatform GPL opensource C++ ebook-editing project, embeds a Python 3.7.2 or later interpreter) and now uses dulwich to provide a simple "checkpoint" and revert functionality for epub developers.
Thank you for dulwich!
That said, there are many places that the porcelain dulwich code seems to be missing pieces (even for our limited functionality needs).
BTW: As I remember, the original git code used a python (2.X) based "merge" routine that might be easily adapted for use in dulwich at least as a starting point to add merge.
(In git 1.0 to 1.3 timeframe see gitMergeCommon.py and git-merge-recursive.py).
We had to derive the following code pieces in order to get basic checkpoint revisions working (things that do not depend on a "merge" capability).
So are you accepting external user contributed routines that temporarily provide or extend porcelain functionality until you get a chance to get proper support in place?
Just something to clean the working directory in a half-intelligent manner would help as well as basic "git checkout" support to checkout branches and tags (not a pull so no "merge" needs to be done") would help many I would guess. As would a proper fix for this original "issue".
See: https://github.com/Sigil-Ebook/Sigil/blob/master/src/Resource_Files/python3lib/repomanager.py
for examples of how we are using dulwich.
Side Note: dulwich server.py uses argv=sys.argv in its function argument lists but when used with embedded python less that Python 3.8 on Linux, sys does not have a argv attribute which causes an error. This has been fixed with Python 3.8 but afaik, no fixes exist for Python 3.7, 3.6, 3.5.
Please ignore all of this is not something you are looking for. I just wanted to offer some code snippets to help other users of dulwich.
Thanks again for providing dulwich. It is much needed and appreciated.
Just let us know if we can help in anyway.
Hi Kevin,
That's a lot of different topics, most of which are unrelated to this bug. Would you mind if we moved it to the mailing list (https://groups.google.com/forum/#!forum/dulwich-discuss) ? In general, I prefer issues for each individual bug/feature request, so that we can track and work on them independently.
The porcelain module is indeed still in its infancy; any help to improve it would be much appreciated. In general, I would prefer new functionality in porcelain to mirror what is available in the C Git porcelain - but if for a good reason we can't do that (e.g. merge support is too far away) I have no objections to diverging from that.
To answer the specific issues you raise:
Since Dulwich has a more liberal license (Apachev2+GPL2) than C Git (GPLv2 only), we can't just import code from C Git. I've been trying to reach out to the other copyright holders of the merge3 Python module (https://github.com/breezy-team/merge3) to see if we can get it relicensed to something that is Apache2 compatible. (#452)
routine to clean out the working directory but leave .git and desired pieces behind Is this the equivalent of "git clean -f -x -d" ? There is a TODO about this in the porcelain.py file; the existing clean() method handles the -f and -d bits, but not the -x but that should be fairly straightforward to add.
routine to do a clean checkout of a tag into the working directory I think this can be done with a combination of setting the branch, Repo.reset_index() and clean().
code to properly work around the lack of a git add -A option A pull request to add the equivalent of an -A option to dulwich.porcelain.add() would be awesome.
4> add a diffstat.py module to replace the external binary diffstat call currently used for log summaries (we support macOS, Windows, and Linux) A pull request to add a dulwich.diffstat module (under Apachev2+GPLv2) would be great.
- generating a tag list when used with tag objects showing tag object info etc. Doesn't dulwich.porcelain.tag_list() already do this?
Just something to clean the working directory in a half-intelligent manner would help as well as basic "git checkout" support to checkout branches and tags (not a pull so no "merge" needs to be done") would help many I would guess. As would a proper fix for this original "issue". A combination of dulwich.porcelain.clean() and Repo.reset_index() should do this,
Let me know if I can help by e.g. providing pointers or in other ways.
Jelmer
Please move this wherever it seems most appropriate to you.
The dual license is going to be a bit of an issue. With Sigil being GPL it allows us to use other GPL software with little thought.
I believe the hg patch code had a BSD implementation of diffstat in python. If I am remembering correctly, I will take a peek to see.
When I get a free moment I will also take a stab at adding unstaged to your porcelain add and generate a PR. Any code I author you can use under whatever license you want.
I will also take a shot at a checkout tag routine since there is no merge required for that.
I am not up to speed on dulwich yet but based on your porcelain.status() I was hoping something like the following might be useful for this issue
--- porcelain.py~ 2020-04-02 15:21:02.000000000 -0400
+++ porcelain.py 2020-04-02 15:30:19.000000000 -0400
@@ -404,6 +404,11 @@
if not paths:
paths = list(
get_untracked_paths(os.getcwd(), r.path, r.open_index()))
+ # extend with modified but unstaged files (git add -A)
+ normalizer = r.get_blob_normalizer()
+ filter_callback = normalizer.checking_normalize
+ paths.extend(list(
+ get_unstaged_changes(r.open_index(), r.path, filter_callback)))
relpaths = []
if not isinstance(paths, list):
paths = [paths]
On Thu, 2020-04-02 at 12:38 -0700, Kevin Hendricks wrote:
I am not up to speed on dulwich yet but based on your porcelain.status() I was hoping something like the following might be useful for this issue
--- porcelain.py~ 2020-04-02 15:21:02.000000000 -0400 +++ porcelain.py 2020-04-02 15:30:19.000000000 -0400 @@ -404,6 +404,11 @@ if not paths: paths = list( get_untracked_paths(os.getcwd(), r.path, r.open_index()))
extend with modified but unstaged files (git add -A)
- normalizer = r.get_blob_normalizer()
- filter_callback = normalizer.checking_normalize
- paths.extend(list(
- get_unstaged_changes(r.open_index(), r.path, filter_callback))) relpaths = [] if not isinstance(paths, list): paths = [paths] That's the right kind of approach for getting the modified paths.
After playing with "git add -A" a little bit locally and comparing it with the behaviour of "git add ." I'm a bit puzzled as to what the actual difference is - git add without -A also seems to stage modified changes in the working tree. What am I missing - is -A just a no-op, and should we be adding this behaviour for all code paths in dulwich.porcelain.add?
Jelmer
git add -A with no list of paths fully syncs the repo to the current working directory including handling removals that were not staged by first using git rm. Basically it stages all of the changes and makes the repo look exactly like the current working directory
Yes, I think a "git add ." should stage the unstaged but modified files as well as the untracked files, but I rarely use that approach unless initializing a repo at the start from an existing directory.
This is what I found from the git docs ...
-A
--all
--no-ignore-removal
Update the index not only where the working tree has a file matching <pathspec> but also where the index already has an entry. This adds, modifies, and removes index entries to match the working tree.
If no <pathspec> is given when -A option is used, all files in the entire working tree are updated (old versions of Git used to limit the update to the current directory and its subdirectories).
FWIW, I think just removing the (git add -A) from the end of the comment line, to prevent confusion should work. I was basing this bug report on your docs which said all "modified" files. It did not say all "deleted" files, so I handle deletions first then use your add to force the repo to match what is currently in the working dir but found I had to manually add the unstaged files in a path list to get it to work.
The difference between git add .
and git add -A
I found in stackoverflow:
If you are located directly at the working directory, then git add -A and git add . work without the difference. If you are in any subdirectory of the working directory, git add -A will add all files from the entire working directory, and git add . will add files from your current
Then I think procelain.add()
without paths
should work as git add -A
See a related issue https://github.com/dulwich/dulwich/issues/895
porcelain.add does not work as documented: "path - Paths to add. No value passed stages all modified files"
(emphasis is mine)
From looking at the porcelain.py code, If paths is none it only adds untracked changes and not unstaged changes.
So procelain.add(repo='.', paths=None) does not work like:
git add -A
FWIW, should be an easy fix to add unstaged to it.