techtonik / python-patch

Library to parse and apply unified diffs
https://pypi.python.org/pypi/patch
106 stars 63 forks source link

Automatic GIT patch detection fails if filename contains non-word-constituent characters #32

Closed mspncp closed 8 years ago

mspncp commented 8 years ago

Hi,

The attached example.zip contains an example where the patch type detection fails to detect a git patch because the filename contains a dash. See README.txt in the example directory for more details.

Note that I added three diagnostic lines of code to the path.py module in the example directory.

I suggest that the regular expression r'diff --git a/[\w/.]+ b/[\w/.]+' should be extended accordingly.

example.zip

P.S: Thanks for sharing your code!

techtonik commented 8 years ago

That's definitely a bug. Thanks for report. Now I wonder what other possible filename symbols does this regex miss?

mspncp commented 8 years ago

IMHO, the file path does not need to be matched when detecting the patch type, since it is not relevant there. (It is taken from the --- and +++ line later.) Even more, I don't see why you have to make such a fine-grained distinction between the different VCS (GIT vs. HG vs. SVN ...) at all. The only information which you are using later is whether it is a git-like patch or not. It doesn't matter which tool created the patch.

I would remove the VCS type enumeration entirely and replace the 'type' by a boolean 'git_like' flag. Then simply check for the presence of the string 'diff --git' in the header (and optionally for the 'a/' and 'b/' prefixes).

techtonik commented 8 years ago

Yep. I see it now. The fine-grained patch detection was needed for review system (Rietveld) to know which protocol to use to extract parent revision. I didn't integrate it into Rietveld, but this can be a useful functionality for a library.

techtonik commented 8 years ago

Just released new version. Report any issues. =)

mspncp commented 8 years ago

Thanks again. And keep up the good work...