davbeek / gitinspectorgui


Use machine readable output format in git communication #53

Open Alberth289346 opened 5 hours ago

Alberth289346 commented 5 hours ago

Git has machine readable output forms, but they don't seem to be used.

eg in repo_reader.py I read (around line 313):

            if "=>" not in line:
                # if no renames or copies have been found, the line represents the file
                # name fstr
                fstr = line
            elif "{" in line:
                # If { is in the line, it is part of a {...} abbreviation in a rename or
                # copy expression. This means that the file has been renamed or copied
                # to a new name. Set fstr to this new name.
                #
                # To find the new name, the {...} abbreviation part of the line needs to
                # be eliminated. Examples of such lines are:
                #
                # 1. gitinspector/{gitinspect_gui.py => gitinspector_gui.py}
                # 2. src/gigui/{ => gi}/gitinspector.py

                prefix, rest = line.split("{")

                # _ is old_part
                _, rest = rest.split(" => ")

                new_part, suffix = rest.split("}")

This is fragile. A filename like x=>y is quite trivially produced (touch x=\>y), and filenames with curly braces and spaces can exist too.

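Instead of relying on a user to produce non-conflicting filenames, use the machine readable output forms: -p or --porcelain for git blame (there is also --line-porcelain), and for git log options such as -z combined with --name-status or --numstat, which print the old and new names as separate NUL-terminated fields. Note that for git log, -p means --patch, not porcelain.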
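As a rough sketch of what reading such output could look like: the function below parses NUL-separated name-status records, as I understand `git diff --name-status -z -M` (and the matching part of `git log`) to print them, i.e. `status NUL path NUL` for ordinary changes and `status NUL old NUL new NUL` for renames and copies. The function name `parse_name_status_z` and the sample bytes are made up for illustration, and commit header lines from `git log` are not handled here.

```python
def parse_name_status_z(data: bytes):
    """Parse NUL-separated name-status records.

    Returns a list of (status, old_path, new_path) tuples. For entries that
    are not renames or copies, old_path and new_path are the same.
    Assumes the record layout described above; no commit headers.
    """
    fields = data.split(b"\0")
    # The output ends with a NUL, which leaves one empty trailing field.
    if fields and fields[-1] == b"":
        fields.pop()

    def decode(raw: bytes) -> str:
        # File names are bytes; surrogateescape keeps non-UTF-8 names intact.
        return raw.decode("utf-8", "surrogateescape")

    records = []
    i = 0
    while i < len(fields):
        status = fields[i].decode("ascii")
        if status[:1] in ("R", "C"):
            # Rename or copy: two path fields follow the status.
            old, new = fields[i + 1], fields[i + 2]
            i += 3
        else:
            # Any other status: a single path field follows.
            old = new = fields[i + 1]
            i += 2
        records.append((status, decode(old), decode(new)))
    return records


# Hand-written sample, including the awkward names mentioned above.
sample = b"M\0README.md\0R100\0x=>y\0a {b => c}.txt\0"
for record in parse_name_status_z(sample):
    print(record)
```

Since the names arrive as separate fields, a file called `x=>y` or `a {b => c}.txt` needs no special treatment.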


For the above code fragment, a regular expression is generally the better tool, since it matches the entire pattern at once and preserves e.g. a space character at the start or end of a filename like " my_file ":

>>> s = "gitinspector/{gitinspect_gui.py => gitinspector_gui.py}"
>>> import re
>>> pat = re.compile(r"\{(.*) => (.*)\}")

>>> m  = pat.search(s)
>>> m
<re.Match object; span=(13, 55), match='{gitinspect_gui.py => gitinspector_gui.py}'>
>>> m.groups()
('gitinspect_gui.py', 'gitinspector_gui.py')

>>> s = "src/gigui/{ => gi}/gitinspector.py"
>>> m  = pat.search(s)
>>> m
<re.Match object; span=(10, 18), match='{ => gi}'>
>>> m.groups()
('', 'gi')

If you don't want to use an RE, `split` does have an optional `maxsplit` argument that limits the number of splits, so you never get more than 2 parts when the separator occurs more than once.
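A minimal sketch of the `split` variant, assuming the line really does contain a `{old => new}` part. The helper name `split_rename` and the second example filename are invented for illustration.

```python
def split_rename(line):
    """Split 'prefix{old => new}suffix' into its four parts.

    With maxsplit=1 each split returns at most two items, so additional
    occurrences of "{", " => " or "}" later in the line cannot break the
    tuple unpacking. A missing separator still raises ValueError.
    """
    prefix, rest = line.split("{", 1)
    old_part, rest = rest.split(" => ", 1)
    new_part, suffix = rest.split("}", 1)
    return prefix, old_part, new_part, suffix


print(split_rename("gitinspector/{gitinspect_gui.py => gitinspector_gui.py}"))
# ('gitinspector/', 'gitinspect_gui.py', 'gitinspector_gui.py', '')

# Without maxsplit, the second "{" would make line.split("{") return three items.
print(split_rename("dir/{a => b}/x{1}.txt"))
# ('dir/', 'a', 'b', '/x{1}.txt')
```

This only removes the unpacking failure. A brace or " => " that occurs before the rename part is still misread, which is another reason to prefer the machine readable output.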

Alberth289346 commented 5 hours ago

Path concatenation should be done with os.path instead of string manipulations.

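When the new part is empty, as in src/gigui/{gi => }/gitinspector.py, the / after the } ends up next to the / of the prefix and causes a //. To handle that, you can extend the pattern with [\\/]?, which matches an optional / or \, so it gets discarded together with the }.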
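Put together, that could look roughly like the sketch below. The function name `new_name` is invented here, and I use `posixpath` (the POSIX flavour of `os.path`) on the assumption that git prints paths with forward slashes on every platform; plain `os.path` would join with backslashes on Windows.

```python
import posixpath
import re

# The trailing [\\/]? swallows one separator directly after the closing brace.
pat = re.compile(r"\{(.*) => (.*)\}[\\/]?")


def new_name(line):
    """Return the new file name for a 'prefix{old => new}suffix' line.

    Lines without a {...} rename part are returned unchanged.
    """
    m = pat.search(line)
    if m is None:
        return line
    prefix = line[:m.start()]
    suffix = line[m.end():]
    # Skip empty components: posixpath.join("a", "") would yield "a/".
    parts = [part for part in (prefix, m.group(2), suffix) if part]
    return posixpath.join(*parts) if parts else ""


print(new_name("src/gigui/{ => gi}/gitinspector.py"))
# src/gigui/gi/gitinspector.py
print(new_name("src/gigui/{gi => }/gitinspector.py"))
# src/gigui/gitinspector.py
print(new_name("gitinspector/{gitinspect_gui.py => gitinspector_gui.py}"))
# gitinspector/gitinspector_gui.py
```

It remains a heuristic on human readable output, so odd filenames can still fool it.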