Open Liz4v opened 8 years ago
Hi! Looking at the logfile generated from your repo and selecting all related info...
1431577825|Ekevoo|A|/oneall/django_app/models.py
1433849539|Ekevoo|M|/oneall/django_app/models.py
1433904622|Ekevoo|M|/oneall/django_app/models.py
1433987454|Ekevoo|M|/oneall/django_app/models.py
1433989141|Ekevoo|M|/oneall/django_app/models.py
1433991898|Ekevoo|M|/oneall/django_app/models.py
That file never gets deleted
Hi Mathieu, would that be an upstream git bug then?
I'm not sure what the command to generate logs is. git log displays only authors and messages; if I add --dirstat it looks markedly different from what you posted.
I generated this log with gource, using the command from the README:
gource --output-custom-log my-project-custom.log
maybe use gitk to browse
Well, that's still on gource then.
That's very true. Hence my suggestion to browse with gitk, meaning: look at your git logs. Gource did not invent the file. Could it be coming from a merged branch? I wish I could be of more help.
You can see the command used by gource to generate the input log file from git:
gource --git-log-command
Currently:
git log --pretty=format:user:%aN%n%ct --reverse --raw --encoding=UTF-8 --no-renames
You can run your own command and save the output to a file. Providing the file is in the same format Gource will read it. If there is a more accurate command it could use it would be good to know.
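To make the two formats in this thread concrete, here is a minimal sketch that turns the raw git log output into pipe-delimited lines like the ones posted at the top (timestamp|user|action|/path). The function name to_custom_log is hypothetical, and the leading slash on paths is an assumption that simply mirrors the lines shown earlier. This is not Gource's own parser.

```python
def to_custom_log(git_log_text):
    """Convert `git log --pretty=format:user:%aN%n%ct --raw` output into
    pipe-delimited lines of the form timestamp|user|action|/path."""
    out = []
    user = None
    timestamp = None
    expect_timestamp = False
    for line in git_log_text.splitlines():
        if line.startswith("user:"):
            # Start of a new commit: remember the author name.
            user = line[len("user:"):]
            expect_timestamp = True
        elif expect_timestamp:
            # The line right after "user:" carries the commit timestamp.
            timestamp = line.strip()
            expect_timestamp = False
        elif line.startswith(":") and "\t" in line:
            # Raw diff entry: ":mode mode sha sha STATUS<TAB>path"
            meta, path = line.split("\t", 1)
            action = meta.split()[-1][0]  # first letter of the status, e.g. A, M, D
            out.append(f"{timestamp}|{user}|{action}|/{path}")
    return out
```

Feeding it the text of a saved git log should give one line per file change, which can then be compared against what gource --output-custom-log writes.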
Okay, I've investigated a lot, and here are the missing pieces.
There were two branches during June last year. There was a lot of activity in the green develop branch, and there was a bugfix for the models.py file in the black/purple master branch.
One of the first things done in the develop
branch was exactly a directory move, that was properly handled by gource as expected. Because of the intertwined activity in the master
branch made the models.py
file re-appear, which is a bit weird, but completely understandable.
However, down the road, there's a merge commit 0d32878, and it includes a delete of that file (oneall/django_app/models.py) along with several other modifications.
Still, that command (which I modified to display the commit hash) does not list a single modified file for this particular commit! Only for the previous one (4ec4c6e) and the next one (da69603).
$ git log --pretty=format:user:%aN%n%ct\ %H --reverse --raw --encoding=UTF-8 --no-renames
(…snip…)
user:Ekevoo
1438223791 4ec4c6eb8a88e11a46790d7f3d5492f7d31c6c84
:100644 100644 1a8328f... 09c6d8e... M oneall/django_oneall/management/commands/legacyimport.py
user:Ekevoo
1438224809 0d328789f4ea0f802de4dbbafea3605184d5c72c
user:Ekevoo
1441078837 da6960312cfa8a601ceff1a4a7378384f1a372ef
:100644 100644 eb27d58... 8ab8ca4... M oneall/django_oneall/auth.py
:100644 100644 cd83385... 3e4cc68... M oneall/django_oneall/templates/oneall/login.html
:100644 100644 6bc1701... 3f7adf9... M oneall/django_oneall/views.py
(…snip…)
I'm not sure what to suggest now.
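For anyone who wants to check their own repository for the same symptom, here is a rough sketch that scans a log in the format above (with the commit hash added after the timestamp) and reports commits that list no files at all. The function name commits_without_files is hypothetical. A commit with an empty file list is not necessarily a merge, so treat the result as a list of candidates to inspect.

```python
def commits_without_files(git_log_text):
    """Return the hashes of commits in the log that have no raw file entries.

    Expects output of:
      git log --pretty=format:user:%aN%n%ct\\ %H --reverse --raw ...
    """
    empty = []
    current = None        # hash of the commit currently being read
    has_files = False
    expect_header = False
    for line in git_log_text.splitlines():
        if line.startswith("user:"):
            # A new commit begins; close out the previous one.
            if current is not None and not has_files:
                empty.append(current)
            current, has_files, expect_header = None, False, True
        elif expect_header:
            # "timestamp hash" line; take the last field as the identifier.
            parts = line.split()
            current = parts[-1] if parts else None
            expect_header = False
        elif line.startswith(":"):
            has_files = True
    if current is not None and not has_files:
        empty.append(current)
    return empty
```

Run against the excerpt above, this would flag only 0d328789f4ea0f802de4dbbafea3605184d5c72c.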
I have a little bodged script that passes through the log generated by gource's default git log command to make sure no deleted files get modified, thus re-added:
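# Paths that currently exist according to the log read so far.
files = set()

def test_file(file, action):
    """Return True if this log entry is consistent with the tracked files."""
    if action == 'A':
        files.add(file)
        return True
    elif action == 'D':
        # Only keep a delete if the file is known to exist.
        try:
            files.remove(file)
        except KeyError:
            return False
        return True
    elif action == 'M':
        # Drop modifications of files that were never added or already deleted.
        return file in files
    return False

with open("log.txt", "r") as src, open("new_log.txt", "a") as dst:
    for line in src:
        l = line.split("\t")
        if len(l) == 2:
            # File entry: "<modes and hashes> <action>\t<path>"
            if test_file(l[1], l[0][-1]):
                dst.write(line)
        else:
            # Header lines (user, timestamp) pass through unchanged.
            dst.write(line)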
This also happens on my repository.
This file here has not existed in my repository for over a year now. It's some weird Microsoft FrontPage junk file.
The same goes for this file, which was unfortunately poorly named, so I had to blur it. The file's parent folder was renamed, and the file was later deleted.
The best method I found to solve this problem is to linearize your git history first with
git filter-branch --parent-filter 'cut -f 2,3 -d " "'
before you run gource. This avoids any problem with files not disappearing due to merge commits.
ATTENTION: Do this with a fresh checkout, not with something you are working on!
There must be a better solution than rewriting history.
From my perspective the git log output is insufficient because it omits merge commits. This problem would not exist if merge commits were considered.
If we add the option --first-parent to git log, the problem seems to solve itself. I will open a PR with the changes.
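As a sketch of what that change could look like when generating the log yourself: the snippet below takes the command quoted earlier in this thread and appends --first-parent. The names BASE_LOG_COMMAND, first_parent_log_command and write_log are hypothetical. Whether merge commits then list their files may depend on the Git version (adding -m is a possible variation, but that is an assumption, not something confirmed here).

```python
import subprocess

# The command gource reported earlier in this thread, split into arguments.
BASE_LOG_COMMAND = [
    "git", "log",
    "--pretty=format:user:%aN%n%ct",
    "--reverse", "--raw",
    "--encoding=UTF-8", "--no-renames",
]

def first_parent_log_command(base=BASE_LOG_COMMAND):
    """Return a copy of the log command with --first-parent added once."""
    cmd = list(base)
    if "--first-parent" not in cmd:
        cmd.append("--first-parent")
    return cmd

def write_log(repo_path, out_path):
    """Run the adjusted command in repo_path and save its output to out_path."""
    result = subprocess.run(
        first_parent_log_command(),
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write(result.stdout)
```

Calling write_log on a fresh checkout and comparing the result with the original log should show whether the merge commit's delete now appears.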
I generated a Gource video of one repository of mine: https://github.com/leandigo/django-oneall
It seems to assume that the file oneall/django_app/models.py is still around at the end. Alas, it was removed at revision d34833f.
Used Gource v0.43 on Mac 10.11.4 Homebrew.