initialcommit-com / git-sim

Visually simulate Git operations in your own repos with a single terminal command.
GNU General Public License v2.0

--animate log in chronological order of the commits #96

Closed yarikoptic closed 5 months ago

yarikoptic commented 1 year ago

Originally brought up in https://github.com/initialcommit-com/git-sim/issues/95#issuecomment-1624226988 that the order of commits appearing in --animate is not necessarily chronological. In that example commits are actually happening in parallel on two branches -- the default (e.g. master) and supplemental (git-annex) managed by a tool (git-annex).

Note: there are both Committer and Author dates

❯ git log --format=fuller
commit 9fc5bfff5864d379093089b740b883c2017d1ef3 (HEAD -> master, remote/synced/master, synced/master)
Author:     Yaroslav Halchenko <debian@onerussian.com>
AuthorDate: Thu Jul 6 15:36:14 2023 -0400
Commit:     Yaroslav Halchenko <debian@onerussian.com>
CommitDate: Thu Jul 6 15:36:14 2023 -0400

    Adding file.dat

which might differ. So maybe it should be an option with a value, e.g. --chronological=(author|commit)
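To make the proposed option concrete, here is a minimal sketch of how the two values could map onto Git's own format placeholders. The option name `--chronological` is only the proposal above, and the function names `chrono_placeholder` and `list_chronological` are hypothetical; `%at` and `%ct` are Git's pretty-format placeholders for the author and committer dates as Unix timestamps.

```shell
#!/bin/bash
# Hypothetical helper: translate the proposed option value into a
# git pretty-format placeholder (%at = author date, %ct = committer date,
# both as Unix timestamps).
chrono_placeholder() {
    case "$1" in
        author) echo '%at' ;;
        commit) echo '%ct' ;;
        *) echo "unknown value: $1 (expected author or commit)" >&2; return 1 ;;
    esac
}

# Print all commits as "<timestamp> <short hash> <subject>", oldest first.
# Must be run inside a Git repository.
list_chronological() {
    local placeholder
    placeholder="$(chrono_placeholder "$1")" || return 1
    # -n sorts numerically on the leading timestamp, -s keeps ties in git's order
    git log --all --format="${placeholder} %h %s" | sort -n -s
}
```

Running `list_chronological author` and `list_chronological commit` in the same repository shows whether the two orderings actually differ there.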

initialcommit-io commented 1 year ago

Thanks for creating this. Ok yes I'll think about the author date vs commit date as well.

One other question - if commits are displayed chronologically, do you expect to see all commits displayed linearly (even if they exist on different branches)? Or do you expect multiple branches to be displayed in parallel in addition to that?

If commits are purely chronological we may want to rethink how the arrows are drawn, since by convention the arrows in the Git DAG represent parent/child relationships and not time. Thoughts?

yarikoptic commented 1 year ago

One other question - if commits are displayed chronologically, do you expect to see all commits displayed linearly (even if they exist on different branches)? Or do you expect multiple branches to be displayed in parallel in addition to that?

if I got question right -- "in parallel" is the correct answer if it is how --animate log is doing now since IIRC it did the right thing.

If commits are purely chronological we may want to rethink how the arrows are drawn, since by convention the arrows in the Git DAG represent parent/child relationships and not time. Thoughts?

it could retain that semantic no problem. Time would just define in which order commits with their arrows to parents would appear. IIRC ATM it goes backwards - from most recent commits to parents and so on, and I was thinking about from oldest to newer ones (arrows could still point to parents), and commits appearing in chronological order across branches.

note 1: it would be nice if order (but not necessarily chronological distance) along "horizontal axes" across branches is preserved, i.e. later commit on branch Y comes after earlier commit on branch X

note 2: Sure thing it could be that commits order in a branch would be incongruent with chronological order (clock could be changed backward so child commit is earlier than parent) -- then I guess it is up to decision on how to handle it ;)

initialcommit-io commented 1 year ago

if I got question right -- "in parallel" is the correct answer if it is how --animate log is doing now since IIRC it did the right thing.

I think we are understanding each other, but just in case, here are the 2 ways I'm thinking we could do this:

1) I believe Git's default log output is in reverse chronological order without considering parent/child relationships at all. If a merge commit exists with multiple parents, it is still the timestamps that determine order of all ancestors, which means that commits from both branch histories are interleaved together into a single linear display. But that would be a boring output in git-sim.

2) It sounds like you prefer a "parallel" setup where commits are still displayed based on timestamp, so if there is a merge commit, the 2 merged histories would display "in parallel" to each other instead of being interleaved. And from your "note 1", preferably to stagger the display so that chronological ordering of commits between branches makes sense (not sure how hard this would be to do based on current git-sim implementation, might be better to start without that).

For (2) I think we could start from the most recent commit timestamp and display commits one by one until we reach one with multiple parents. Then we can split the chain so it displays in parallel, and continue each chain using chronological ordering. Each time we find a commit with multiple parents, we split again the same way.
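The splitting idea can be sketched as a lane assignment over the log output. This is only an illustration, not how git-sim is implemented: the function name `assign_lanes` is hypothetical, and it assumes the input lists every child before its parents (for example from `git log --all --topo-order --format='%h %p'`).

```shell
#!/bin/bash
# Reads lines of "<hash> <parent>..." on stdin, children before parents,
# and prints "<hash> <lane>". The first parent stays in the lane of its
# child; every additional parent of a merge commit opens a new lane.
assign_lanes() {
    awk '
        {
            id = $1
            # A commit no earlier child pointed to is a branch tip: new lane.
            if (!(id in lane)) lane[id] = next_lane++
            print id, lane[id]
            for (i = 2; i <= NF; i++) {
                if ($i in lane) continue   # already placed by another child
                lane[$i] = (i == 2) ? lane[id] : next_lane++
            }
        }
    '
}
```

Each lane would then correspond to one of the chains displayed in parallel; how the lanes are staggered in time is left open here.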

note 2: Sure thing it could be that commits order in a branch would be incongruent with chronological order (clock could be changed backward so child commit is earlier than parent) -- then I guess it is up to decision on how to handle it ;)

Even in Git's default log, I believe that if a child commit somehow has an earlier timestamp than its parent, it would display first in the log (by that I mean later if we're talking reverse-chronological order). So I think we could just match Git's default behavior and not handle this in a special way.
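To see whether a repository contains such commits at all, a small check along these lines could help. It is a sketch under assumptions: the function name `find_clock_skew` is hypothetical, and the input is expected in the form produced by `git log --all --format='%H %ct %P'` (hash, committer timestamp, parent hashes).

```shell
#!/bin/bash
# Reads lines of "<hash> <timestamp> <parent>..." on stdin and prints
# "<child> <parent>" for every child whose timestamp is earlier than
# the timestamp of one of its parents.
find_clock_skew() {
    awk '
        {
            ts[$1] = $2
            order[NR] = $1
            parents[$1] = ""
            for (i = 3; i <= NF; i++) parents[$1] = parents[$1] " " $i
        }
        END {
            for (n = 1; n <= NR; n++) {
                child = order[n]
                count = split(parents[child], p, " ")
                for (i = 1; i <= count; i++)
                    # Parents missing from the input are skipped.
                    if ((p[i] in ts) && ts[child] + 0 < ts[p[i]] + 0)
                        print child, p[i]
            }
        }
    '
}
```

An empty result from `git log --all --format='%H %ct %P' | find_clock_skew` means parent/child order and timestamp order never disagree in that repository.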

yarikoptic commented 1 year ago

1. I believe Git's default log output is in reverse chronological order without considering parent/child relationships at all.

It seems indeed the case unless I use --graph option -- then it is "off":

this script which also collected time stamps per above hackery but with 1 sec sleep added

```shell
#!/bin/bash
# https://github.com/datalad/datalad/issues/7371
#
# A helper
# with 1 sec delay
export PS4='> $(date "+%Y-%m-%d %H:%M:%S.%N"): $(sleep 1)';
log="$(mktemp /tmp/sim-XXXXXXX)"
exec 2> "$log"
set -x
bash -x "$@"
echo "Commands with time stamps collected in $log"
```

ran on this one

```shell
#!/bin/bash
# https://github.com/datalad/datalad/issues/7371
#
# do not overload so we could use it outside
# export PS4='> '
# set -x
set -eu
umask 022

cd "$(mktemp -d /tmp/dl-XXXXXXX)"

mkdir remote
(
cd remote
git init
git annex init
)

# Let's slow down now and have it each 1 second
# export PS4='> $(sleep 1)'

mkdir origin
(
cd origin;
git init  # creates main branch, no commit
git annex init  # creates git-annex branch with a commit, also modifies .git/config
echo big-data > file.dat
git annex add file.dat  # creates commit in git-annex branches
git commit -m "Adding file.dat"  # creates commit in main
git annex addurl --file file.dat http://www.oneukrainian.com/tmp/file.dat  # only updates git-annex branch if no content change
# For now no remote
git remote add --fetch remote ../remote  # no commits
git annex sync  # if remote had its own history for git-annex branch -- it would get merged
git annex copy --to=remote file.dat  # commit in git-annex branch updating availability information
# now another file
echo text > text.txt
git add text.txt && git commit -m 'Added text file'
git merge --always HEAD^^  # some fake merge arch
echo more-data > another.dat
git annex add another.dat && git commit -m 'Added another.dat'
)

(
cd origin
#git sim log --all
git sim --animate log --all
)
pwd
```

produces following time stamps

❯ grep '^> ' /tmp/sim-Y2ig55e
> 2023-07-11 10:54:22.163362936: bash -x try-git-sim-longer.sh
> 2023-07-11 10:54:23.169581137: set -eu
> 2023-07-11 10:54:24.173617666: umask 022
> 2023-07-11 10:54:26.184822800: cd /tmp/dl-NVlZEHb
> 2023-07-11 10:54:27.187974748: mkdir remote
> 2023-07-11 10:54:28.194193618: cd remote
> 2023-07-11 10:54:29.198498613: git init
> 2023-07-11 10:54:30.206625548: git annex init
> 2023-07-11 10:54:31.266475134: mkdir origin
> 2023-07-11 10:54:32.274253001: cd origin
> 2023-07-11 10:54:33.278706977: git init
> 2023-07-11 10:54:34.284746511: git annex init
> 2023-07-11 10:54:35.342567494: echo big-data
> 2023-07-11 10:54:36.345593040: git annex add file.dat
> 2023-07-11 10:54:37.395980583: git commit -m 'Adding file.dat'
> 2023-07-11 10:54:38.420397319: git annex addurl --file file.dat http://www.oneukrainian.com/tmp/file.dat
> 2023-07-11 10:54:39.582342733: git remote add --fetch remote ../remote
> 2023-07-11 10:54:40.601416618: git annex sync
> 2023-07-11 10:54:41.850178450: git annex copy --to=remote file.dat
> 2023-07-11 10:54:42.923002010: echo text
> 2023-07-11 10:54:43.926877279: git add text.txt
> 2023-07-11 10:54:44.955743605: git commit -m 'Added text file'
> 2023-07-11 10:54:45.975189719: git merge --always 'HEAD^^'
> 2023-07-11 10:54:46.982942231: echo 'Commands with time stamps collected in /tmp/sim-Y2ig55e'

and following git log --all ... in graph (not quite chronological) and non-graph (seems indeed chronological) modes

❯ git -C /tmp/dl-NVlZEHb/origin log --all --graph --date=iso --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cD) %C(bold blue)[%an]%Creset' --abbrev-commit --date=relative
* bb686d3 - (HEAD -> master) Added text file (Tue, 11 Jul 2023 10:54:45 -0400) [Yaroslav Halchenko]
* 4105ad5 - (remote/synced/master, synced/master) Adding file.dat (Tue, 11 Jul 2023 10:54:38 -0400) [Yaroslav Halchenko]
* 14b4e1f - (git-annex) update (Tue, 11 Jul 2023 10:54:42 -0400) [Yaroslav Halchenko]
*   c67b1a8 - (remote/synced/git-annex, remote/git-annex) merging remote/git-annex into git-annex (Tue, 11 Jul 2023 10:54:41 -0400) [Yaroslav Halchenko]
|\  
| * 3d5a8a4 - update (Tue, 11 Jul 2023 10:54:31 -0400) [Yaroslav Halchenko]
| * 72cb69f - branch created (Tue, 11 Jul 2023 10:54:31 -0400) [Yaroslav Halchenko]
* 09eaefc - update (Tue, 11 Jul 2023 10:54:39 -0400) [Yaroslav Halchenko]
* 932e32f - update (Tue, 11 Jul 2023 10:54:37 -0400) [Yaroslav Halchenko]
* 36bc849 - update (Tue, 11 Jul 2023 10:54:35 -0400) [Yaroslav Halchenko]
* 541f666 - branch created (Tue, 11 Jul 2023 10:54:35 -0400) [Yaroslav Halchenko]

❯ git -C /tmp/dl-NVlZEHb/origin log --all  --date=iso --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cD) %C(bold blue)[%an]%Creset' --abbrev-commit --date=relative
bb686d3 - (HEAD -> master) Added text file (Tue, 11 Jul 2023 10:54:45 -0400) [Yaroslav Halchenko]
14b4e1f - (git-annex) update (Tue, 11 Jul 2023 10:54:42 -0400) [Yaroslav Halchenko]
c67b1a8 - (remote/synced/git-annex, remote/git-annex) merging remote/git-annex into git-annex (Tue, 11 Jul 2023 10:54:41 -0400) [Yaroslav Halchenko]
09eaefc - update (Tue, 11 Jul 2023 10:54:39 -0400) [Yaroslav Halchenko]
4105ad5 - (remote/synced/master, synced/master) Adding file.dat (Tue, 11 Jul 2023 10:54:38 -0400) [Yaroslav Halchenko]
932e32f - update (Tue, 11 Jul 2023 10:54:37 -0400) [Yaroslav Halchenko]
36bc849 - update (Tue, 11 Jul 2023 10:54:35 -0400) [Yaroslav Halchenko]
541f666 - branch created (Tue, 11 Jul 2023 10:54:35 -0400) [Yaroslav Halchenko]
3d5a8a4 - update (Tue, 11 Jul 2023 10:54:31 -0400) [Yaroslav Halchenko]
72cb69f - branch created (Tue, 11 Jul 2023 10:54:31 -0400) [Yaroslav Halchenko]

NB addition of that --date=iso seems to change nothing for me for %cd (I used %cD here though)...

yarikoptic commented 1 year ago

For (2) I think we could start from the most recent commit timestamp and display commits one by one until we reach one with multiple parents. Then we can split the chain so it displays in parallel, and continue each chain using chronological ordering. Each time we find a commit with multiple parents, we split again the same way.

but what about going in reverse order -- from oldest to newest?
I think it might be more useful in many cases whenever people want to demonstrate how actually this repo evolved instead of digging back to history.

initialcommit-io commented 1 year ago

and following git log --all ... in graph (not quite chronological) and non-graph (seems indeed chronological) modes

Hmm, so with the graph why is it showing the second commit 4105ad5 with timestamp (Tue, 11 Jul 2023 10:54:38 -0400) more recently than 14b4e1f and c67b1a8? If that commit was an older one that was merged in I would think the graph would reflect that? But it looks like a part of the linear history so I would assume it would be sorted reverse chronologically? Seems I'm missing something there...

but what about going in reverse order -- from oldest to newest?

I actually had that feature in my original program git-story but for git-sim I realized it was a lot simpler to program and draw the Git history in reverse parent/child order because you can just "walk" down the relationships recursively until the desired number of commits are drawn. Going in chronological order actually requires a bit more complexity. It should be do-able, but might take some time due to all the subcommands implemented now.

yarikoptic commented 1 year ago

Hmm, so with the graph why is it showing the second commit 4105ad5 with timestamp (Tue, 11 Jul 2023 10:54:38 -0400) more recently than 14b4e1f and c67b1a8? ...

if I got it right -- it just confirms my point that it is not chronological in --graph. As for "why", we can only hypothesize: note that those commits come from different branches, so I guess in --graph mode git tries to group commits so that multiple commits from the same branch are listed together, sacrificing some chronological order... I would not really worry/think about it much or rely on it -- I would have got the commits, sorted them by precedence (child/parent relationship) first and chronologically second (so commits from parallel branches could potentially interleave in this lineup), and then animated the "events" in that order, drawing across multiple "horizontals" to reflect the "branching" structure... Ignorant me doesn't know yet how I would have done the actual graph "rendering" though, which might get tricky and is probably best accomplished with something like graphviz.
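The "precedence first, chronology second" ordering described above can be sketched as a topological sort that always emits the oldest commit whose parents have already been emitted. This is an illustration only: the function name `order_events` is hypothetical, the input format matches `git log --all --format='%H %ct %P'`, and the simple quadratic loop is meant for small demo repositories.

```shell
#!/bin/bash
# Reads lines of "<hash> <timestamp> <parent>..." on stdin and prints the
# hashes oldest first, never showing a commit before all of its parents.
order_events() {
    awk '
        {
            id = $1; ids[NR] = id; seen[id] = 1
            ts[id] = $2 + 0
            npar[id] = NF - 2
            for (i = 3; i <= NF; i++) par[id, i - 2] = $i
        }
        END {
            # Count, per commit, the parents that still have to be shown.
            for (n = 1; n <= NR; n++) {
                id = ids[n]; pending[id] = 0
                for (i = 1; i <= npar[id]; i++)
                    if (par[id, i] in seen) pending[id]++
            }
            for (step = 0; step < NR; step++) {
                # Among commits with no pending parents, pick the oldest.
                best = ""
                for (n = 1; n <= NR; n++) {
                    id = ids[n]
                    if (shown[id] || pending[id] > 0) continue
                    if (best == "" || ts[id] < ts[best]) best = id
                }
                if (best == "") break
                print best
                shown[best] = 1
                # Showing this commit unblocks its children.
                for (n = 1; n <= NR; n++) {
                    id = ids[n]
                    for (i = 1; i <= npar[id]; i++)
                        if (par[id, i] == best) pending[id]--
                }
            }
        }
    '
}
```

With this ordering a child with an earlier timestamp than its parent is simply shown right after the parent becomes available, so clock changes cannot break the parent/child arrows.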

I messed around with ChatGPT a bit to see if it could give me something usable but "we" didn't figure out how to enforce chronological order. FWIW here is the script

```shell
#!/bin/bash
echo 'digraph G {'
git log --all --abbrev-commit --pretty=format:'%h [label="%h %s %ci",shape=ellipse];' --date-order |
while IFS=' ' read -r commit label; do
    echo "\"$commit\" $label"
    for parent in $(git log --pretty=%P -n 1 "$commit"); do
        echo "\"${parent:0:7}\" -> \"$commit\";"
    done
done
echo '}'
```

which if ran produces smth like

[image: graph]

which isn't good really

initialcommit-io commented 1 year ago

Hahaha nice, I like the idea of referring to yourself and ChatGPT as "we". While we're being honest I also used it to help decipher your shell commands 😸...

Anyway - ok that's true, we don't need to mimic Git's ordering exactly. One thing that's not too encouraging for this is that if you try and run git log --all --graph --reverse to get regular chronological order instead of reverse-chronological, it just gives a fatal error saying that's not possible lol:

$ git log --graph --all --reverse
fatal: options '--reverse' and '--graph' cannot be used together
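
Without `--graph`, the two remaining options do combine. A minimal sketch, assuming only a plain listing is needed (the function name `log_oldest_first` is hypothetical): `--date-order` keeps parent/child precedence and otherwise follows commit timestamps, and `--reverse` flips the output so the oldest commit comes first.

```shell
#!/bin/bash
# Oldest-first listing of all commits in the repository at $1 (default:
# the current directory). With --reverse applied to --date-order, every
# parent is listed before its children. %cI is the committer date in
# strict ISO 8601 format.
log_oldest_first() {
    git -C "${1:-.}" log --all --reverse --date-order --format='%h %cI %s'
}
```

This gives the order of events but none of the branch structure that `--graph` draws, so the layout would still have to be computed separately.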

initialcommit-io commented 5 months ago

Closing without implementation as the effort to implement such a thing likely outweighs the demand from a user perspective. However, if more folks request this I will reconsider.