newren / git-filter-repo

Quickly rewrite git repository history (filter-branch replacement)

How to change committed date value for commits from dedicated date? #607

Open mihalt opened 2 weeks ago

mihalt commented 2 weeks ago

I want to do something like this

git filter-repo --commit-callback "
    commit_date_str = commit.committer_date.decode('utf-8')
    timestamp_str, offset_str = commit_date_str.split(' ')
    timestamp = int(timestamp_str)

    if timestamp > 1726352100:
        commit.committer_date = commit.author_date
"

But when I run it, it changes hashes of commits before 1726352100 too. How to prevent this and change just commits after 1726352100 date?

Also I would like to not delete all remote repositories after changes.

newren commented 1 week ago

> But when I run it, it changes hashes of commits before 1726352100 too. How to prevent this and change just commits after 1726352100 date?

Are the other commits whose hashes are changed ones which have a parent (or further back ancestor) which had a commit timestamp after 1726352100?
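For context on why ancestry matters here: a commit's hash covers its parent hashes, so rewriting one commit forces a new hash on every descendant, whatever the descendant's own timestamp is. Here is a minimal Python sketch of that chain. It assumes a SHA-1 repository, and the identity, timestamps, and messages are made up for illustration.

```python
import hashlib

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # git's empty tree
WHO = "A U Thor <author@example.com>"  # hypothetical identity


def commit_hash(tree, parents, author, committer, message):
    """Hash a commit the way git does: sha1 over 'commit <size>\\0' + body."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = "\n".join(lines).encode()
    header = f"commit {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Parent committed after the cutoff; child carries an OLDER timestamp
# (this can happen with clock skew or rebased history).
parent_author = f"{WHO} 1726300000 +0000"
parent_old = commit_hash(EMPTY_TREE, [], parent_author,
                         f"{WHO} 1726400000 +0000", "parent\n")
child_stamp = f"{WHO} 1726000000 +0000"
child_old = commit_hash(EMPTY_TREE, [parent_old], child_stamp,
                        child_stamp, "child\n")

# The callback resets the parent's committer date to its author date.
parent_new = commit_hash(EMPTY_TREE, [], parent_author,
                         parent_author, "parent\n")
# The child's own fields are untouched, but its parent line now differs.
child_new = commit_hash(EMPTY_TREE, [parent_new], child_stamp,
                        child_stamp, "child\n")

print(parent_old != parent_new, child_old != child_new)
```

Both comparisons come out different: the child is rewritten only because its parent was.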

mihalt commented 1 week ago

> > But when I run it, it changes hashes of commits before 1726352100 too. How to prevent this and change just commits after 1726352100 date?
>
> Are the other commits whose hashes are changed ones which have a parent (or further back ancestor) which had a commit timestamp after 1726352100?

no. And that's why I ask you. I see all commits changed and duplicated from my initial project commit which is more than 4 years old.

newren commented 1 week ago

> no. And that's why I ask you. I see all commits changed and duplicated from my initial project commit which is more than 4 years old.

Okay, that removes the most likely explanation. There are some others. Does your history have any of these?:

You can probably check all of the above by running git fast-export --no-data | git fast-import --force in a fresh clone of your repo. That round trip simply reads history and writes it back out, so it makes no changes other than canonicalizing things. Then check whether it gives you new hashes for all your commits. If so, the issue is just the canonicalization that fast-export and fast-import apply to history; git-filter-repo uses those tools underneath to do the heavy lifting, so it inherits their automatic canonicalization.
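The round-trip check could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, and the --all and --quiet flags are assumptions added here (--all so that every ref is exported, --quiet to silence fast-import statistics). A repository with signed tags may need extra fast-export options before the export succeeds. Run it only in a throwaway clone, since it rewrites refs in place.

```python
import subprocess


def reachable_commits(repo):
    """Return the set of commit hashes reachable from any ref in repo."""
    out = subprocess.run(
        ["git", "rev-list", "--all"],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    return set(out.split())


def round_trip(repo):
    """Pipe git fast-export into git fast-import, rewriting refs in place."""
    # --all is an assumption of this sketch: export every ref.
    exporter = subprocess.Popen(
        ["git", "fast-export", "--no-data", "--all"],
        cwd=repo, stdout=subprocess.PIPE,
    )
    subprocess.run(
        ["git", "fast-import", "--force", "--quiet"],
        cwd=repo, stdin=exporter.stdout, check=True,
    )
    exporter.stdout.close()
    if exporter.wait() != 0:
        raise RuntimeError("git fast-export failed")


def summarize(before, after):
    """Count commits that kept their hash and commits that lost it."""
    return {
        "unchanged": len(before & after),
        "rewritten": len(before - after),
    }


def check_canonicalization(repo):
    """Snapshot hashes, do the no-op round trip, and report the difference."""
    before = reachable_commits(repo)
    round_trip(repo)
    return summarize(before, reachable_commits(repo))
```

Calling check_canonicalization("/path/to/fresh-clone") and seeing a nonzero "rewritten" count would suggest that canonicalization alone changes hashes in that history.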

An alternative way to check what is happening is to find the earliest commit in your history that has a different hash before and after the rewrite. Once you've found ${OLD_HASH} and ${NEW_HASH}, run git cat-file -p ${OLD_HASH} and git cat-file -p ${NEW_HASH} and compare the output. Something will differ between the two, and that difference is why their hashes differ.
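A small helper can make that comparison less error-prone than reading the two outputs by eye. The sketch below uses hypothetical function names and replaces undecodable bytes when reading the output, so a difference in raw encoding may show up only as replacement characters.

```python
import difflib
import subprocess


def cat_commit(repo, commit_hash):
    """Return the output of `git cat-file -p <commit_hash>` as text."""
    return subprocess.run(
        ["git", "cat-file", "-p", commit_hash],
        cwd=repo, check=True, capture_output=True,
        text=True, errors="replace",
    ).stdout


def differing_lines(old_text, new_text):
    """Keep only the lines that differ, prefixed '- ' (old) or '+ ' (new)."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


def explain_hash_change(repo, old_hash, new_hash):
    """Show which header or message lines differ between two commits."""
    return differing_lines(cat_commit(repo, old_hash),
                           cat_commit(repo, new_hash))
```

For example, explain_hash_change(".", OLD_HASH, NEW_HASH) returns a short list; a differing committer line points at the callback, while a differing parent line means the change was inherited from an ancestor.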

Anyway, if it is due to canonicalization of history as it rewrites, then it's fundamental to the tool. If that's the case, your only option to avoid having older history rewritten is to have git-filter-repo not process the parts of history you don't want rewritten. You can do that with the --refs option, where you specify the range(s) that you do want it to rewrite, and it'll avoid reading or writing anything outside the range.
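A rough sketch of the --refs approach, driven from Python, might look like this. The helper names are hypothetical, and the range in the usage note is an assumption: BASE stands for whichever commit you choose as the last one to leave untouched, and the sketch assumes --refs accepts an ordinary git revision range such as BASE..main.

```python
import subprocess

CUTOFF = 1726352100  # commits with a later committer timestamp are adjusted

# Callback body; git-filter-repo runs it with `commit` bound to each commit.
# committer_date is bytes such as b"1726352200 +0200".
CALLBACK = f"""
timestamp = int(commit.committer_date.split()[0])
if timestamp > {CUTOFF}:
    commit.committer_date = commit.author_date
"""


def filter_repo_command(rev_range):
    """Build the argv that limits the rewrite to rev_range."""
    return ["git", "filter-repo", "--refs", rev_range,
            "--commit-callback", CALLBACK]


def rewrite_range(repo, rev_range):
    """Run git-filter-repo on rev_range only; history outside it is not read."""
    subprocess.run(filter_repo_command(rev_range), cwd=repo, check=True)
```

Usage would be something like rewrite_range("/path/to/clone", "BASE..main"). Commits at or before BASE are never fed through fast-export and fast-import, so their hashes cannot change.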