newren / git-filter-repo

Quickly rewrite git repository history (filter-branch replacement)

How to change committed date value for commits from dedicated date? #607

Closed mihalt closed 1 day ago

mihalt commented 1 month ago

I want to do something like this

git filter-repo --commit-callback "
    commit_date_str = commit.committer_date.decode('utf-8')
    timestamp_str, offset_str = commit_date_str.split(' ')
    timestamp = int(timestamp_str)

    if timestamp > 1726352100:
        commit.committer_date = commit.author_date
"

But when I run it, it changes hashes of commits before 1726352100 too. How to prevent this and change just commits after 1726352100 date?

Also I would like to not delete all remote repositories after changes.

newren commented 1 month ago

> But when I run it, it changes hashes of commits before 1726352100 too. How to prevent this and change just commits after 1726352100 date?

Are the other commits whose hashes are changed ones which have a parent (or further back ancestor) which had a commit timestamp after 1726352100?
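
For context on why ancestry matters here: a commit's hash covers its parent hashes, so changing one commit gives every descendant a new hash too, while its ancestors keep theirs. Below is a minimal sketch of that chain using only hashlib. The identity, messages and timestamps are made-up placeholders, and `commit_id` is a hypothetical helper for illustration, not part of git-filter-repo.

```python
import hashlib

def commit_id(tree, parents, author, committer, message):
    """Hash a commit the way git does: SHA-1 over 'commit <size>' + NUL + body."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    body = ("\n".join(lines) + "\n").encode("utf-8")
    header = b"commit " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()

TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # git's well-known empty tree
WHO = "A U Thor <author@example.com>"              # placeholder identity

old_date = f"{WHO} 1600000000 +0000"
new_date = f"{WHO} 1726352200 +0000"

root = commit_id(TREE, [], old_date, old_date, "root")
child = commit_id(TREE, [root], new_date, new_date, "child")

# Rewrite only the root's committer date; the child's own fields stay the same.
edited = f"{WHO} 1600000001 +0000"
new_root = commit_id(TREE, [], old_date, edited, "root")
new_child = commit_id(TREE, [new_root], new_date, new_date, "child")

print(root != new_root)    # the edited commit gets a new hash
print(child != new_child)  # so does its descendant, through the parent line
```

The reverse does not hold: editing only the child leaves the root's hash alone, which is why old commits changing points to some other cause.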

mihalt commented 1 month ago

> > But when I run it, it changes hashes of commits before 1726352100 too. How to prevent this and change just commits after 1726352100 date?
>
> Are the other commits whose hashes are changed ones which have a parent (or further back ancestor) which had a commit timestamp after 1726352100?

no. And that's why I ask you. I see all commits changed and duplicated from my initial project commit which is more than 4 years old.

newren commented 1 month ago

> no. And that's why I ask you. I see all commits changed and duplicated from my initial project commit which is more than 4 years old.

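Okay, that removes the most likely explanation. There are some others. Does your history have any of these?:

* signed commits?
* commits with commit messages in a specific locale other than utf-8?
* commits with other extended headers?
* commits without an author?
* a tree in old history without properly sorted entries or some kind of duplicate entry?
* some other form of non-canonically shaped history?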

You can probably check all of the above by running git fast-export --no-data | git fast-import --force (in a new fresh clone of your repo), which does a rewrite of history that simply reads history and then writes it, meaning it makes no changes other than making sure to canonicalize things. Then check whether that gives you new hashes for all your commits. If so, it's just the canonicalization that fast-export and fast-import do to history that is the issue here; since git-filter-repo uses those underneath to do the heavy lifting, it gains the automatic canonicalization of those tools.
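
A rough Python sketch of that round-trip check, in case scripting it is easier than comparing hashes by eye. One assumption to flag: the sketch passes `--all` to fast-export, because fast-export only exports the revisions named on its command line and produces an empty stream when none are given. The function names and the `fresh-clone` path in the usage note are hypothetical.

```python
import subprocess

def all_commits(repo):
    """Return the set of commit hashes reachable from any ref in `repo`."""
    out = subprocess.run(
        ["git", "-C", repo, "rev-list", "--all"],
        check=True, capture_output=True, text=True,
    ).stdout
    return set(out.split())

def roundtrip(repo):
    """Pipe `git fast-export` into `git fast-import --force` inside `repo`."""
    export = subprocess.Popen(
        # --all is an assumption added here so that every ref is exported.
        ["git", "-C", repo, "fast-export", "--no-data", "--all"],
        stdout=subprocess.PIPE,
    )
    subprocess.run(
        ["git", "-C", repo, "fast-import", "--force"],
        stdin=export.stdout, check=True,
    )
    export.stdout.close()
    if export.wait() != 0:
        raise RuntimeError("git fast-export failed")

def vanished(before, after):
    """Hashes that were reachable before the round trip but not after it."""
    return sorted(before - after)
```

Usage would be along the lines of `before = all_commits("fresh-clone")`, then `roundtrip("fresh-clone")`, then `vanished(before, all_commits("fresh-clone"))`. An empty list suggests canonicalization is not what changes the hashes; a non-empty list names the commits that the round trip alone rewrote.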

An alternative way to check what is happening is to find the earliest commit in your history that has a different hash before and after the rewrite. Then once you've found ${OLD_HASH} and ${NEW_HASH} go and run git cat-file -p ${OLD_HASH} and git cat-file -p ${NEW_HASH} and compare the output. Something will be different between the two which will be the cause of why their hashes are different.

Anyway, if it is due to canonicalization of history as it rewrites, then it's fundamental to the tool. If that's the case, your only option to avoid rewriting older history is to have git-filter-repo not process the parts of history you don't want rewritten. You can do that with the --refs option, where you specify the range(s) that you do want it to rewrite and it'll avoid reading or writing anything outside the range.
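
As an illustration of the --refs approach, here is a sketch that builds such a command from Python. `last_good` (the newest commit that should keep its hash) and `branch` are hypothetical placeholders, and the callback is a condensed form of the one in the original question. Treat it as a starting point under those assumptions rather than a tested recipe for any particular repository.

```python
import subprocess

CUTOFF = 1726352100  # timestamp taken from the question

# Body of the commit callback; git-filter-repo wraps it in a function
# that receives `commit`. committer_date is bytes like b'1726352100 +0000'.
CALLBACK = f"""
timestamp = int(commit.committer_date.split(b' ')[0])
if timestamp > {CUTOFF}:
    commit.committer_date = commit.author_date
"""

def filter_repo_argv(last_good, branch):
    """Build a git-filter-repo command limited to the range last_good..branch."""
    return [
        "git", "filter-repo",
        "--refs", f"{last_good}..{branch}",
        "--commit-callback", CALLBACK,
    ]

def run(last_good, branch, repo="."):
    """Run the rewrite inside `repo`; raises if git-filter-repo fails."""
    subprocess.run(filter_repo_argv(last_good, branch), cwd=repo, check=True)
```

Commits at or before `last_good` are never read or rewritten, so their hashes stay put; only commits in the range, and anything built on top of them, can change.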

newren commented 1 day ago

No response; I'll assume my previous comment answered your questions. Let me know if that's not the case, and if so, what you found from the investigation suggestions I made.

mihalt commented 1 day ago

> > no. And that's why I ask you. I see all commits changed and duplicated from my initial project commit which is more than 4 years old.
>
> Okay, that removes the most likely explanation. There are some others. Does your history have any of these?:
>
> * signed commits?
> * commits with commit messages in a specific locale other than utf-8?
> * commits with other extended headers?
> * commits without an author?
> * a tree in old history without properly sorted entries or some kind of duplicate entry?
> * some other form of non-canonically shaped history?
>
> You can probably check all of the above by running git fast-export --no-data | git fast-import --force (in a new fresh clone of your repo), which does a rewrite of history that simply reads history and then writes it, meaning it makes no changes other than making sure to canonicalize things. Then check whether that gives you new hashes for all your commits. If so, it's just the canonicalization that fast-export and fast-import do to history that is the issue here; since git-filter-repo uses those underneath to do the heavy lifting, it gains the automatic canonicalization of those tools.
>
> An alternative way to check what is happening is to find the earliest commit in your history that has a different hash before and after the rewrite. Then once you've found ${OLD_HASH} and ${NEW_HASH} go and run git cat-file -p ${OLD_HASH} and git cat-file -p ${NEW_HASH} and compare the output. Something will be different between the two which will be the cause of why their hashes are different.
>
> Anyway, if it is due to canonicalization of history as it rewrites, then it's fundamental to the tool. If that's the case, your only option to avoid rewriting older history is to have git-filter-repo not process the parts of history you don't want rewritten. You can do that with the --refs option, where you specify the range(s) that you do want it to rewrite and it'll avoid reading or writing anything outside the range.

It looks like no changes appeared, and I don't see any commits that differ from my remote.

$ git fast-export --no-data | git fast-import --force
fast-import statistics:
---------------------------------------------------------------------
Alloc'd objects:       5000
Total objects:            0 (         0 duplicates                  )
      blobs  :            0 (         0 duplicates          0 deltas of          0 attempts)
      trees  :            0 (         0 duplicates          0 deltas of          0 attempts)
      commits:            0 (         0 duplicates          0 deltas of          0 attempts)
      tags   :            0 (         0 duplicates          0 deltas of          0 attempts)
Total branches:           0 (         0 loads     )
      marks:           1024 (         0 unique    )
      atoms:              0
Memory total:          2399 KiB
       pools:          2048 KiB
     objects:           351 KiB
---------------------------------------------------------------------
pack_report: getpagesize()            =      65536
pack_report: core.packedGitWindowSize = 1073741824
pack_report: core.packedGitLimit      = 35184372088832
pack_report: pack_used_ctr            =          0
pack_report: pack_mmap_calls          =          0
pack_report: pack_open_windows        =          0 /          0
pack_report: pack_mapped              =          0 /          0
---------------------------------------------------------------------

newren commented 23 hours ago

> It looks like no changes appeared, and I don't see any commits that differ from my remote.

Can you verify that by comparing the output of git show-ref and git ls-remote?
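
If eyeballing two long ref listings is error-prone, a small script can do the comparison. The sketch below assumes the text of `git show-ref` and `git ls-remote` has already been captured into strings (both print one hash and one ref name per line, separated by whitespace). The function names are hypothetical.

```python
def parse_refs(text):
    """Map ref name -> hash from `git show-ref` or `git ls-remote` output."""
    refs = {}
    for line in text.splitlines():
        if line.strip():
            sha, name = line.split(None, 1)
            refs[name.strip()] = sha
    return refs

def differing_refs(local_text, remote_text, prefix="refs/heads/"):
    """Names under `prefix` whose hash differs, or that exist on one side only."""
    local, remote = parse_refs(local_text), parse_refs(remote_text)
    names = {n for n in local.keys() | remote.keys() if n.startswith(prefix)}
    return sorted(n for n in names if local.get(n) != remote.get(n))
```

An empty result means every branch points at the same commit locally and on the remote. It is the hashes that matter here, not just whether the ref names are present on both sides.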

Anyway, if the fast-export piped to fast-import didn't give you different hashes, then you'll need to use the other thing I suggested to find out why git-filter-repo changed the commits, namely:

An alternative way to check what is happening is to find the earliest commit in your history that has a different hash before and after the rewrite. Then once you've found ${OLD_HASH} (in a separate clone where the rewrite has not been done) and ${NEW_HASH} (in the clone where you've done the rewrite), go and run git cat-file -p ${OLD_HASH} (in the unrewritten clone) and git cat-file -p ${NEW_HASH} (in the rewritten clone) and carefully compare the output. Something will be different between the two which will be the cause of why their hashes are different.
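
To make the comparison step concrete: ${OLD_HASH} and ${NEW_HASH} are placeholders for the two commit IDs you found, not environment variables that need to exist beforehand. A minimal Python sketch of the same comparison follows; the helper names and repository paths are hypothetical.

```python
import difflib
import subprocess

def cat_commit(repo, commit_hash):
    """Return the `git cat-file -p <hash>` output for a commit in `repo`."""
    return subprocess.run(
        ["git", "-C", repo, "cat-file", "-p", commit_hash],
        check=True, capture_output=True,
        # Undecodable bytes are replaced so odd encodings do not raise.
        encoding="utf-8", errors="replace",
    ).stdout

def commit_diff(old_text, new_text):
    """Unified diff between two `git cat-file -p` dumps, as a list of lines."""
    return list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile="old", tofile="new", lineterm="",
    ))
```

For example, `commit_diff(cat_commit("unrewritten-clone", old_hash), cat_commit("rewritten-clone", new_hash))` would list the lines (tree, parent, author, committer, extra headers or message) that differ between the two versions of the commit.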

mihalt commented 3 hours ago

> Can you verify that by comparing the output of git show-ref and git ls-remote?

Generally, all the refs/heads refs from git ls-remote are in the git show-ref output.

What do ${OLD_HASH} and ${NEW_HASH} mean in your example? I don't have these environment variables.