Closed Ostrzy closed 8 years ago
How about to check pending commits only in master branch? https://github.com/monterail/mm4-contracts/commit/a3d2bca0ba502a5e852d423eb7ef0fbf3b8de58f vs https://github.com/monterail/mm4-contracts/commit/6ecaced4089746570c27906b976fb489050263af
@Ostrzy @teamon ^
It could be a solution, but sometimes serious errors are found in feature branches before it goes to master and that is z good thing to have
—
Tymon Tobolski CTO Monterail
On 23 Jan 2014, at 14:04, Dariusz Gertych notifications@github.com wrote:
How about to check pending commits only in master branch? monterail/mm4-contracts@a3d2bca vs monterail/mm4-contracts@6ecaced
@Ostrzy @teamon ^
— Reply to this email directly or view it on GitHub.
Yes, but take a look at those two commits. How can we check similarity of them?
I guess most problems come from rebases or cherry picks. I have that idea to concatenate commmit message, author data and changed/added/removed files, hash this and save in database. We can than check similiarity by it.
I have also just found that during rebase and cherry pick, Date is not changed, only Commit Date is added. So if github gives us timestamp based on Date and not Commit Date (and I suspect do, because both commits given by @chytreg have exact same timestamps) we can just save timestamp and check timestamp-author uniqueness.
I guess that commits should not appear on same second for anybody who is not using some kind of automation and in this case it should not be reviewed and therefor commits should be appropriately tagged.
Unless commit have been changed during rebase (because of e.g. merge conflicts)
—
Tymon Tobolski CTO Monterail
On 23 Jan 2014, at 15:03, Michał notifications@github.com wrote:
I guess most problems come from rebases or cherry picks. I have that idea to concatenate commmit message, author data and changed/added/removed files, hash this and save in database. We can than check similiarity by it.
I have also just found that during rebase and cherry pick, Date is not changed, only Commit Date is added. So if github gives us timestamp based on Date and not Commit Date (and I suspect do, because both commits given by @chytreg have exact same timestamps) we can just save timestamp and check timestamp-author uniqueness.
I guess that commits should not appear on same second for anybody who is not using some kind of automation and in this case it should not be reviewed and therefor commits should be appropriately tagged.
— Reply to this email directly or view it on GitHub.
Wasn't there this discussion some time ago already? It was exact same reasoning to ignore merge commits. Conflict can break something in code, but this is not responsibility of reviewer to check this, it's yours. If commit was already accepted, than solving merge conflict should not break it's style and even if it does, we can live with it in better future where you don't have to review same commit multiple times.
I think this is valid reasoning and we should go that way :+1:
—
Tymon Tobolski CTO Monterail
On 23 Jan 2014, at 15:12, Michał notifications@github.com wrote:
Wasn't there this discussion some time ago already? It was exact same reasoning to ignore merge commits. Conflict can break something in code, but this is not responsibility of reviewer to check this, it's yours. If commit was already accepted, than solving merge conflict should not break it's style and even if it does, we can live with it in better future where you don't have to review same commit multiple times.
— Reply to this email directly or view it on GitHub.
I making some test here: https://github.com/chytreg/test/commits/payloads What I found out problem occurs only when rebase branch X into master (X is pushed to remote).
I checked timestamps of thoses commits. They are different!!! @Ostrzy your argument is invalid. monterail/mm4-contracts@a3d2bca vs monterail/mm4-contracts@6ecaced
Next take a look for this example:
Is this our duplicated commit? Yes and NO.
Yes because I'm a human and I know that there same.
NO because:
In the commit payload you don't get the diff what has been changed.
We could try to analyze stats of the similar commits, but we won't be 100% sure that commits are same http://developer.github.com/v3/repos/commits/#get-a-single-commit
Or wait maybe patch attribute will save us:"patch": "@@ -29,7 +29,7 @@\n....."
@chytreg I can't reproduce the situation when commit's timestamp changes. Can you describe the steps?
Master add few commits New branch add few commits Master add few commits Rebase new branch into master
I have used http://requestb.in to analyze messages from github paylods checkout https://github.com/chytreg/test/commits/payloads
@chytreg is right, in webhook we get date of the commit, not date of autorship. It sucks, but we can get necessary data from api. We can call compare for before and after that we get from webhook. There is all necessary data we need - author email and authorship date. We can combine these two with repo id to create unique string. We should probably put github call to the background process and give commits some state before pending.
@Ostrzy Good research. This could work.
I think we can save both in raw form on backend for filtering purposes.
There is small problem with the fact, that we have to access api through user, but it can be resolved by remembering the user that added repository to ghcr and access through his token (we are sure about access to repo). I have no idea what to do when e. g. user stops using app and deauthorize it, as we lose access to api for this repo. Unfortunately I am not able to take care about this issue before June.
Commits can be duplicated - after rebase or cherry picks we end up with multiple instances of same commit. Idea is to make hash from list of changed files, message and probably something else and check also by this hash if commit is already in db.