Chaosthebot / Chaos

A social coding experiment that updates its own code democratically.
http://chaosthebot.com
MIT License
2.44k stars 210 forks source link

Automatic Git Reverts #483

Closed rudehn closed 7 years ago

rudehn commented 7 years ago

It seems lately chaosbot has been reverting a lot of commits due to the automatic rollback feature enabled. Determining when this happens is not easy.

Can we get a better message (like an issue opened) indicating a revert took place?

Swizz commented 7 years ago

Is Chaosbot can open a PR with the revert and merge it ?

PlasmaPower commented 7 years ago

@Swizz bad idea - then after issues like the voters.json corruption it'll be a pain to clean up.

PlasmaPower commented 7 years ago

Yeah my symbol emoji PR seems to have gotten reverted, any idea why?

rudehn commented 7 years ago

Several PRs were reverted. My suspicion is it was due to the issue commands exceptions - the deleted command comments not being handled. Since the fix only just went in, any new PRs were just continually reverted.

Swizz commented 7 years ago

So, basically some functionalities have passed the vote stage but the bot reverted them ? Is this really democratic ?

rudehn commented 7 years ago

That's a suspicion, but not guaranteed what happened. Also as soon as a fix goes in for the PR which caused exceptions, everything will come back.

mdcfe commented 7 years ago

@Swizz They were reverted due to invalid syntax introduced in #461, and because no syntax fixes were merged before those commits were, they were also reverted.

anythingbot commented 7 years ago

So, about the automatic rollback feature...I just want to discuss some alternatives for a minute.

Suppose we had two servers running on chaosthebot.com, a production server and a development server that tests out PRs that passed. The bot running on root can just wait until the dev bot boots and does whatever for at least 60 seconds or 10 minutes. The prod bot can ask the dev bot how long it has been up and take appropriate action (for example the dev bot can listen for a TCP connection on port 3000 and serve up its own uptime in seconds; the prod bot can poll the dev bot every five minutes. When it senses the dev bot has been up 10 minutes, the prod bot will pull the commit into its repository and restart itself).

andrewda commented 7 years ago

Wouldn't we also need a dev repository then?

anythingbot commented 7 years ago

@andrewda nope, the prod repository simply pulls from the dev repository when the prod bot sees that the dev bot has been up for 10 or 15 minutes (and the dev repo is ahead of the prod repo).

Edit: oh wait, maybe the answer to your question is "yes" because I didn't understand correctly...yes there is a "separate" dev repo, but it is cloned from prod, so in other words dev will be a clone of prod (at first).

mdcfe commented 7 years ago

@anythingbot The problem is that the two bots will have to behave in different ways - dev needs to merge and pull as usual, while prod needs to wait for dev to ensure changes are stable. This means that we're going to test apply changes to one bot to see if those changes applied to a different bot will be stable.

There's not an obvious solution to this - if we use the same repo and simply disable the GitHub API on dev (make it return mock values, for example), then we're not ensuring that the GitHub API part is stable, and the same goes for other parts.

rudehn commented 7 years ago

Perhaps we should also serve the code from the server to the webserver, so we can cross-reference if the intended changes took place.

PlasmaPower commented 7 years ago

I think it's still reverting. At the very least, my symbol emojis PR still hasn't taken effect: https://github.com/chaosbot/Chaos/pull/485#issuecomment-305904553.

anythingbot commented 7 years ago

@md678685 the way this can work is as follows: when a PR wins, the bot creates a new branch named after the pr, named master-N where N is the PR number and uses it to test the change. The PR code only propagates into branch master if the dev bot runs master-N error free for 10 to 15 minutes. In detail:

The prod bot switches to branch master-N, merges the PR from github, pushes the branch to github, and swithes back to branch master. Then prod tells the dev bot to pull branch maser-N from github and test it. The dev bot pulls branch master-N, switches to it, and restarts itself. After 10 to 15 minutes, the prod bot checks the dev bot. If the dev bot is alive and well, the prod bot merges master-N into master and pushes it to github.

mdcfe commented 7 years ago

@anythingbot The problem is now that the the dev bot still has to behave normally just like the prod bot - things like responding to PRs, recording votes, etc. If we have two bots running on the same repo at once, there could be conflicts between their behaviour. Also, we'd need to make sure that dev doesn't replicate itself in the case of another PR getting merged and tested, because then we'd lose track of them.

rudehn commented 7 years ago

I think it's still reverting. At the very least, my symbol emojis PR still hasn't taken effect: #485 (comment).

@amoffat any resolution on this? You do have the inside scoop

amoffat commented 7 years ago

So here was the difference between HEAD and upstream

(chaos) root@ubuntu:~/workspace/Chaos# git shortlog @..origin/master 
Lee Bousfield (5):
      merging PR #463: Count contributors' votes, regardless of account age
      merging PR #472: Use symbol emojis instead of human emojis
      merging PR #469: Fix meritocracy.json serialization
      merging PR #471: Mention meritocracy when needed
      merging PR #478: Fix rescinding reviews

Neal (1):
      merging PR #468: Added imps to voting reactions

Phil Rukin (1):
      merging PR #412: Bring back mandatory travis-ci check

Quentin Gerodel (2):
      merging PR #485: Ignore meritocracy.json like voters.json
      merging PR #486: Create and update labels improved, close #476

Who? Me?! (1):
      merging PR #477: Better logging

md678685 (3):
      merging PR #454: Update/add missing dependencies to Vagrantfile
      merging PR #462: Dump meritocracy data alongside voters.json
      merging PR #455: FIX ERROR PAGES WITH THIS ONE SIMPLE TRICK

rudehn (5):
      merging PR #464: start cleaning up issue closing
      merging PR #467: Fix issue polling exception
      merging PR #481: Error handling
      merging PR #487: remove broken auditing
      merging PR #484: fix invalid syntax

Looks like it reverted quite a bit. I went ahead and fast forwarded it so everything is up to date, but if it was continually reverting, there is definitely a problem there

PlasmaPower commented 7 years ago

@amoffat is it still reverting? I don't see how that would happen otherwise.

PlasmaPower commented 7 years ago

Apparently hotfixed: 2f75bd0fde3e3427ce11d8267cebe1bf8a3726e4

rudehn commented 7 years ago

So this reverting might need addressed..

anythingbot commented 7 years ago

@md678685

The problem is now that the the dev bot still has to behave normally just like the prod bot - things like responding to PRs, recording votes, etc. If we have two bots running on the same repo at once, there could be conflicts between their behaviour. Also, we'd need to make sure that dev doesn't replicate itself in the case of another PR getting merged and tested, because then we'd lose track of them.

The dev bot will not be able to test these things out, that is correct. This means that bot features are going to have to be conditional on the configuration the bot starts in: if it is a feature that uses a resource that isn't in the dev configuration, then the feature can't be tested in dev, it is a prod only feature. I know this sounds like it defeats the point of dev, but it doesn't. It just means that dev will only test some of the bot's capabilities.

Later on we can provide a github mock api for testing out changes to voting code. With something to test against in a dev configuration, the bot will be able to exercise all of its features, but that is a somewhat more involved project than selecting a subset of features that will be tested in dev and creating "dev" and "prod" configurations (as well as the code that switches off prod-only features in dev).

mark-i-m commented 7 years ago

@rudehn chaosbot does open an issue if it rollbacks a commit as soon as it is able to run again... see #494

chaosbot commented 7 years ago

/vote close

This issue hasn't been active for a while. To keep it open, react with :-1:

chaosbot commented 7 years ago

/vote close This issue hasn't been active for a while. To keep it open, react with :-1:

Command Ran