Fix #104: Manage, throttle comments with Redis

openjck commented 9 years ago

Testing these changes:

Push this to your personal Heroku
heroku addons:create heroku-redis:hobby-dev (you'll need to enter a credit card)
heroku ps:scale web=1 worker=1
Push this .doiuse file to your test repo master branch
Submit a pull request to your test repo that adds this file

You should see 20 comments appear within a few seconds. There should be one comment for each property.

To run tests locally, see the updates in CONTRIBUTING.md.

darkwing commented 9 years ago

Is there any type of test we can add here? Seems like a good candidate for one.

openjck commented 9 years ago

Is there any type of test we can add here? Seems like a good candidate for one.

Good question. We could definitely test that items are successfully added to the Redis queue. I would also love to test whether all comments are accepted by GitHub. That would take a lot of work (and testing infrastructure). I would love to help add that but wouldn't want to block on that in the meantime.

groovecoder commented 9 years ago

Does someone really need to use a credit card to test the redis feature? Couldn't we share a dev-hobby REDIS_URL and credentials?

openjck commented 9 years ago

Does someone really need to use a credit card to test the redis feature? Couldn't we share a dev-hobby REDIS_URL and credentials?

Good question. I would love to help find a way to test the feature with entering a credit card. Unfortunately the REDIS_URL environment variable can change so copying it wouldn't be enough.

In order for Heroku to manage this add-on for you and respond to a variety of operational situations, the REDIS config vars may change at any time. Relying on the config var outside of your Heroku app may result in you having to re-copy the value if it changes.

—https://devcenter.heroku.com/articles/heroku-redis#create-a-new-instance

Thankfully the credit card is only needed to verify identity. If I understand correctly, there should be no charges.

Credit card information is not required for free apps without add-ons. It becomes a requirement once you wish to own more than 5 apps at a time, or to use add-ons other than postgresql:dev or pgbackups:plus –– even if the add-ons are free.

As a business, we need to be able to reliably identify and contact our users in the event of an issue. We have found that a credit card on file provides the most reliable way of obtaining verified contact information.

—https://devcenter.heroku.com/articles/account-verification#verification-requirement

openjck commented 9 years ago

Another challenge with using a shared Redis instance is that another worker might process the queue before the worker being tested.

groovecoder commented 9 years ago

Dang, you're already way ahead of me.

teacher

openjck commented 9 years ago

Dang, you're already way ahead of me.

It's a good point. Unfortunately it looks like entering a credit card for verification is going to be the easiest solution. If it's not Redis, there's going to be free add-ons in the future that will require it.

Faeranne commented 9 years ago

How many seconds are we waiting on each comment?

openjck commented 9 years ago

How many seconds are we waiting on each comment?

It depends on the COMMENT_WAIT environment variable. A value of 250 (milliseconds) has been working well for me. If a comment fails it will be re-attempted COMMENT_ATTEMPTS times with an exponentially longer wait between attempts.

Faeranne commented 9 years ago

Is there a way to make it wait longer after several comments? GitHub started blocking after about 20 comments in a row during my tests, and adding a 2-second delay after about 20 comments prevented the spam-protection from triggering at all.

I believe hitting the spam protection multiple times in a row can cause a temporary ban, but I haven't seen anything that suggest how long the ban lasts, or how many times we can hit the spam-protect before the ban.

openjck commented 9 years ago

Good question. I noticed that, too.

It looks like submitting lots of comments in quick succession will trigger temporary spam protection, at which point all future comments will be rejected for around 30-60 seconds. If there's a delay between comments, the spam protection system appears not to be triggered at all. In my testing, waiting for 250 milleseconds between comments is enough for all of the comments to go through successfully.

If a comment does fail, the wait between re-attempts will increase, first to 500 milleseconds, then to 1 second, then 2, then 4, etc.

openjck commented 9 years ago

Hey, nice job figuring out that GitHub sends back a 403 when a comment is rejected. I was wondering why the Error posting pull request error message wasn't showing up in the logs previously.

openjck commented 9 years ago

So I've been thinking about tests. I love writing tests, and love having tests, but to be honest I'm having a hard time thinking of a test that would be valuable here.

Our tests already cover the Redis queueing/processing logic, so those results are definitely still valuable. We could additionally test that the queue itself works as expected, but Kue already covers that.

The real test is whether GitHub accepts all of the comments Discord sends, but we don't have a way of automating tests like that at the moment. I would love to help us build that testing infrastructure down the road, but would prefer not to block on it right now.

Faeranne commented 9 years ago

I doubt there is a good way to test that, but we probably should monitor the success vs reject rate. Is there a way to do that with new relic?

darkwing commented 9 years ago

Would redis be good storage for the success/reject rate?

Faeranne commented 9 years ago

If we can get persistent storage on it, I can't see why not.

openjck commented 9 years ago

Would redis be good storage for the success/reject rate?

Hm. I'm not sure I understand. Do you mean using Redis to log how often comments are rejected as compared to being accepted? Maybe @groovecoder would be able to say more about that. My experience with Redis so far has been that it specializes in job queuing, but I know it can be used for more than that. We're probably going to add Postres this week for record-keeping. Maybe that would be better.

darkwing commented 9 years ago

Oh, right, Postgres would likely be better.

groovecoder commented 9 years ago

Could raise rejections as errors that would bubble up in the New Relic error reporting. That would give us the built-in reporting tools of New Relic around rejections. (Whether that's good or bad.)

Could use Redis for storing data, just be aware that it needs some extra configuration if we want a degree of data safety comparable to Postrgres.

openjck commented 9 years ago

Could raise rejections as errors that would bubble up in the New Relic error reporting. That would give us the built-in reporting tools of New Relic around rejections. (Whether that's good or bad.)

Great point. It would be good to have that information in New Relic. Maybe it would make sense to fire something off to New Relic whenever logger.error is called. What do you think @darkwing?

openjck commented 9 years ago

Come to think of it, while New Relic reporting would be great, do we need to block these improvements on that? If anything, we should see fewer errors with the addition of Redis.

darkwing commented 9 years ago

Excellent! R+!

groovecoder / discord

Fix #104: Manage, throttle comments with Redis #131