Battle Testing

supagroova commented 8 years ago

Hi there,

I've been looking at this gem and it looks very interesting! You mention that you're currently testing it, and I see the last commit was ~2.5 weeks ago. How is the testing going? Is this close to production ready?

alloy commented 8 years ago

Yup, it’s been running in production for a few weeks now with no problems whatsoever. I was planning on giving it at least a month before I would go v1.

supagroova commented 8 years ago

Ok, thanks a lot for the response! I think we’ll give it a try soon too. Will let you know how we fare.

On 28 February 2016 at 14:35:49, Eloy Durán (notifications@github.com) wrote:

Yup, it’s been running in production for a few weeks now with no problems whatsoever. I was planning on giving it at least a month before I would go v1.


evrenios commented 8 years ago

I'm waiting for v1 as well. I will try it on staging this week. FYI: I'm sending millions of push notifications on a daily basis.

alloy commented 8 years ago

@supagroova Great, thanks!

alloy commented 8 years ago

@evrenios Awesome! In my case it's in the tens of thousands, so I'd love to hear your experience. I think that if you both give the green light we can safely mark this as v1, so looking forward to hearing from you.

evrenios commented 8 years ago

@alloy after some heavy testing, it sends 38k pushes in around 170 seconds. The open connection count doesn't seem to change anything; I compared 5, 10, 15, 20, and 25 connections with Benchmark.measure.
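
A comparison like the one described could be set up roughly as follows. This is only a sketch: `fake_deliver` and `time_delivery` are hypothetical names, the sleep stands in for a real delivery call, and worker threads stand in for open connections, so the timings say nothing about the gem itself.

```ruby
require "benchmark"

# Hypothetical stand-in for delivering one notification; a real run would
# call the push client here instead of sleeping.
def fake_deliver(_token)
  sleep 0.001
end

# Spreads the tokens over `connections` worker threads and returns the
# elapsed wall-clock time in seconds, as reported by Benchmark.measure.
def time_delivery(tokens, connections)
  queue = Queue.new
  tokens.each { |token| queue << token }
  queue.close # once drained, pop returns nil and each worker loop ends

  Benchmark.measure do
    workers = Array.new(connections) do
      Thread.new do
        while (token = queue.pop)
          fake_deliver(token)
        end
      end
    end
    workers.each(&:join)
  end.real
end

TOKENS = Array.new(200) { |i| "token-#{i}" }

# Same connection counts as in the comment above.
RESULTS = [5, 10, 15, 20, 25].map { |n| [n, time_delivery(TOKENS, n)] }.to_h

RESULTS.each do |connections, seconds|
  puts format("%2d connections: %.3fs", connections, seconds)
end
```

With a real client in place of `fake_deliver`, flat timings across the connection counts would suggest the bottleneck sits somewhere other than the number of connections.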

Pros: delivery consistency. With the individual responses, you get to know which token is corrupted, and the tokens after a bad one still receive their push notifications.

Cons: with that consistency, it lost its speed. With the grocer gem, I sent the same number of push notifications in 16 seconds.

Do you think you can increase the throughput ?

I manage my own GCM system with Typhoeus and send each POST separately to Google, opening and closing a connection every time. It's able to send 140k pushes in 50 seconds, so I expected much more in terms of performance.
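
For reference, the figures quoted above can be turned into per-second rates. The snippet below is a small sketch using only the numbers from this comment; `REPORTED` and `per_second` are names made up for illustration. Note that the third figure is for GCM rather than APNs, so it is not a like-for-like comparison.

```ruby
# Figures as reported in the comment above: [notifications sent, seconds taken].
REPORTED = {
  "lowdown" => [38_000, 170],
  "grocer" => [38_000, 16],
  "custom GCM sender" => [140_000, 50],
}

# Average throughput in notifications per second.
def per_second(count, seconds)
  count.fdiv(seconds)
end

RATES = REPORTED.map { |name, (count, seconds)| [name, per_second(count, seconds)] }.to_h

RATES.each do |name, rate|
  puts format("%-18s %8.1f notifications/sec", name, rate)
end
```

That puts the reported runs at roughly 224, 2375, and 2800 notifications per second respectively.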

alloy commented 8 years ago

Interesting. I will do some more measuring to see if I can spot an area where it might be lagging, but it would be great if you could do so too, so we can compare notes.

mkonecny commented 8 years ago

Can confirm it is working very well for us so far (two weeks in production).

swelther commented 8 years ago

Works well here too, if not sent in a delayed job.

How are you guys sending the messages in the background?

swelther commented 8 years ago

After #12 it works now here too with DJ :)

dchersey commented 8 years ago

This looks great! I am concerned about throughput, but not initially. Happy to be part of the early adoption here; please update with any progress on the throughput end!

nextofsearch commented 8 years ago

Hi @alloy I am wondering if you've made progress to increase the throughput. We're sending at least 2M per day so the throughput does matter. I appreciate your great work.

mkonecny commented 8 years ago

@nextofsearch If you are sending 2 million messages per day, you should have the resources to make this improvement with your team and make a pull request. The author has done a great job of getting an early version out with decent performance.

nextofsearch commented 8 years ago

@mkonecny I did say that I appreciate the author's work and I really do. You made the assumption that my team has the resources, but on the contrary we don't, so we are considering using AWS's SNS instead. MYOB.

alloy commented 8 years ago

@nextofsearch Alas, I have not. I guess whether or not that matters for your situation depends on how your notifications are being sent: is it in a few bursts, or are they transactional ones throughout the day? Because 2M notifications throughout the day is ~23 notifications/sec, which I don’t believe should be a problem.

I’d love for you to try and report back! But alas I can’t make any promises about when I will have time when it’s not directly related to my business purpose; in that regard @mkonecny's suggestion about PRs (or just investigation into causes of bottlenecks) is of course correct, but I totally understand you needing to prioritise according to your own business 👍 And thanks for your appreciation!
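
The ~23 notifications/sec figure is just the daily total spread evenly over 24 hours. A rough sketch of that arithmetic, next to a bursty case, is below; the burst shape (4 bursts of 10 minutes each) is a made-up assumption for illustration, not something reported in this thread, and the method names are hypothetical.

```ruby
SECONDS_PER_DAY = 24 * 60 * 60 # 86_400

# Rate needed if the daily total is spread evenly over the whole day.
def average_rate(per_day)
  per_day.fdiv(SECONDS_PER_DAY)
end

# Rate needed if the same total is squeezed into `bursts` windows of
# `burst_seconds` each (hypothetical shape, for comparison only).
def burst_rate(per_day, bursts, burst_seconds)
  per_day.fdiv(bursts * burst_seconds)
end

puts format("evenly spread: %.1f notifications/sec", average_rate(2_000_000))
puts format("4 x 10 min bursts: %.1f notifications/sec", burst_rate(2_000_000, 4, 600))
```

The same daily volume can demand a much higher sustained rate once it is concentrated into bursts, which is why the sending pattern matters more than the daily total.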

alloy commented 8 years ago

@mkonecny @swelther @dchersey Thanks for letting me know!

alloy commented 8 years ago

I’m thinking about calling the current version a v1 soon and then looking into possible performance improvements for a v2. Completely naive thoughts (needs more proper profiling!) are:

Replace Celluloid with concurrent-ruby.
Replace pure Ruby HTTP/2 client with a C based version.

nextofsearch commented 8 years ago

@alloy It's in a few bursts hitting the limit of CPU utilization from time to time. We haven't decided yet but if we decide to go with this gem, I will share the report. Thank you again.

fedenusy commented 8 years ago

@alloy FWIW the Celluloid dependency is the only reason I'm staying away from this gem. Celluloid got noticeably slower starting with 0.17. I'd avoid using Celluloid until this issue gets resolved.

That issue's been open for 6 months now, so I'm kinda starting to lose hope for Celluloid's performance. In fact we're thinking about moving internal projects over to Akka.

You should always benchmark, but I'd be willing to bet Celluloid's your bottleneck. If you lock to Celluloid 0.16 you should see immediate performance gains.

IMO you should ditch Celluloid altogether in favor of concurrent-ruby. Worked well for sidekiq.

alloy commented 8 years ago

@fedenusy Yeah I think you’re right and have been looking at concurrent-ruby.

amitsaxena commented 8 years ago

We started using it in production a few days back and we have been seeing the errors described here: https://github.com/alloy/lowdown/issues/25. Looks like those "Actor crashed!" errors are mostly due to Celluloid. We never saw them in the staging environment.

The errors were so frequent that we had to roll back to old style APNS to avoid further end user problems. Bad firefighting night/morning! I'll try to look into what's causing the exceptions once I have had some sleep ;)

alloy commented 8 years ago

Pushed a v1 gem and filed a new ticket for the v2 changes https://github.com/alloy/lowdown/issues/27.

Battle Testing #11