lnug / speakers

Open an issue to submit a talk
https://github.com/lnug/speakers

NodeJS Failure in Production - a Blameless PostMortem #69

Closed theninj4 closed 7 years ago

theninj4 commented 8 years ago

Failure happens. Are you ready for it? Here at Holiday Extras we've been running NodeJS in production for 5 years, and it's been a critical part of our infrastructure for the last 3. In June 2015 we thought we had solved failure... then one day our API crumbled around us. We followed the incident with a Blameless PostMortem, which forms the basis for this talk. I'll cover how we use NodeJS in production and the countermeasures we had in place, then give a chronological run-through of what happened, the metrics we care about, the mitigations and the solutions.

admataz commented 8 years ago

Sounds like it could be, with some good storytelling, an interesting deep dive into some code challenges that could provide helpful insights for us all. I would be interested to hear more.

theninj4 commented 8 years ago

I can't really tell you much about the bugs in question without ruining the talk for anyone who comes (stories suck when you know the ending). I will tell you they're not bugs in code we've written, but pertain to quirky / non-obvious behaviour in the NodeJS core.

I'd be presenting both stories as a timeline of their discovery through mitigation, understanding and resolution / workaround.

admataz commented 8 years ago

Cool, sounds good. The only thing I'd suggest, as a storytelling technique, is rewriting your initial description and talk title above with some more dramatic anticipation, as that is what will appear in the newsletter and on the website, and it will be your hook to get people in to hear you.

iancrowther commented 8 years ago

@theninj4 - can you do March or April? I think we have an open slot next week!

theninj4 commented 8 years ago

I'm on holiday at the moment, getting back to the UK on Monday. Next week would be pushing it; would April be better? I'll tweak the talk description as per @admataz's suggestion when I'm back :+1:

iancrowther commented 8 years ago

Great, April.

iancrowther commented 8 years ago

@theninj4 can I confirm this?

iancrowther commented 8 years ago

cc'd @simonmcmanus

theninj4 commented 8 years ago

@iancrowther I'm trying to get my hands on some screenshots of real metrics, it's taking slightly longer than expected. Can I get back to you late tonight / tomorrow?

theninj4 commented 8 years ago

Sorry @iancrowther I'm having to jump through a lot of hoops (read: trying to restore backups) to get some good screenshots and evidence to support the stories and make it really enjoyable to watch. Can we postpone?

iancrowther commented 8 years ago

Moved to May, can you confirm?

theninj4 commented 8 years ago

Yes, let's do it.

iancrowther commented 8 years ago

Locked & Loaded :-)

theninj4 commented 8 years ago

Do you guys mind if I tweak the description a little? Or is it too late?

admataz commented 8 years ago

@theninj4 - that should be fine - let us know when it's done.

theninj4 commented 8 years ago

@admataz - Done. It works much better with the single story instead of two :+1: