camdram / camdram

The portal website for student theatre in Cambridge
https://www.camdram.net/
GNU General Public License v2.0

Graceful handling of external request timeouts and errors #512

Open CHTJonas opened 6 years ago

CHTJonas commented 6 years ago

The /usr/local/bin/camdram-console camdram:entities:social-news --env=prod > /dev/null cron job will very occasionally time out and error, outputting something along the lines of:

In TwitterOAuth.php line 551:

  Operation timed out after 5000 milliseconds with 0 bytes received

We should probably look at catching the error, pausing for a few seconds, retrying up to twice and then exiting. The above console task is triggered every so often and HTTP timeouts to the Twitter API are rare, so this is not a particularly high-priority bug.
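A minimal sketch of the catch, pause and retry idea, written in Go purely for illustration (the existing task is PHP, and none of this is Camdram code). The names retry, flakyFetch and errTimeout are hypothetical, and the fetch function is a stand-in that simulates a timeout rather than calling Twitter. It retries on any error, not only on timeouts.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"time"
)

// retry calls fn once and then up to maxRetries more times, sleeping for
// pause before each retry. It returns nil on the first success, or the last
// error wrapped with the number of attempts made.
func retry(maxRetries int, pause time.Duration, fn func() error) error {
	var err error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		if attempt > 0 {
			time.Sleep(pause)
		}
		if err = fn(); err == nil {
			return nil
		}
		fmt.Fprintf(os.Stderr, "attempt %d failed: %v\n", attempt+1, err)
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxRetries+1, err)
}

// errTimeout stands in for the timeout reported by the Twitter client.
var errTimeout = errors.New("operation timed out")

// flakyFetch returns a hypothetical fetch function that fails its first
// `failures` calls, plus a pointer to its call counter.
func flakyFetch(failures int) (func() error, *int) {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return errTimeout
		}
		return nil
	}, &calls
}

func main() {
	fetch, calls := flakyFetch(1)
	// Pause for a few seconds and retry up to twice, then exit non-zero.
	if err := retry(2, 3*time.Second, fetch); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("succeeded after %d call(s)\n", *calls)
}
```

Exiting non-zero only after the final attempt means cron would only report the failures that survive both retries.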

hoyes commented 5 years ago

Alternatively, it looks like it might be possible to create a long-lived streaming connection for specific user ids: https://developer.twitter.com/en/docs/tweets/filter-realtime/api-reference/post-statuses-filter.html

hoyes commented 5 years ago

I've just downloaded and tested https://github.com/dghubble/go-twitter/blob/master/examples/streaming.go with the Camdram API keys and it works, and that example could probably be adapted for our use without much effort. The stream reports deleted tweets too (for #525).

Wondering about reimplementing this as a standalone service that:

Camdram code can then proxy this service from the home page and society/venue pages to generate the output.

Plus PHP's never been great for long-running processes so might be a nice opportunity to diversify the backend stack a little (and #548 might have similar requirements too)...

CHTJonas commented 5 years ago

Ooo interesting! I've been toying around with Golang myself a tiny bit and it looks quite nice, especially as it compiles to a binary but has lovely standard libraries like Ruby or Python (can also cross-compile for different architectures super easily). I worry a bit about running too much stuff on the server and running out of RAM, bearing in mind we're also running a database and a search engine on the JVM there. Maybe I'm being over-cautious...

The above does sound really nice! I think the general direction that modern SaaS offerings (which I guess is basically what Camdram is) tend to be heading is towards microservices. Whether we care about that much given we're tiny compared to most remains to be seen, but it's definitely worth discussing! Personally I do like the idea of offloading Social Media handling (and maybe the GitHub stuff on the dev page) to a separate service so that code is a bit more compartmentalised.

hoyes commented 5 years ago

I'm a fan of Go too, but coming from the C++ direction - it kinda hits a sweet spot between the interpreted and compiled worlds.

Yeah, I'm a bit concerned about RAM too. My thinking on this was that the amount of RAM required to hold all recent news items in memory in Go should be less than the RAM required to render the home page news feed in PHP at the moment, so this service alone shouldn't significantly affect the memory footprint. But we should probably measure and verify.
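As a rough way to measure that, here is a sketch of a bounded in-memory store plus a heap measurement using runtime.ReadMemStats. NewsItem, Store and the limit of 10000 items are all assumptions made up for illustration, not the real Camdram data model, and the figure it prints will vary by machine and Go version.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// NewsItem is a hypothetical, trimmed-down news record.
type NewsItem struct {
	ID   int64
	Text string
}

// Store keeps at most limit items, dropping the oldest first.
type Store struct {
	mu    sync.Mutex
	limit int
	items []NewsItem // oldest first
}

func NewStore(limit int) *Store { return &Store{limit: limit} }

// Add appends an item and trims the store back down to its limit.
func (s *Store) Add(item NewsItem) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items = append(s.items, item)
	if len(s.items) > s.limit {
		// Copy into a fresh slice so dropped items can be garbage collected.
		s.items = append([]NewsItem(nil), s.items[len(s.items)-s.limit:]...)
	}
}

// Delete removes the item with the given ID, if present (for example when
// the stream reports a deleted tweet).
func (s *Store) Delete(id int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for i, it := range s.items {
		if it.ID == id {
			s.items = append(s.items[:i], s.items[i+1:]...)
			return
		}
	}
}

// Len reports how many items are currently held.
func (s *Store) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.items)
}

// heapAlloc forces a garbage collection and returns the live heap size.
func heapAlloc() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

func main() {
	before := heapAlloc()
	store := NewStore(10000)
	for i := int64(0); i < 10000; i++ {
		store.Add(NewsItem{ID: i, Text: fmt.Sprintf("example news item number %d", i)})
	}
	after := heapAlloc()
	// Signed arithmetic avoids underflow if the heap happened to shrink.
	fmt.Printf("approx. heap used: %d KiB\n", (int64(after)-int64(before))/1024)
	// Using the store here keeps it alive until after the measurement.
	fmt.Printf("items held: %d\n", store.Len())
}
```

This only measures the Go side; comparing it with the PHP home page render would need a separate measurement on the server.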

I tried to resist referring to "microservices" above, given there's a bit of an ideology around the concept which probably isn't so relevant for us as you say - it mainly comes from the need to scale workloads amongst massive engineering teams. There is a bit of a trend though towards long-lived processes on the backend, real-time data processing and single-page apps which I reckon are bandwagons we should be thinking about... We can perhaps avoid some of the complexity bloat of microservices by keeping everything in the same repo, Travis build and deployment.

I reckon if we were to pick one additional language on the backend for the next 5 years, Go is a pretty decent bet - it was very prevalent in my recent job hunt... but open to other suggestions (I considered NodeJS?). We might need to think about something like #87 alongside this to ensure that it remains fairly straightforward to create a dev setup...

CHTJonas commented 5 years ago

My personal opinion? I would vote Go above Node. The Twitter streaming API is basically a super-long-lived HTTP connection, right? I'd try and go for something that has a low resource footprint. As you say @hoyes, I think we should pick what feels right for 'us' re development effort etc. and not what is best for e.g. Twitter, Netflix or Facebook.

CHTJonas commented 5 years ago

Looks like it's not just the cron job that can time out: CAMDRAM-WEB-89

philosophicles commented 5 years ago

The originally-mentioned cron job is getting much more frequent timeouts recently - many per day.