Closed raRaRa closed 6 years ago
Also getting this error. Server appears down. Getting 502s from Cloudflare.
Some roadworks from Grand Paris have damaged the fibre cable connecting our datacenters. There is no cutoff yet, but the situation is highly unstable. Tomorrow the contractor will start repairs by replacing the broken section of the cable. For now I will try to switch to our backup servers.
Switched to Roubaix DC. Performance will be impacted, but should be more stable now.
I will need to evaluate why our monitoring didn't catch this, thanks for reporting the issues!
Thank you for this awesome service and for your quick response @andrieslouw.
I would also like to thank you for the incredibly quick response and fix.
I have two questions, not really related to this issue but here it goes: I've been thinking about hosting imagesweserv in AWS. Do you have any rough cost estimate for hosting the service with proper CDN in front, e.g. Cloudflare or Cloudfront? What are you paying? :)
And my last question, do you guys earn anything from this service, and is it just going to be free forever? :)
Thanks!
As I had a similar question earlier, I may need to update our wiki, but let me recap: we never accepted payments, and all our support is best-effort. Our focus is helping small start-ups, the self-employed, and personal websites with something that should be easy nowadays (resizing images on-the-fly). I started writing the original script around 2005, in PHP 3. In those days you wouldn't even try to e-mail full-size pictures, unless you got invited to Gmail.
It's 2017 now, people live-stream 4K content for next to nothing, and you just need to know your way around the internet. You'll only need a small PHP-FPM/Nginx stack, a modern multicore processor, 32GB of DDR4, and 1 Gbit/s of internet connectivity to resize 200 million images/month. Be careful when using VPSes, nothing beats the sheer I/O and low latencies of bare-metal. CloudFlare caches around 75% of all our requests, but the main benefit is their global presence and CDN. It really keeps network performance to clients in check, because TCP/IP is really shitty when coping with high latencies and lossy long-distance connections (e.g. mobile clients).
Our main concern is bandwidth: while we buy data transfer quite cheap (<€1 per terabyte), our connection is limited to prevent overuse of the free service. I would advise you to look into OVH.fr (Kimsufi), Online.net or Hetzner.de (Robot sale) for cheap offers on dedicated servers with plenty of bandwidth & processing power.
Everybody should feel free to experiment with our free service, try our code, and shoot us a question or two if they run into something strange. And please be kind to our other users, because kindness and joint effort have kept the internet afloat since 1969, and probably will do so for many years to come.
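To put the figures above in perspective, here is a rough back-of-envelope sketch. It assumes a 30-day month and treats the 200 million images/month as the total number of requests, of which about 75% are served from the CloudFlare cache; the thread does not say whether that is the right reading, and averages hide traffic peaks. The function names are made up for this illustration.

```python
# Back-of-envelope capacity arithmetic; all inputs are assumptions taken
# loosely from the figures mentioned in this thread.
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month


def average_rps(requests_per_month: float) -> float:
    """Average requests per second over a 30-day month."""
    return requests_per_month / SECONDS_PER_MONTH


def origin_rps(requests_per_month: float, cache_hit_ratio: float) -> float:
    """Average requests per second that miss the CDN cache and reach the origin."""
    return average_rps(requests_per_month) * (1.0 - cache_hit_ratio)


def max_monthly_terabytes(link_gbit_per_s: float) -> float:
    """Upper bound on monthly transfer (decimal TB) for a fully saturated link."""
    bytes_per_second = link_gbit_per_s * 1e9 / 8
    return bytes_per_second * SECONDS_PER_MONTH / 1e12


print(f"average load:   {average_rps(200e6):.1f} req/s")
print(f"origin load:    {origin_rps(200e6, 0.75):.1f} req/s")
print(f"1 Gbit/s limit: {max_monthly_terabytes(1):.0f} TB/month")
```

Under these assumptions the average load is in the tens of requests per second, and a saturated 1 Gbit/s link moves a few hundred terabytes per month, so at under €1 per terabyte the transfer bill stays in the low hundreds of euros even in the worst case.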
I've found the root cause of our monitoring problems: while we do monitor our links, communication with CloudFlare and between servers is IPv6-preferred. It looks like the IPv6 stack flapped, causing CloudFlare and gateways to go haywire. Monitoring didn't spot any issues, because it has IPv4-preference on the same links. The resulting drop in requests didn't trigger alarms, and most of the errors were between CloudFlare and our servers, which aren't reported.
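One way to avoid this kind of blind spot is to probe each address family explicitly instead of relying on whichever one the resolver prefers. The following is a minimal sketch using only Python's standard `socket` module; the function names, the TCP-connect check, and the timeout are assumptions for illustration and not the monitoring actually used here.

```python
import socket


def probe(host, port, family, timeout=3.0):
    """Return True if a TCP connection over the given address family succeeds."""
    try:
        # Restrict resolution to one family (AF_INET or AF_INET6).
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except OSError:
        # No address of this family, or resolution failed.
        return False
    for fam, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(fam, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            continue  # try the next resolved address, if any
    return False


def probe_dual_stack(host, port, timeout=3.0):
    """Check IPv4 and IPv6 reachability separately and report both results."""
    return {
        "ipv4": probe(host, port, socket.AF_INET, timeout),
        "ipv6": probe(host, port, socket.AF_INET6, timeout),
    }

# Example (hypothetical host): probe_dual_stack("example.org", 443)
```

A result where `ipv4` is true and `ipv6` is false would point at exactly the situation described above: the link looks healthy to an IPv4-preferring monitor while IPv6-preferring traffic fails.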
I also discovered that "bad links" are not detected, only failing links are. So some packet loss seems acceptable for the monitoring. Will keep the service in Roubaix until all fibers are fully restored, and will look into suitable monitoring solutions like smokeping to prevent this in the future.
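The difference between a "bad" link and a failing one can be illustrated with a small sketch that classifies a link by its loss ratio over a series of probes, rather than by whether the last probe succeeded. This is not how smokeping is implemented; the function names and the 2% threshold are assumptions chosen for the example, and each boolean could come from any reachability check (ping, TCP connect, HTTP request).

```python
from typing import Sequence


def loss_ratio(results: Sequence[bool]) -> float:
    """Fraction of failed probes; each entry is True if that probe succeeded."""
    if not results:
        raise ValueError("need at least one probe result")
    failed = sum(1 for ok in results if not ok)
    return failed / len(results)


def classify_link(results: Sequence[bool], degraded_above: float = 0.02) -> str:
    """Classify a link as 'ok', 'degraded' or 'down' from recent probe results.

    degraded_above is a hypothetical threshold: loss above it is reported
    even though the link still passes some traffic.
    """
    loss = loss_ratio(results)
    if loss >= 1.0:
        return "down"
    if loss > degraded_above:
        return "degraded"
    return "ok"


print(classify_link([True] * 9 + [False]))  # 10% loss on a link that is still up
```

A monitor that only alerts on `"down"` would have stayed silent in this incident; alerting on `"degraded"` as well is what catches a lossy but not yet dead link.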
Many thanks for the quick ticket, it saved the day!
Fibers have been checked, additional monitoring is in place, and the service has been switched back to the Paris DC. Will need to write some additional code to piece all monitoring together, but performance should be back in check now. I'm still investigating the IPv6-specific issues, to keep them from happening again.
Reopening this issue, as it occurred again today from 07:00 (UTC) till 09:05.
Will track progress in #157, closing again. :smile:
I get this error back: Error 502 Ray ID: 3ac221117f3c0a6c • 2017-10-11 13:22:36 UTC Bad gateway
CloudFlare isn't able to reach your server.