Chatie / server

Cloud Management Service for Chatie
https://www.chatie.io
Apache License 2.0

Chatie.io server down #55

Closed huan closed 3 years ago

huan commented 3 years ago

"You can maintain around 6,000 open connections on a single dyno. Creating more than 160 connections/sec will cause H11 errors (backlog too deep)." Source: http://veldstra.org/2013/10/25/heroku-websocket-performance-test.html

"There are no limitations on the free dyno tier in terms of resources. In terms of concurrent connections there is a theoretical limit of 50 connections per Heroku router instance but we don't currently publicise the number of router instances running at any one time. In general, for the EU region you can expect to have around 1500 connections available at any given time." --staff reply to me. Later he clarified, this limit is for all dyno types. even paid ones. – Sourav Ghosh Sep 20 '16 at 14:19 — https://stackoverflow.com/a/25154488/1123955

2021-04-14T14:11:03.885895+00:00 heroku[router]: at=error code=H11 desc="Backlog too deep" method=GET path="/v0/websocket" host=api.chatie.io request_id=59ec8c0c-aba6-4a7a-a7a1-c1d6ff798fc9 fwd="52.83.49.48,108.162.215.120" dyno= connect= service= status=503 bytes= protocol=http

2021-04-14T14:12:01.619098+00:00 heroku[router]: at=error code=H10 desc="App crashed" method=GET path="/v0/websocket" host=api.chatie.io request_id=d0fa3403-5d6f-4f42-a2f9-e81de0bfb6c3 fwd="52.82.109.225,108.162.215.108" dyno= connect= service= status=503 bytes= protocol=http
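
To see how often each error occurs, router lines like the two above can be tallied by their `code` field. The following is a minimal sketch, assuming the logs have been saved as plain text with one router entry per line; the helper names `parse_router_line` and `count_error_codes` are hypothetical and not part of any Heroku tooling.

```python
import shlex
from collections import Counter
from typing import Dict, Iterable


def parse_router_line(line: str) -> Dict[str, str]:
    """Parse one Heroku router log line into a dict of its key=value fields."""
    # Drop the "<timestamp> heroku[router]: " prefix when it is present.
    _, sep, rest = line.partition(": ")
    payload = rest if sep else line
    fields: Dict[str, str] = {}
    # shlex keeps quoted values such as desc="Backlog too deep" in one token.
    for token in shlex.split(payload):
        key, eq, value = token.partition("=")
        if eq:
            fields[key] = value
    return fields


def count_error_codes(lines: Iterable[str]) -> Counter:
    """Count router errors (at=error) grouped by their Heroku error code."""
    counts: Counter = Counter()
    for line in lines:
        fields = parse_router_line(line)
        if fields.get("at") == "error":
            counts[fields.get("code", "unknown")] += 1
    return counts


# The two router entries quoted in this issue.
SAMPLE = [
    '2021-04-14T14:11:03.885895+00:00 heroku[router]: at=error code=H11 '
    'desc="Backlog too deep" method=GET path="/v0/websocket" host=api.chatie.io '
    'request_id=59ec8c0c-aba6-4a7a-a7a1-c1d6ff798fc9 '
    'fwd="52.83.49.48,108.162.215.120" dyno= connect= service= status=503 '
    'bytes= protocol=http',
    '2021-04-14T14:12:01.619098+00:00 heroku[router]: at=error code=H10 '
    'desc="App crashed" method=GET path="/v0/websocket" host=api.chatie.io '
    'request_id=d0fa3403-5d6f-4f42-a2f9-e81de0bfb6c3 '
    'fwd="52.82.109.225,108.162.215.108" dyno= connect= service= status=503 '
    'bytes= protocol=http',
]

if __name__ == "__main__":
    for code, n in sorted(count_error_codes(SAMPLE).items()):
        print(code, n)
```

Counting by code makes it easier to tell router-side saturation (H11) apart from crashes of the app itself (H10) when scanning a longer log.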

huan commented 3 years ago

Problem

The Heroku dyno suddenly started failing with error H11 ("Backlog too deep").

In the past, we could hold 3000+ WebSocket connections on a single free dyno.

As of this week, it seems that we need 4 paid pro dynos to prevent the H11 error.

Solution

We use 4 dynos for our service now.

This is a workaround.

Issues

Because the service was not designed to scale horizontally, a query now fails about 3 times for every 1 success on average.

01:00:20 VERB Wechaty wechatifyUserModules(Puppet#0<PuppetService>(ding-dong-bot))
01:00:20 VERB PuppetService start()
01:00:20 VERB StateSwitch <PuppetService> on(pending) <- (false)
01:00:20 VERB PuppetService startGrpcClient()
01:00:20 VERB PuppetService discoverServiceIp(puppet_donut_XXX)
01:00:20 WARN No endpoint when starting grpc client, 10 retry left. Reconnecting in 10 seconds... 
01:00:30 VERB PuppetService discoverServiceIp(puppet_donut_XXX)
01:00:32 WARN No endpoint when starting grpc client, 9 retry left. Reconnecting in 10 seconds... 
01:00:42 VERB PuppetService discoverServiceIp(puppet_donut_XXX)
01:00:43 WARN No endpoint when starting grpc client, 8 retry left. Reconnecting in 10 seconds... 
01:00:53 VERB PuppetService discoverServiceIp(puppet_donut_XXX)
01:00:54 WARN No endpoint when starting grpc client, 7 retry left. Reconnecting in 10 seconds... 
01:01:04 VERB PuppetService discoverServiceIp(puppet_donut_XXX)
01:01:05 WARN No endpoint when starting grpc client, 6 retry left. Reconnecting in 10 seconds... 
01:01:15 VERB PuppetService discoverServiceIp(puppet_donut_XXX)
01:01:15 VERB PuppetService startGrpcStream()
01:01:16 VERB PuppetService onGrpcStreamEvent({type:EVENT_TYPE_SCAN(22), payload:"{"qrcode":"http://weixin.qq.com/x/AddMKwOGpxHRGRQNPWh5","status":2}"})
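
The "3 failures for every success" figure can be turned into a rough estimate of how well a retrying client copes. The sketch below assumes, as a simplification not stated in this issue, that each lookup independently succeeds with probability 1/4 (one chance per dyno); both function names are hypothetical.

```python
def p_success_within(attempts: int, dynos: int = 4) -> float:
    """Probability that at least one of `attempts` lookups succeeds.

    Assumption: each lookup succeeds independently with probability 1/dynos.
    """
    p_single = 1.0 / dynos
    return 1.0 - (1.0 - p_single) ** attempts


def expected_attempts(dynos: int = 4) -> float:
    """Mean number of lookups until the first success (geometric distribution)."""
    return float(dynos)


if __name__ == "__main__":
    for n in (1, 6, 10):
        print(n, round(p_success_within(n), 4))
    print("expected attempts:", expected_attempts())
```

Under that assumption, ten attempts succeed roughly 94% of the time, which fits the log above, where the client connects on its sixth lookup. This is why the retry loop mostly hides the problem, at the cost of a startup delay of up to about a minute or more.
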

huan commented 3 years ago

We need nginx-proxy to support a higher worker_connections value, a feature proposed in this PR: https://github.com/nginx-proxy/nginx-proxy/pull/973

huan commented 3 years ago

Link to https://github.com/wechaty/wechaty.js.org/pull/786

huan commented 3 years ago

Posted at https://wechaty.js.org/2021/04/15/chatie-api-server-down/