vapor / apns

Helpful extensions and abstractions for using APNSwift
MIT License
115 stars 30 forks source link

Connection timed out deadlock issue #28

Closed madsodgaard closed 3 years ago

madsodgaard commented 3 years ago

Describe the bug

We are trying to send push notifications with APNS from a queues job. However, we are running into issues with the following error:

Connection request timed out. This might indicate a connection deadlock in your application. If you're running long running requests, consider increasing your connection timeout. (AsyncKit/ConnectionPool/EventLoopConnectionPool.swift:213)

\ After which all other attempts to send another push notification returns the same error. It is almost as if a pipe is clocked up by a request that never completes and releases the connection.

Example

In my example I am dequeuing two jobs Job A and Job B. Job A is sending a push notification to a token and Job B is sending a push notification.

Job A is ran first:

2021-01-25T14:11:17.210+01:00   [ codes.vapor.application ] [ DEBUG ] Send - starting up (APNSwift/APNSwiftConnection.swift:175)
2021-01-25T14:11:17.210+01:00   [ codes.vapor.application ] [ INFO ] Send - sending (APNSwift/APNSwiftConnection.swift:206)
2021-01-25T14:11:17.211+01:00   [ codes.vapor.application ] [ DEBUG ] Request - building (APNSwift/APNSwiftRequestEncoder.swift:70)
2021-01-25T14:11:17.211+01:00   [ codes.vapor.application ] [ TRACE ] Request - built (APNSwift/APNSwiftRequestEncoder.swift:103)
2021-01-25T14:11:17.211+01:00   [ codes.vapor.application ] [ TRACE ] Request - sent (APNSwift/APNSwiftRequestEncoder.swift:107)

as we can see the request is sent.

Now comes Job B right after to run. The environment has 1 eventloop, therefore 1 connection in the pool used by APNS. Therefor this request is put on the waitlist for the other one to finish.

2021-01-25T14:11:17.870+01:00   [ codes.vapor.application ] [ DEBUG ] Connection pool exhausted on this event loop, adding request to waitlist (AsyncKit/ConnectionPool/EventLoopConnectionPool.swift:207)

\ Exactly 10 seconds later (the standard connectionPoolTimeout) this error occurs:

2021-01-25T14:11:27.871+01:00   [ codes.vapor.application ] [ ERROR ] Connection request timed out. This might indicate a connection deadlock in your application. If you're running long running requests, consider increasing your connection timeout. (AsyncKit/ConnectionPool/EventLoopConnectionPool.swift:213)

\ All subsequent attempts to send a notification returns the same error, it is as if Job A is still using the connection, so others can't.

Potential fixes

I can't think of any other solution than to increase the connection pool timeout, but still 10 seconds is a long time. I don't understand why the response is not returned within that period, even if it's a failure. I would also expect the library to release the connection if it timed out.

Note: This error is not always reproducible, but it happens quite often (daily) and it requires us to restart our servers (to unblock the connection), so it's kind of a big issue.

madsodgaard commented 3 years ago

@kylebrowning Do you have an idea about this issue? It's really making it hard for us to use this library in production, as we constantly have to monitor the logs and restart our servers if the connection timed out.

kylebrowning commented 3 years ago

Unsure. Last time I looked into this it's an issue on apples end.

madsodgaard commented 3 years ago

Would it be possible to implement a timeout function, so that if it has not received a response within a certain time period it releases the connection to the pool again, so it stops blocking other requests?

Also, what's the status of #24 - so we perhaps could try with a longer timeout?

kylebrowning commented 3 years ago

That sounds really possible, unfortunately Ive got a done of day job work right now, and my weekends are filled. Are you able to try that? Its probably doing to be a kylebrowning/APNSwift thing instead of here.

I dont think #24 will help you because when I researched this Apple just never returns anything.

madsodgaard commented 3 years ago

I created a PR for it here: https://github.com/kylebrowning/APNSwift/pull/101

kylebrowning commented 3 years ago

This was resolved upstream in version 2.3.0