apollographql / apollo-ios

📱  A strongly-typed, caching GraphQL client for iOS, written in Swift.
https://www.apollographql.com/docs/ios/
MIT License

Memory leak from InterceptorRequestChain when ending the chain with returnValueAsync #3057

Closed marksvend closed 1 year ago

marksvend commented 1 year ago

Summary

I'm reporting a memory leak in the iOS Apollo library version 1.2.0 where the request chain does not get deallocated when there is a cache hit.

When a request chain finishes with returnValueAsync and does not call proceedAsync, releaseManagedSelf never gets called. The retain cycle is never broken, and the chain leaks. This typically happens in the success case of CacheReadInterceptor.returnCacheDataElseFetch.

I confirmed that adding a call to self.releaseManagedSelf from InterceptorRequestChain.returnValueAsync solves the memory leak: all of the chain instances get deallocated properly.

However, this solution would break the CacheReadInterceptor.returnCacheDataAndFetch case, which calls both returnValueAsync and proceedAsync. It wouldn't be right to call releaseManagedSelf before the chain has finished, and when returnValueAsync is called there is no way to know whether that is the end of the chain.

There's another leak in the handleErrorAsync function because it does not call self.releaseManagedSelf, either. But that is always the end of the chain so it should be safe to release self there.
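
To make the mechanism concrete, here is a minimal sketch of the self-retain pattern described above. It is not Apollo's source: `SketchChain`, `kickoff`, `returnValue`, `handleError`, and the `leaks` helper are hypothetical names, and the only assumption is that the chain keeps itself alive with a manual `Unmanaged` retain that some terminal path must balance.

```swift
// Hypothetical stand-in for a request chain; not Apollo's actual source.
final class SketchChain {
    private var managedSelf: Unmanaged<SketchChain>?

    // Take a manual +1 retain so the chain outlives the caller's reference.
    func kickoff() {
        managedSelf = Unmanaged.passRetained(self)
    }

    // Balance the manual retain; a second call is a no-op.
    func releaseManagedSelf() {
        let retained = managedSelf
        managedSelf = nil
        retained?.release()
    }

    // Terminal on a cache hit, but it never balances the retain.
    func returnValue() {}

    // Always terminal, so releasing here is safe.
    func handleError() {
        releaseManagedSelf()
    }
}

// Reports whether the chain is still alive after the caller's reference is gone.
func leaks(_ finish: (SketchChain) -> Void) -> Bool {
    weak var probe: SketchChain?
    do {
        let chain = SketchChain()
        probe = chain
        chain.kickoff()
        finish(chain)
    }
    return probe != nil
}

let cacheHitLeaks = leaks { $0.returnValue() }
let errorLeaks = leaks { $0.handleError() }
print(cacheHitLeaks, errorLeaks)  // prints "true false"
```

In this sketch, releasing inside `returnValue` would fix the cache-hit path, but a later proceed call (as in the returnCacheDataAndFetch case) would then be touching a chain that nothing keeps alive.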

Version

1.2.0

Steps to reproduce the behavior

I discovered the problem using the memory graph debugger in Xcode after executing many queries in the app. As long as some of them return cached data, a large multi-node memory cycle will appear in the debugger.

Logs

▿ 68 elements
  - 0 : "0   ???                                 0x000000010d6ed4cc 0x0 + 4520334540"
  - 1 : "1   ???                                 0x000000010d6ed5a7 0x0 + 4520334759"
  - 2 : "2   AppName                                0x0000000100dc9050 main + 0"
  - 3 : "3   AppName                                0x0000000101ca6cd7 $s6Apollo23InterceptorRequestChainC12proceedAsync7request8response10completion11interceptoryAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGcAA0aB16ReentrantWrapperCt0A3API0N11QLOperationRzlF + 1335"
  - 4 : "4   AppName                                0x0000000101c62e6e $s6Apollo0A27InterceptorReentrantWrapperC12proceedAsync7request8response10completionyAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0M11QLOperationRzlF + 174"
  - 5 : "5   AppName                                0x0000000101c63331 $s6Apollo0A27InterceptorReentrantWrapperCAA12RequestChainA2aDP12proceedAsync7request8response10completionyAA11HTTPRequestCyqd__G_AA12HTTPResponseCyqd__GSgys6ResultOyAA13GraphQLResultVy4DataQyd__Gs5Error_pGct0A3API0O11QLOperationRd__lFTW + 17"
  - 6 : "6   AppName                                0x00000001013e0a38 $s18AppNameServiceClients25GraphQLMetricsInterceptorC14interceptAsync5chain7request8response10completiony6Apollo12RequestChain_p_AI11HTTPRequestCyxGAI12HTTPResponseCyxGSgys6ResultOyAI0D8QLResultVy4DataQzGs5Error_pGct0M3API0D11QLOperationRzlF6$deferL_yyA_A0_RzlF + 184"
  - 7 : "7   AppName                                0x00000001013e094e $s18AppNameServiceClients25GraphQLMetricsInterceptorC14interceptAsync5chain7request8response10completiony6Apollo12RequestChain_p_AI11HTTPRequestCyxGAI12HTTPResponseCyxGSgys6ResultOyAI0D8QLResultVy4DataQzGs5Error_pGct0M3API0D11QLOperationRzlF + 350"
  - 8 : "8   AppName                                0x00000001013e57d5 $s18AppNameServiceClients25GraphQLMetricsInterceptorC6Apollo0gF0AadEP14interceptAsync5chain7request8response10completionyAD12RequestChain_p_AD11HTTPRequestCyqd__GAD12HTTPResponseCyqd__GSgys6ResultOyAD0D8QLResultVy4DataQyd__Gs5Error_pGct0G3API0D11QLOperationRd__lFTW + 21"
  - 9 : "9   AppName                                0x0000000101c6350a $s6Apollo0A27InterceptorReentrantWrapperC14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyxGAA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0P11QLOperationRzlF + 314"
  - 10 : "10  AppName                                0x0000000101ca6bd8 $s6Apollo23InterceptorRequestChainC12proceedAsync7request8response10completion11interceptoryAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGcAA0aB16ReentrantWrapperCt0A3API0N11QLOperationRzlF + 1080"
  - 11 : "11  AppName                                0x0000000101c62e6e $s6Apollo0A27InterceptorReentrantWrapperC12proceedAsync7request8response10completionyAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0M11QLOperationRzlF + 174"
  - 12 : "12  AppName                                0x0000000101c63331 $s6Apollo0A27InterceptorReentrantWrapperCAA12RequestChainA2aDP12proceedAsync7request8response10completionyAA11HTTPRequestCyqd__G_AA12HTTPResponseCyqd__GSgys6ResultOyAA13GraphQLResultVy4DataQyd__Gs5Error_pGct0A3API0O11QLOperationRd__lFTW + 17"
  - 13 : "13  AppName                                0x0000000101c6e5b1 $s6Apollo21CacheWriteInterceptorV14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyxGAA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0P11QLOperationRzlF + 1249"
  - 14 : "14  AppName                                0x0000000101c6e8e7 $s6Apollo21CacheWriteInterceptorVAA0aD0A2aDP14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyqd__GAA12HTTPResponseCyqd__GSgys6ResultOyAA13GraphQLResultVy4DataQyd__Gs5Error_pGct0A3API0P11QLOperationRd__lFTW + 23"
  - 15 : "15  AppName                                0x0000000101c6350a $s6Apollo0A27InterceptorReentrantWrapperC14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyxGAA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0P11QLOperationRzlF + 314"
  - 16 : "16  AppName                                0x0000000101ca6bd8 $s6Apollo23InterceptorRequestChainC12proceedAsync7request8response10completion11interceptoryAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGcAA0aB16ReentrantWrapperCt0A3API0N11QLOperationRzlF + 1080"
  - 17 : "17  AppName                                0x0000000101c62e6e $s6Apollo0A27InterceptorReentrantWrapperC12proceedAsync7request8response10completionyAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0M11QLOperationRzlF + 174"
  - 18 : "18  AppName                                0x0000000101c63331 $s6Apollo0A27InterceptorReentrantWrapperCAA12RequestChainA2aDP12proceedAsync7request8response10completionyAA11HTTPRequestCyqd__G_AA12HTTPResponseCyqd__GSgys6ResultOyAA13GraphQLResultVy4DataQyd__Gs5Error_pGct0A3API0O11QLOperationRd__lFTW + 17"
  - 19 : "19  AppName                                0x00000001013e76e2 $s18AppNameServiceClients32GraphQLResponseHeaderInterceptorV14interceptAsync5chain7request8response10completiony6Apollo12RequestChain_p_AI11HTTPRequestCyxGAI12HTTPResponseCyxGSgys6ResultOyAI0D8QLResultVy4DataQzGs5Error_pGct0N3API0D11QLOperationRzlF + 1794"
  - 20 : "20  AppName                                0x00000001013e7df9 $s18AppNameServiceClients32GraphQLResponseHeaderInterceptorV6Apollo0hG0AadEP14interceptAsync5chain7request8response10completionyAD12RequestChain_p_AD11HTTPRequestCyqd__GAD12HTTPResponseCyqd__GSgys6ResultOyAD0D8QLResultVy4DataQyd__Gs5Error_pGct0H3API0D11QLOperationRd__lFTW + 9"
  - 21 : "21  AppName                                0x0000000101c6350a $s6Apollo0A27InterceptorReentrantWrapperC14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyxGAA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0P11QLOperationRzlF + 314"
  - 22 : "22  AppName                                0x0000000101ca6bd8 $s6Apollo23InterceptorRequestChainC12proceedAsync7request8response10completion11interceptoryAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGcAA0aB16ReentrantWrapperCt0A3API0N11QLOperationRzlF + 1080"
  - 23 : "23  AppName                                0x0000000101c62e6e $s6Apollo0A27InterceptorReentrantWrapperC12proceedAsync7request8response10completionyAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0M11QLOperationRzlF + 174"
  - 24 : "24  AppName                                0x0000000101c63331 $s6Apollo0A27InterceptorReentrantWrapperCAA12RequestChainA2aDP12proceedAsync7request8response10completionyAA11HTTPRequestCyqd__G_AA12HTTPResponseCyqd__GSgys6ResultOyAA13GraphQLResultVy4DataQyd__Gs5Error_pGct0A3API0O11QLOperationRd__lFTW + 17"
  - 25 : "25  AppName                                0x0000000101c6b972 $s6Apollo34AutomaticPersistedQueryInterceptorV14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyxGAA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0Q11QLOperationRzlF + 2402"
  - 26 : "26  AppName                                0x0000000101c6c0d9 $s6Apollo34AutomaticPersistedQueryInterceptorVAA0aE0A2aDP14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyqd__GAA12HTTPResponseCyqd__GSgys6ResultOyAA13GraphQLResultVy4DataQyd__Gs5Error_pGct0A3API0Q11QLOperationRd__lFTW + 9"
  - 27 : "27  AppName                                0x0000000101c6350a $s6Apollo0A27InterceptorReentrantWrapperC14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyxGAA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0P11QLOperationRzlF + 314"
  - 28 : "28  AppName                                0x0000000101ca6bd8 $s6Apollo23InterceptorRequestChainC12proceedAsync7request8response10completion11interceptoryAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGcAA0aB16ReentrantWrapperCt0A3API0N11QLOperationRzlF + 1080"
  - 29 : "29  AppName                                0x0000000101c62e6e $s6Apollo0A27InterceptorReentrantWrapperC12proceedAsync7request8response10completionyAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0M11QLOperationRzlF + 174"
  - 30 : "30  AppName                                0x0000000101c63331 $s6Apollo0A27InterceptorReentrantWrapperCAA12RequestChainA2aDP12proceedAsync7request8response10completionyAA11HTTPRequestCyqd__G_AA12HTTPResponseCyqd__GSgys6ResultOyAA13GraphQLResultVy4DataQyd__Gs5Error_pGct0A3API0O11QLOperationRd__lFTW + 17"
  - 31 : "31  AppName                                0x0000000101cad3ca $s6Apollo30JSONResponseParsingInterceptorV14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyxGAA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0P11QLOperationRzlF + 1370"
  - 32 : "32  AppName                                0x0000000101cad919 $s6Apollo30JSONResponseParsingInterceptorVAA0aD0A2aDP14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyqd__GAA12HTTPResponseCyqd__GSgys6ResultOyAA13GraphQLResultVy4DataQyd__Gs5Error_pGct0A3API0P11QLOperationRd__lFTW + 9"
  - 33 : "33  AppName                                0x0000000101c6350a $s6Apollo0A27InterceptorReentrantWrapperC14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyxGAA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0P11QLOperationRzlF + 314"
  - 34 : "34  AppName                                0x0000000101ca6bd8 $s6Apollo23InterceptorRequestChainC12proceedAsync7request8response10completion11interceptoryAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGcAA0aB16ReentrantWrapperCt0A3API0N11QLOperationRzlF + 1080"
  - 35 : "35  AppName                                0x0000000101c62e6e $s6Apollo0A27InterceptorReentrantWrapperC12proceedAsync7request8response10completionyAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0M11QLOperationRzlF + 174"
  - 36 : "36  AppName                                0x0000000101c63331 $s6Apollo0A27InterceptorReentrantWrapperCAA12RequestChainA2aDP12proceedAsync7request8response10completionyAA11HTTPRequestCyqd__G_AA12HTTPResponseCyqd__GSgys6ResultOyAA13GraphQLResultVy4DataQyd__Gs5Error_pGct0A3API0O11QLOperationRd__lFTW + 17"
  - 37 : "37  AppName                                0x0000000101cc3723 $s6Apollo23ResponseCodeInterceptorV14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyxGAA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0P11QLOperationRzlF + 611"
  - 38 : "38  AppName                                0x0000000101cc39d9 $s6Apollo23ResponseCodeInterceptorVAA0aD0A2aDP14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyqd__GAA12HTTPResponseCyqd__GSgys6ResultOyAA13GraphQLResultVy4DataQyd__Gs5Error_pGct0A3API0P11QLOperationRd__lFTW + 9"
  - 39 : "39  AppName                                0x0000000101c6350a $s6Apollo0A27InterceptorReentrantWrapperC14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyxGAA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0P11QLOperationRzlF + 314"
  - 40 : "40  AppName                                0x0000000101ca6bd8 $s6Apollo23InterceptorRequestChainC12proceedAsync7request8response10completion11interceptoryAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGcAA0aB16ReentrantWrapperCt0A3API0N11QLOperationRzlF + 1080"
  - 41 : "41  AppName                                0x0000000101c62e6e $s6Apollo0A27InterceptorReentrantWrapperC12proceedAsync7request8response10completionyAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0M11QLOperationRzlF + 174"
  - 42 : "42  AppName                                0x0000000101c63331 $s6Apollo0A27InterceptorReentrantWrapperCAA12RequestChainA2aDP12proceedAsync7request8response10completionyAA11HTTPRequestCyqd__G_AA12HTTPResponseCyqd__GSgys6ResultOyAA13GraphQLResultVy4DataQyd__Gs5Error_pGct0A3API0O11QLOperationRd__lFTW + 17"
  - 43 : "43  AppName                                0x00000001013e5c79 $s18AppNameServiceClients25GraphQLLatencyTimerFinishC14interceptAsync5chain7request8response10completiony6Apollo12RequestChain_p_AI11HTTPRequestCyxGAI12HTTPResponseCyxGSgys6ResultOyAI0D8QLResultVy4DataQzGs5Error_pGct0N3API0D11QLOperationRzlF + 249"
  - 44 : "44  AppName                                0x00000001013e6075 $s18AppNameServiceClients25GraphQLLatencyTimerFinishC6Apollo0H11InterceptorAadEP14interceptAsync5chain7request8response10completionyAD12RequestChain_p_AD11HTTPRequestCyqd__GAD12HTTPResponseCyqd__GSgys6ResultOyAD0D8QLResultVy4DataQyd__Gs5Error_pGct0H3API0D11QLOperationRd__lFTW + 21"
  - 45 : "45  AppName                                0x0000000101c6350a $s6Apollo0A27InterceptorReentrantWrapperC14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyxGAA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0P11QLOperationRzlF + 314"
  - 46 : "46  AppName                                0x0000000101ca6bd8 $s6Apollo23InterceptorRequestChainC12proceedAsync7request8response10completion11interceptoryAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGcAA0aB16ReentrantWrapperCt0A3API0N11QLOperationRzlF + 1080"
  - 47 : "47  AppName                                0x0000000101c62e6e $s6Apollo0A27InterceptorReentrantWrapperC12proceedAsync7request8response10completionyAA11HTTPRequestCyxG_AA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0M11QLOperationRzlF + 174"
  - 48 : "48  AppName                                0x0000000101c63331 $s6Apollo0A27InterceptorReentrantWrapperCAA12RequestChainA2aDP12proceedAsync7request8response10completionyAA11HTTPRequestCyqd__G_AA12HTTPResponseCyqd__GSgys6ResultOyAA13GraphQLResultVy4DataQyd__Gs5Error_pGct0A3API0O11QLOperationRd__lFTW + 17"
  - 49 : "49  AppName                                0x0000000101cb98f3 $s6Apollo23NetworkFetchInterceptorC14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyxGAA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0P11QLOperationRzlFyARy10FoundationAUV_So17NSHTTPURLResponseCtsAX_pGcfU_ + 1315"
  - 50 : "50  AppName                                0x0000000101cb9a3c $s6Apollo23NetworkFetchInterceptorC14interceptAsync5chain7request8response10completionyAA12RequestChain_p_AA11HTTPRequestCyxGAA12HTTPResponseCyxGSgys6ResultOyAA13GraphQLResultVy4DataQzGs5Error_pGct0A3API0P11QLOperationRzlFyARy10FoundationAUV_So17NSHTTPURLResponseCtsAX_pGcfU_TA + 76"
  - 51 : "51  AppName                                0x0000000101cc9318 $s6Apollo16URLSessionClientC10urlSession_4task20didCompleteWithErrorySo12NSURLSessionC_So0K4TaskCs0J0_pSgtF + 1240"
  - 52 : "52  AppName                                0x0000000101cc9666 $s6Apollo16URLSessionClientC10urlSession_4task20didCompleteWithErrorySo12NSURLSessionC_So0K4TaskCs0J0_pSgtFTo + 134"
  - 53 : "53  CFNetwork                           0x00007ff804105dbd CFNetwork + 56765"
  - 54 : "54  Foundation                          0x00007ff800bd3d1a __NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 7"
  - 55 : "55  Foundation                          0x00007ff800bd3c12 -[NSBlockOperation main] + 98"
  - 56 : "56  Foundation                          0x00007ff800bd6c17 __NSOPERATION_IS_INVOKING_MAIN__ + 17"
  - 57 : "57  Foundation                          0x00007ff800bd2e7f -[NSOperation start] + 782"
  - 58 : "58  Foundation                          0x00007ff800bd755d __NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__ + 17"
  - 59 : "59  Foundation                          0x00007ff800bd70ab __NSOQSchedule_f + 182"
  - 60 : "60  libdispatch.dylib                   0x000000010c07b0fd _dispatch_block_async_invoke2 + 83"
  - 61 : "61  libdispatch.dylib                   0x000000010c0697ec _dispatch_client_callout + 8"
  - 62 : "62  libdispatch.dylib                   0x000000010c06ca44 _dispatch_continuation_pop + 836"
  - 63 : "63  libdispatch.dylib                   0x000000010c06b806 _dispatch_async_redirect_invoke + 997"
  - 64 : "64  libdispatch.dylib                   0x000000010c07e8d7 _dispatch_root_queue_drain + 414"
  - 65 : "65  libdispatch.dylib                   0x000000010c07f57c _dispatch_worker_thread2 + 278"
  - 66 : "66  libsystem_pthread.dylib             0x00007ff837749c0f _pthread_wqthread + 257"
  - 67 : "67  libsystem_pthread.dylib             0x00007ff837748bbf start_wqthread + 15"

Anything else?

No response

calvincestari commented 1 year ago

Hi @marksvend, thanks for the issue and included debugging analysis. It certainly looks like there are paths out of the interceptor request chain that aren't being cleaned up properly. We'll take a look and resolve asap.

BrentMifsud commented 1 year ago

I've recently been noticing a number of crashes in request chains as well. I'm not sure if they are related to this, but here is a stack trace:

Note that I have not been able to reproduce these crashes, but we have definitely seen an uptick in them since upgrading to 1.2.0.

Crashed: com.apollographql.ApolloStore
0  libobjc.A.dylib                0x1cf4 objc_msgSend + 20
1  EnvoyMobile                    0xcbddc0 InterceptorRequestChain.proceedAsync<A>(request:response:completion:interceptor:) + 25 (Atomic.swift:25)
2  EnvoyMobile                    0xc96b54 ApolloInterceptorReentrantWrapper.proceedAsync<A>(request:response:completion:) + 38 (ApolloInterceptorReentrantWrapper.swift:38)
3  EnvoyMobile                    0xc9ccd0 closure #1 in CacheReadInterceptor.interceptAsync<A>(chain:request:response:completion:) + 54 (CacheReadInterceptor.swift:54)
4  EnvoyMobile                    0xc9d1c4 partial apply for closure #1 in CacheReadInterceptor.fetchFromCache<A>(for:chain:completion:) + 99 (CacheReadInterceptor.swift:99)
5  EnvoyMobile                    0xc9f074 static OS_dispatch_queue.returnResultAsyncIfNeeded<A>(on:action:result:) + 22 (DispatchQueue+Optional.swift:22)
6  EnvoyMobile                    0xc97b1c closure #1 in ApolloStore.withinReadTransaction<A>(_:callbackQueue:completion:) + 113 (ApolloStore.swift:113)
7  EnvoyMobile                    0xc94d5c thunk for @escaping @callee_guaranteed () -> () + 265212 (<compiler-generated>:265212)
8  libdispatch.dylib              0x63094 _dispatch_call_block_and_release + 24
9  libdispatch.dylib              0x64094 _dispatch_client_callout + 16
10 libdispatch.dylib              0x6bb8 _dispatch_continuation_pop$VARIANT$mp + 440
11 libdispatch.dylib              0x62e0 _dispatch_async_redirect_invoke + 588
12 libdispatch.dylib              0x13b94 _dispatch_root_queue_drain + 340
13 libdispatch.dylib              0x1439c _dispatch_worker_thread2 + 172
14 libsystem_pthread.dylib        0x1dc4 _pthread_wqthread + 224
15 libsystem_pthread.dylib        0x192c start_wqthread + 8

marksvend commented 1 year ago

Yes, we are also seeing a crash from this code. I was just about to report it.

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x0000000000000010
Exception Codes: 0x0000000000000001, 0x0000000000000010
VM Region Info: 0x10 is not in any region.  Bytes before following region: 68719476720
      REGION TYPE                 START - END      [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      UNUSED SPACE AT START
--->  
      commpage (reserved)     1000000000-7000000000 [384.0G] ---/--- SM=NUL  ...(unallocated)
Triggered by Thread:  0

Kernel Triage:
VM - (arg = 0x0) pmap_enter retried due to resource shortage
VM - (arg = 0x0) pmap_enter retried due to resource shortage
VM - (arg = 0x0) pmap_enter retried due to resource shortage
VM - (arg = 0x0) pmap_enter retried due to resource shortage
VM - (arg = 0x0) pmap_enter retried due to resource shortage

Thread 0 Crashed:
0   AppName                             0x00000001053920d0 specialized Atomic.wrappedValue.getter + 0 (Atomic.swift:23)
1   AppName                             0x00000001053920d0 InterceptorRequestChain.isCancelled.getter + 4 (InterceptorRequestChain.swift:0)
2   AppName                             0x00000001053920d0 InterceptorRequestChain.proceedAsync<A>(request:response:completion:interceptor:) + 240 (InterceptorRequestChain.swift:119)
3   AppName                             0x000000010536b2b0 ApolloInterceptorReentrantWrapper.proceedAsync<A>(request:response:completion:) + 76 (ApolloInterceptorReentrantWrapper.swift:38)
4   AppName                             0x0000000104e7faf4 closure #1 in GraphQLWeblabInterceptor.interceptAsync<A>(chain:request:response:completion:) + 1624 (GraphQLWeblabInterceptor.swift:52)
5   AppName                             0x0000000104e7fd81 partial apply for closure #1 in GraphQLWeblabInterceptor.interceptAsync<A>(chain:request:response:completion:) + 1 (<compiler-generated>:0)
6   AppName                             0x0000000104e8131d specialized thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) + 1
7   AppName                             0x0000000104fe55cd partial apply for specialized thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out A) + 1 (<compiler-generated>:0)
8   libswift_Concurrency.dylib      0x00000001ad589dd9 completeTaskWithClosure(swift::AsyncContext*, swift::SwiftError*) + 1 (Task.cpp:496)

sgharraph commented 1 year ago

We are also crashing with our prod app after upgrading to 1.2.0:

Crashed: com.apple.main-thread
0  EVgoCharger                    0x64ac9c InterceptorRequestChain.proceedAsync<A>(request:response:completion:interceptor:) + 23 (Atomic.swift:23)
1  EVgoCharger                    0x623e7c ApolloInterceptorReentrantWrapper.proceedAsync<A>(request:response:completion:) + 38 (ApolloInterceptorReentrantWrapper.swift:38)
2  EVgoCharger                    0x14e48c closure #1 in AuthorizationInterceptor.interceptAsync<A>(chain:request:response:completion:) + 36 (AuthorizationInterceptor.swift:36)
3  EVgoCharger                    0x14eb38 partial apply for closure #1 in AuthorizationInterceptor.interceptAsync<A>(chain:request:response:completion:) + 4369460024 (<compiler-generated>:4369460024)
4  EVgoCharger                    0x14ea40 thunk for @escaping @callee_guaranteed (@guaranteed FIRAuthTokenResult?, @guaranteed Error?) -> () + 4369459776 (<compiler-generated>:4369459776)
5  libdispatch.dylib              0x2320 _dispatch_call_block_and_release + 32
6  libdispatch.dylib              0x3eac _dispatch_client_callout + 20
7  libdispatch.dylib              0x126a4 _dispatch_main_queue_drain + 928
8  libdispatch.dylib              0x122f4 _dispatch_main_queue_callback_4CF + 44
9  CoreFoundation                 0x98d18 __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 16
10 CoreFoundation                 0x7a650 __CFRunLoopRun + 1992
11 CoreFoundation                 0x7f4dc CFRunLoopRunSpecific + 612
12 GraphicsServices               0x135c GSEventRunModal + 164
13 UIKitCore                      0x39d37c -[UIApplication _run] + 888
14 UIKitCore                      0x39cfe0 UIApplicationMain + 340
15 EVgoCharger                    0x7424 main + 6 (main.swift:6)
16 ???                            0x1bc33cdec (Missing)

calvincestari commented 1 year ago

I'll be taking a look into this one today.

I'm curious about the versioning, though, since all of the bug reports here mention updating to 1.2.0. Substantial changes to the interceptor chain were released in 1.1.0 to support multipart HTTP-based subscriptions. So if you're upgrading from a version < 1.1.0 to 1.2.0 then I can see the link, but if you had been using 1.1.0 or later prior to the 1.2.0 upgrade then it may indicate something else.

sgharraph commented 1 year ago

@calvincestari I updated from 1.0.6 to 1.2.0

marksvend commented 1 year ago

We also upgraded from 1.0.6 to 1.2.0

calvincestari commented 1 year ago

@marksvend @sgharraph - do either of you use custom interceptors in your request chain?

marksvend commented 1 year ago

We do use a few custom interceptors

sgharraph commented 1 year ago

We also use custom interceptors. You can see that from the stack trace I attached

rickpasetto commented 1 year ago

+1. We are also seeing a crash with a very similar stack trace after updating from 1.0.5 to 1.2.1, and we also use custom interceptors:

#0  0x0000000116398bcc in InterceptorRequestChain.isCancelled.getter ()
#1  0x000000011639a04c in InterceptorRequestChain.proceedAsync<τ_0_0>(request:response:completion:interceptor:) at /Users/rick.pasetto/Documents/GitHub/swift/Apps/Tuist/Dependencies/SwiftPackageManager/.build/checkouts/apollo-ios/Sources/Apollo/InterceptorRequestChain.swift:119
#2  0x000000011635902c in ApolloInterceptorReentrantWrapper.proceedAsync<τ_0_0>(request:response:completion:) at /Users/rick.pasetto/Documents/GitHub/swift/Apps/Tuist/Dependencies/SwiftPackageManager/.build/checkouts/apollo-ios/Sources/Apollo/ApolloInterceptorReentrantWrapper.swift:38
#3  0x00000001163594b4 in protocol witness for RequestChain.proceedAsync<τ_0_0>(request:response:completion:) in conformance ApolloInterceptorReentrantWrapper ()
#4  0x00000001177f0ac8 in closure #1 in AuthInterceptor.interceptAsync<τ_0_0>(chain:request:response:completion:) at /Users/rick.pasetto/Documents/GitHub/swift/Apps/Targets/ShuttleNetworking/Sources/ApolloInterceptors.swift:30
#5  0x00000001177f0de4 in partial apply for closure #1 in AuthInterceptor.interceptAsync<τ_0_0>(chain:request:response:completion:) ()
#6  0x00000001177efe4c in thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out τ_0_0) ()
#7  0x00000001177eff8c in partial apply for thunk for @escaping @callee_guaranteed @Sendable @async () -> (@out τ_0_0) ()

calvincestari commented 1 year ago

This has been a tricky issue to resolve given how we chose to implement HTTP-based subscriptions while maintaining 100% backwards compatibility for existing request chains. I don't think the bugs are related to custom interceptors necessarily but rather how we had to use an Unmanaged instance of InterceptorRequestChain.

I discarded the first solution (#3065) because it required authors of custom interceptors to know too much about the internals of the request chain and how it managed the unmanaged instance.

I have another solution up at #3070 which removes the unmanaged instance and goes back to the old memory model for the request chain, while still supporting HTTP-based subscriptions, but at the cost of being a minor breaking change and requiring an update to custom interceptors. I'm busy with the last set of tests for that PR and then adding a migration guide which will detail the changes needed.
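
As a rough illustration of the general idea of dropping a manual self-retain, and not the code in #3070, the sketch below lets the pending callback own the chain. `OwnedChain`, `PendingWork`, and `closureOwnershipDemo` are hypothetical names, and the assumption is simply that the in-flight work captures the chain strongly and drops the capture once it has run.

```swift
// Hypothetical chain with no manual self-retain.
final class OwnedChain {
    func finish(_ completion: (String) -> Void) {
        completion("cached value")
    }
}

// Hypothetical stand-in for in-flight async work: holds one pending callback.
final class PendingWork {
    private var callback: (() -> Void)?

    func schedule(_ block: @escaping () -> Void) {
        callback = block
    }

    // Run the callback once, then drop it along with everything it captured.
    func run() {
        let block = callback
        callback = nil
        block?()
    }
}

func closureOwnershipDemo() -> (aliveWhilePending: Bool, aliveAfterCompletion: Bool) {
    weak var probe: OwnedChain?
    let work = PendingWork()
    do {
        let chain = OwnedChain()
        probe = chain
        // The escaping closure captures `chain` strongly and keeps it alive.
        work.schedule { chain.finish { _ in } }
    }
    let aliveWhilePending = probe != nil
    work.run()
    return (aliveWhilePending, probe != nil)
}

let demo = closureOwnershipDemo()
print(demo.aliveWhilePending, demo.aliveAfterCompletion)  // prints "true false"
```

With ownership tied to the pending work, every exit path frees the chain once the last callback is dropped, so no path has to remember an explicit release.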

calvincestari commented 1 year ago

The fix for this has been merged into the release/1.3 branch which will be the next release; date unknown at the moment though.

Our docs system doesn't render changes unless the branch is built against main but you can still read about the migration that will be needed in markdown here.

Since the solution completely removed the Unmanaged object I'm confident it'll resolve both the leaks and crash bugs that were reported in this issue. Until the official release though it would be helpful if you were able to test your projects against the release/1.3 branch to confirm.

rickpasetto commented 1 year ago

@calvincestari Great! That seems to have done the trick. Looking forward to official release of 1.3!