BingAds / BingAds-Java-SDK

Other
43 stars 47 forks source link

refreshTokensIfNeeded does not timeout or retry if bing server does not answer #106

Closed alegrin closed 5 years ago

alegrin commented 5 years ago

Today we found one of our application hung for more than an hour, after requesting a thread dump it was waiting for a refresh of the bing token (another was previously retrieved). You can check the stack-trace below.

   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
    at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593)
    at sun.security.ssl.InputRecord.read(InputRecord.java:529)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
    - locked <0x00000006f3f5aa10> (a java.lang.Object)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
    - locked <0x00000006f3f5aa28> (a java.lang.Object)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
    at org.apache.http.conn.ssl.SSLSocketFactory.createLayeredSocket(SSLSocketFactory.java:573)
    at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:557)
    at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:414)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
    at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:326)
    at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:610)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:445)
    at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:835)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
    at com.microsoft.bingads.internal.HttpClientWebServiceCaller.post(HttpClientWebServiceCaller.java:31)
    at com.microsoft.bingads.internal.UriOAuthService.getAccessTokens(UriOAuthService.java:66)
    at com.microsoft.bingads.internal.OAuthWithAuthorizationCode.requestAccessAndRefreshTokens(OAuthWithAuthorizationCode.java:134)
    at com.microsoft.bingads.internal.OAuthWithAuthorizationCode.refreshTokensIfNeeded(OAuthWithAuthorizationCode.java:166)

Analyzing OAuthWithAuthorizationCode it can be seen that it doesn't set a timeout for the request, which could be a possible solution for this issue.

alegrin commented 5 years ago

Some friendly person forwarded this email sent my Bing:


The issue referenced in the thread reported by few advertisers, was not exactly specific to the Bulk upload status or the symptoms reported on your end. Summarizing the issue from the thread - some clients noticed sporadic or intermittent hang on Java SDK and from the stack trace this was pinned down to the OAuth end point as you see in Github.

      at com.microsoft.bingads.internal.UriOAuthService.getAccessTokens(UriOAuthService.java:66)
                at com.microsoft.bingads.internal.OAuthWithAuthorizationCode.requestAccessAndRefreshTokens(OAuthWithAuthorizationCode.java:134)
                at com.microsoft.bingads.internal.OAuthWithAuthorizationCode.refreshTokensIfNeeded(OAuthWithAuthorizationCode.java:166)

Our Engineering team is fixing this at the moment by introducing a Timeout into the current SDK (ETA 8/9) and as a workaround affected advertisers can implant a timeout in the client library (pushing into OAuthWebAuthCodeGrant by reflection our own WebServiceCaller implementation which set socket timeouts) in order to escape this problem.

qitia commented 5 years ago

Version 12.13.4 has been released today and it should be downloadable since 8/8/2019. Could you please try this version and let me know if it helps? @alegrin

qitia commented 5 years ago

@alegrin just check with you if you still meet the application hang issue after you first reported the issue on 8/2? If you can give us the specific date/time of hang, that will be even helpful. We would like to get this info to help us (Microsoft) track down the root cause. The new SDK release can help avoid application hung but not the root cause of why Microsoft server does not answer.

alegrin commented 5 years ago

@qitia we saw the issue happening a few days after this issue has been reported, but i haven't checked again (its not as easy for us). I've traced the date/time of the hang, and its: August 1st 2019, 17:06 UTC.

Thanks for following up with this, trying to find and fix the root cause.

qitia commented 5 years ago

Thanks @alegrin Our MSA team tell us that they have done upgrade and this timeout should been fixed now. Adding we have timeout suport from SDK, I would close this issue and feel free to reopen it if you have any concern.