SynBioDex / libSBOLj

Java Library for Synthetic Biology Open Language (SBOL)
Apache License 2.0
37 stars 24 forks source link

org.synbiohub.frontend.SynBioHubFrontend should include timeouts #559

Closed jakebeal closed 5 years ago

jakebeal commented 6 years ago

Unless specifically disabled, org.synbiohub.frontend.SynBioHubFrontend should include an HTTP timeout in the configuration of the HTTP connection.

Currently when fetching items from SynBioHub, the SynBioHubFrontend class performs a blocking read inside of org.synbiohub.frontend.SynBioHubFrontend.fetchContentAsInputStream. Normally, this is not a problem, but when a read dies in the middle (as can happen with very large pulls on occasion), this leaves the thread hung with no exception.

This issue is currently causing freezes in the SD2 Dictionary Maintainer. It can be kludged around by implementing a timeout on SynBioHubFrontend, but should properly be addressed within the library by setting the timeout on the HTTP operation instead.

jakebeal commented 6 years ago

Additional diagnostic information. A read that normally completes in a couple of minutes hung for multiple hours. Stack inspection found the following trace, identifying the issue:

"main" #1 prio=5 os_prio=0 tid=0x00007f5e8400a800 nid=0x1e4 runnable [0x00007f5e8cb9e000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:171)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
        at sun.security.ssl.InputRecord.read(InputRecord.java:503)
        at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
        - locked <0x00000005eb58cb98> (a java.lang.Object)
        at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:940)
        at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
        - locked <0x00000005eb58bdd8> (a sun.security.ssl.AppInputStream)
        at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
        at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
        at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
        at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
        at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
        at org.synbiohub.frontend.SynBioHubFrontend.fetchContentAsInputStream(SynBioHubFrontend.java:1106)
        at org.synbiohub.frontend.SynBioHubFrontend.fetchFromSynBioHub(SynBioHubFrontend.java:978)
        at org.synbiohub.frontend.SynBioHubFrontend.getSBOL(SynBioHubFrontend.java:129)
        at com.bbn.sd2.SynBioHubAccessor.retrieve(SynBioHubAccessor.java:161)
        at com.bbn.sd2.MaintainDictionary.update_entry(MaintainDictionary.java:226)
        at com.bbn.sd2.MaintainDictionary.maintain_dictionary(MaintainDictionary.java:318)
        at com.bbn.sd2.DictionaryMaintainerApp.main(DictionaryMaintainerApp.java:31)
cjmyers commented 6 years ago

Discussing this with @zachzundel, he thinks we should implement the solution on the SynBioHub side. Namely, that the HTTP requests should initiate the request, and return as quickly as possible. SBH should cache the result of the fetch and the frontend would be able to repeat the request eventually getting the document back once the fetch has concluded. This will require changes on both SBH and libSBOLj, but it seems this will be less overhead and be more robust in the long run. @zachzundel please add your thoughts and also open the SBH issue for this too.

jakebeal commented 6 years ago

Let me run two scenarios past you:

  1. I'm a user downloading a large file over a terrible connection, and I'm happy to let the download run all night while I am sleeping.
  2. I'm a tool downloading a large file as part of an automated batch process, and I want to either finish in 2 minutes or move on to the next.

I don't see any way for SynBioHub to distinguish between these two clients. Thus, I think that it's going to be important to let the client also be able to set their own abort. And if libSBOLj doesn't do it, then we're probably just going to write a library function around the library that will abort the library instead.

cjmyers commented 6 years ago

I think what we are suggesting will address your concerns. The idea is that SBH would not be blocking. Namely, you send a request then it would respond that it got your request. You could probe the same URL to see when the request has been completed. The library then could have a blocking method that essentially busy waits polling for the request to be processed, as well as another method which takes a time limit and polls until the time limit has been exceeded.

jakebeal commented 6 years ago

That doesn't address the case of a very slow download.

3ach commented 6 years ago

I think we can still add a timeout to the SynBioHubFrontend, but SynBioHub itself will be more asynchronous, like Chris described.

cjmyers commented 6 years ago

Very slow download is usually due to the time it takes to create the SBOL file. Certainly, if we get SBOL files in the GBs, then we may see it to be a transfer problem, but right now it is usually not the bottleneck. Even so, I think a timeout that I propose will allow you to say how long you are willing to wait for the download, and this may be file construction and/or download time.

cjmyers commented 5 years ago

I’ve added a new constructor for SynBioHubFrontend which takes a timeout in seconds:

public SynBioHubFrontend(String backendUrl,int timeout)

Please test.

On Aug 14, 2018, at 6:30 PM, Jacob Beal notifications@github.com wrote:

Unless specifically disabled, org.synbiohub.frontend.SynBioHubFrontend should include an HTTP timeout in the configuration of the HTTP connection.

Currently when fetching items from SynBioHub, the SynBioHubFrontend class performs a blocking read inside of org.synbiohub.frontend.SynBioHubFrontend.fetchContentAsInputStream. Normally, this is not a problem, but when a read dies in the middle (as can happen with very large pulls on occasion), this leaves the thread hung with no exception.

This issue is currently causing freezes in the SD2 Dictionary Maintainer. It can be kludged around by implementing a timeout on SynBioHubFrontend, but should properly be addressed within the library by setting the timeout on the HTTP operation instead.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/SynBioDex/libSBOLj/issues/559, or mute the thread https://github.com/notifications/unsubscribe-auth/ADWD96IffoWL0MkLm3IsPkb8W36vQgy7ks5uQ2uYgaJpZM4V9YGx.

jakebeal commented 5 years ago

@bbartley Can you please test this with the SD2 dictionary?

jakebeal commented 5 years ago

@dsumorok-raytheon has incorporated use of this in the SD2 dictionary, so I think it can be counted as tested and resolved.