allenday / nanostream-dataflow

real-time stream processing of DNA nanopore sequencer reads with dataflow
MIT License
Pipeline still unstable with large .fastqs #72

Open Firedrops opened 5 years ago

Firedrops commented 5 years ago

I have tried increasing the provisioning MACHINE_TYPE to n1-standard-8 (8 vCPUs and 30 GB RAM), which should be more than enough for any of the reference files.

Large files (>~100 kb?) still get stuck with these error logs. Once they appear, the pipeline appears to be unsalvageable and needs to be cancelled and restarted.

2019-02-18 (11:33:28) Processing stuck in step Alignment for at least 05m00s without outputting or completing in state pro...

Processing stuck in step Alignment for at least 05m00s without outputting or completing in state process
  at java.net.SocketInputStream.socketRead0(Native Method)
  at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
  at java.net.SocketInputStream.read(SocketInputStream.java:170)
  at java.net.SocketInputStream.read(SocketInputStream.java:141)
  at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
  at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
  at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
  at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
  at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
  at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
  at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
  at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
  at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
  at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
  at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
  at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
  at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
  at org.apache.http.impl.execchain.ServiceUnavailableRetryExec.execute(ServiceUnavailableRetryExec.java:85)
  at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
  at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:221)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:165)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:140)
  at com.theappsolutions.nanostream.util.HttpHelper.executeRequest(HttpHelper.java:105)
  at com.theappsolutions.nanostream.http.NanostreamHttpService.generateAlignData(NanostreamHttpService.java:58)
  at com.theappsolutions.nanostream.aligner.MakeAlignmentViaHttpFn.processElement(MakeAlignmentViaHttpFn.java:49)
  at com.theappsolutions.nanostream.aligner.MakeAlignmentViaHttpFn$DoFnInvoker.invokeProcessElement(Unknown Source)

2019-02-18 (11:38:28) Processing stuck in step Alignment for at least 10m00s without outputting or completing in state pro...

Processing stuck in step Alignment for at least 10m00s without outputting or completing in state process
  at java.net.SocketInputStream.socketRead0(Native Method)
  at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
  at java.net.SocketInputStream.read(SocketInputStream.java:170)
  at java.net.SocketInputStream.read(SocketInputStream.java:141)
  at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
  at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
  at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
  at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
  at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
  at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
  at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
  at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
  at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
  at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
  at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
  at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
  at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
  at org.apache.http.impl.execchain.ServiceUnavailableRetryExec.execute(ServiceUnavailableRetryExec.java:85)
  at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
  at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:221)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:165)
  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:140)
  at com.theappsolutions.nanostream.util.HttpHelper.executeRequest(HttpHelper.java:105)
  at com.theappsolutions.nanostream.http.NanostreamHttpService.generateAlignData(NanostreamHttpService.java:58)
  at com.theappsolutions.nanostream.aligner.MakeAlignmentViaHttpFn.processElement(MakeAlignmentViaHttpFn.java:49)
  at com.theappsolutions.nanostream.aligner.MakeAlignmentViaHttpFn$DoFnInvoker.invokeProcessElement(Unknown Source)

2019-02-18 (11:38:38) org.apache.http.client.ClientProtocolException: Unexpected response status: 502

org.apache.http.client.ClientProtocolException: Unexpected response status: 502
        com.theappsolutions.nanostream.http.NanostreamResponseHandler.handleResponse(NanostreamResponseHandler.java:39)
        com.theappsolutions.nanostream.http.NanostreamResponseHandler.handleResponse(NanostreamResponseHandler.java:17)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:223)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:165)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:140)
        com.theappsolutions.nanostream.util.HttpHelper.executeRequest(HttpHelper.java:105)
        com.theappsolutions.nanostream.http.NanostreamHttpService.generateAlignData(NanostreamHttpService.java:58)
        com.theappsolutions.nanostream.aligner.MakeAlignmentViaHttpFn.processElement(MakeAlignmentViaHttpFn.java:49)
lachlancoin commented 5 years ago

Which file is this? Maybe we can split the fastq upfront.
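A minimal sketch of what splitting a fastq upfront could look like, assuming the common four-lines-per-record FASTQ layout (header, sequence, '+', qualities) with no wrapped sequence lines. `FastqSplitter` and `split` are hypothetical names for illustration and are not part of nanostream-dataflow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not from the pipeline): splits FASTQ text into smaller chunks. */
final class FastqSplitter {
    // Assumption: every record is exactly 4 lines (header, sequence, '+', qualities).
    private static final int LINES_PER_RECORD = 4;

    /** Returns chunks of FASTQ text, each holding at most maxRecords records. */
    static List<String> split(String fastq, int maxRecords) {
        if (maxRecords <= 0) {
            throw new IllegalArgumentException("maxRecords must be positive");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = fastq.trim();
        if (trimmed.isEmpty()) {
            return chunks; // nothing to split
        }
        String[] lines = trimmed.split("\\R");
        if (lines.length % LINES_PER_RECORD != 0) {
            throw new IllegalArgumentException("expected 4 lines per FASTQ record");
        }
        int linesPerChunk = maxRecords * LINES_PER_RECORD;
        for (int start = 0; start < lines.length; start += linesPerChunk) {
            int end = Math.min(start + linesPerChunk, lines.length);
            // Re-join the lines of this chunk, keeping a trailing newline.
            chunks.add(String.join("\n", Arrays.copyOfRange(lines, start, end)) + "\n");
        }
        return chunks;
    }

    public static void main(String[] args) {
        String fastq = "@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n@r3\nTTAA\n+\nIIII\n";
        List<String> chunks = split(fastq, 2);
        System.out.println("chunks: " + chunks.size());
    }
}
```

Each chunk could then be uploaded as its own file, so that no single object sent to the aligner is larger than the chosen record count.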

allenday commented 5 years ago

The last stack trace indicates http 502. You may have flooded the alignment cluster. How many reads are you submitting per batch?
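If the 502 responses are transient overload on the aligner side, one option is to retry with exponential backoff rather than failing the element straight away. The sketch below is only an illustration of that idea: `AlignerRetry`, its method names, and the choice of 502/503/504 as retryable statuses are assumptions, and the request is modelled as an `IntSupplier` returning an HTTP status code rather than as a real call through the pipeline's `HttpHelper`.

```java
import java.util.function.IntSupplier;

/** Hypothetical retry helper; not part of nanostream-dataflow. */
final class AlignerRetry {

    /** Assumption: gateway-style errors are worth retrying, other statuses are not. */
    static boolean isRetryable(int status) {
        return status == 502 || status == 503 || status == 504;
    }

    /** Delay before retry number `attempt` (0-based): base, 2*base, 4*base, ... capped. */
    static long backoffMillis(int attempt, long baseMillis, long capMillis) {
        long delay = baseMillis << Math.min(attempt, 20); // bound the shift
        return Math.min(delay, capMillis);
    }

    /** Calls `request` up to maxAttempts times and returns the last status seen. */
    static int callWithRetry(IntSupplier request, int maxAttempts,
                             long baseMillis, long capMillis) {
        int status = request.getAsInt();
        for (int attempt = 0; attempt < maxAttempts - 1 && isRetryable(status); attempt++) {
            try {
                Thread.sleep(backoffMillis(attempt, baseMillis, capMillis));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop retrying
                return status;
            }
            status = request.getAsInt();
        }
        return status;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated aligner: two 502 responses, then success.
        int status = callWithRetry(() -> ++calls[0] < 3 ? 502 : 200, 5, 10, 100);
        System.out.println("status=" + status + " after " + calls[0] + " calls");
    }
}
```

Backoff only helps if the overload is temporary; if every batch floods the cluster, the batch size or the cluster capacity still has to change.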

Firedrops commented 5 years ago

The last stack trace indicates http 502. You may have flooded the alignment cluster. How many reads are you submitting per batch?

Just 1. On further testing, it seems the file size is not the main issue. 20170731_GP01_MNP_nohuman.fastq (866 kb) always causes that error. A cassava file, test_Cassava_KE.barcode1_KE.barcode1.fasta (1,111 kb), did not cause the error. Another cassava file, test_Cassava_UG.Barcode1_UG.Barcode1.fastq (43,940 kb), caused the 5-minute error.

I'm still testing further; it's a bit slow, since it takes 5 minutes for this error to appear. For now it looks like big .fasta files are OK, but .fastq files are not.

UPDATE: Testing with another large fastq file also causes the 502 errors, as well as multiples of this:

2019-02-18 (15:16:55) java.net.SocketException: Broken pipe

java.net.SocketException: Broken pipe
        java.net.SocketOutputStream.socketWrite0(Native Method)
        java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
        java.net.SocketOutputStream.write(SocketOutputStream.java:153)
        org.apache.http.impl.io.SessionOutputBufferImpl.streamWrite(SessionOutputBufferImpl.java:124)
        org.apache.http.impl.io.SessionOutputBufferImpl.flushBuffer(SessionOutputBufferImpl.java:136)
        org.apache.http.impl.io.SessionOutputBufferImpl.write(SessionOutputBufferImpl.java:167)
        org.apache.http.impl.io.ContentLengthOutputStream.write(ContentLengthOutputStream.java:113)
        org.apache.http.entity.mime.content.StringBody.writeTo(StringBody.java:174)
        org.apache.http.entity.mime.AbstractMultipartForm.doWriteTo(AbstractMultipartForm.java:134)
        org.apache.http.entity.mime.AbstractMultipartForm.writeTo(AbstractMultipartForm.java:157)
        org.apache.http.entity.mime.MultipartFormEntity.writeTo(MultipartFormEntity.java:113)
        org.apache.http.impl.DefaultBHttpClientConnection.sendRequestEntity(DefaultBHttpClientConnection.java:156)
        org.apache.http.impl.conn.CPoolProxy.sendRequestEntity(CPoolProxy.java:160)
        org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:238)
        org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123)
        org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
        org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
        org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
        org.apache.http.impl.execchain.ServiceUnavailableRetryExec.execute(ServiceUnavailableRetryExec.java:85)
        org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
        org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:221)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:165)
        org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:140)
        com.theappsolutions.nanostream.util.HttpHelper.executeRequest(HttpHelper.java:105)
        com.theappsolutions.nanostream.http.NanostreamHttpService.generateAlignData(NanostreamHttpService.java:58)
        com.theappsolutions.nanostream.aligner.MakeAlignmentViaHttpFn.processElement(MakeAlignmentViaHttpFn.java:49)

As @lachlancoin suggested, this might be a batching issue: perhaps batching is implemented in a way that works well with .fasta but not with .fastq?
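Since the files above are compared by size in kb while the question was about reads per batch, it may help to compare them by record count instead. The following is a rough sketch under stated assumptions: FASTA records are counted by '>' header lines, FASTQ is assumed to use exactly four lines per record, and `ReadCounter` is a hypothetical name, not something in the repository.

```java
/** Hypothetical helper for comparing test files by number of reads rather than by size. */
final class ReadCounter {

    /** Counts records in FASTA or FASTQ text, choosing the format from the first character. */
    static int countRecords(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        String[] lines = trimmed.split("\\R");
        if (lines[0].startsWith(">")) {
            // FASTA: one '>' header per record; sequences may span several lines.
            int records = 0;
            for (String line : lines) {
                if (line.startsWith(">")) {
                    records++;
                }
            }
            return records;
        }
        if (lines[0].startsWith("@")) {
            // FASTQ: assume 4 lines per record. Counting '@' lines would be unsafe,
            // because a quality string may itself begin with '@'.
            return lines.length / 4;
        }
        throw new IllegalArgumentException("input is neither FASTA nor FASTQ");
    }

    public static void main(String[] args) {
        System.out.println(countRecords(">a\nACGT\nACGT\n>b\nGG\n"));
        System.out.println(countRecords("@r1\nACGT\n+\n@III\n@r2\nGG\n+\nII\n"));
    }
}
```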

Firedrops commented 5 years ago

For specifics, I am using the current provisioning script (provision_species.sh), directly calling Allen's bwa-http-docker, and the dataflow command in the README.

The following modifications were made:

  1. change project name to nano-stream1
  2. change pubsub subscription name to ours (dataflow_species)
  3. specify zone (asia-northeast1-c)
  4. change firestore names and tokens (mostly in the visualizer app) to ours.
  5. change the provisioner machine type to n1-standard-8; this did not appear to help with the issues, so I should change it back to n1-standard-4, but have not done so yet.

I wonder if this issue might have been solved previously but not yet committed to the main branch? Most of the commits there are about a week old or more, and these issues have been mentioned in #23 so @obsh and @Pseverin would have known about them for a while.

obsh commented 5 years ago

I think we'll make the batch size configurable, so that smaller fastq batches can be tried with the aligner. Meanwhile, you can try decreasing it in the code and recompiling the jar file. https://github.com/allenday/nanostream-dataflow/blob/master/NanostreamDataflowMain/src/main/java/com/google/allenday/nanostream/NanostreamApp.java#L54

Also there is a new build of allenday/bwa-http-docker:http container available. It's not a performance improvement, just more correct error handling.

Firedrops commented 5 years ago

I agree; we just ran into the problem again with the EDTA sample. We'll try 100 and maybe 50 tomorrow. It'd be a good idea to pull the batch size out into an argument, since our builds seemed imperfect the last few times.

Firedrops commented 5 years ago

Have tried down to batch size 25, seems to slow down the entire pipeline, no firestore results generated after ~30 mins run time on alignment step. We got the 5 min error in the end and the whole thing had to be cancelled.

Also, it seems that once the 5 min error occurs, the whole provisioning cluster needs to be restarted. If we only restart the dataflow, we immediately get broken pipe errors:

image

UPDATE: Never mind, it seems restarting the provisioning cluster doesn't help either. The behaviour seems very random: it sometimes works and sometimes doesn't, even with the exact same builds and fastq files. We are occasionally also getting 404 errors:

image

obsh commented 5 years ago

it'd be a good idea to pull the batch size out into an argument

Done now; see the optional --alignmentBatchSize parameter.

Have tried down to batch size 25, seems to slow down the entire pipeline, no firestore results generated after ~30 mins run time on alignment step.

I've experimented with batch size, and it looks like a bigger batch size actually improves performance, since bwa's startup time then adds less overhead. The default value is 2000, as it worked well on the "dogbite" dataset in my tests.

I assume that at least the n1-highmem-8 machine size is required for the aligner when using the species reference database. With less memory it seems that the OS buffer cache is not effective, while with n1-highmem-8 the bwa loading time improves significantly on subsequent calls.

Also, in #95 we introduced the optional --bwaArguments parameter. With the default value '-t 4', bwa now uses 4 threads. For n1-highmem-8 you can even try --bwaArguments='-t 8' for better aligner performance.
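The batch-size trade-off can be illustrated with a simple cost model: every batch pays a fixed bwa startup cost, so total time is roughly ceil(reads / batchSize) * startup + reads * perRead. The sketch below assumes batches are processed one after another, and the startup and per-read times in `main` are made-up placeholders, not measurements from this pipeline; `BatchOverhead` is a hypothetical name.

```java
/** Hypothetical cost model for batch size versus fixed per-batch startup overhead. */
final class BatchOverhead {

    /** Number of batches needed for `reads` reads (ceiling division). */
    static long batches(long reads, long batchSize) {
        return (reads + batchSize - 1) / batchSize;
    }

    /** Estimated total seconds if batches run sequentially. */
    static double totalSeconds(long reads, long batchSize,
                               double startupSec, double perReadSec) {
        return batches(reads, batchSize) * startupSec + reads * perReadSec;
    }

    public static void main(String[] args) {
        long reads = 4000;          // example read count
        double startupSec = 30.0;   // placeholder: time for bwa to load the reference
        double perReadSec = 0.05;   // placeholder: alignment time per read
        for (long batchSize : new long[] {25, 500, 2000}) {
            System.out.printf("batchSize=%d batches=%d estimated=%.0fs%n",
                    batchSize, batches(reads, batchSize),
                    totalSeconds(reads, batchSize, startupSec, perReadSec));
        }
    }
}
```

Under this model the startup term dominates for small batches, which is consistent with the slowdown reported at batch size 25; it says nothing about memory pressure or request timeouts, which may push in the opposite direction for very large batches.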

lachlancoin commented 5 years ago

I am still having a problem with large fastq files, see #98 (connection refused during the alignment step). Basically, the dataflow stalls at the alignment step and nothing comes out of it. This fastq had 4000 records, and I set a batch size of 500 (using the standard bwa docker). The scripts I use are here: https://github.com/lachlancoin/gcloud/blob/master/init.sh
I set the target-cpu-utilization to 0.5 (to manage costs!) and use the default '-t 4'.

I was wondering if it's possible to avoid the CGI step, which is problematic, by using Pubsub instead. I have some thoughts which I will put in a new issue.