nf-core / quantms

Quantitative mass spectrometry workflow. Currently supports proteomics experiments with complex experimental designs for DDA-LFQ, DDA-Isobaric and DIA-LFQ quantification.
https://nf-co.re/quantms
MIT License

Do not use nextflow readLine() since it downloads files on cluster head nodes #61

Closed: harper357 closed this issue 7 months ago

harper357 commented 2 years ago

Description of the bug

I'm getting an odd error when trying to run 40 samples through the pipeline on AWS Batch.

Everything proceeds normally until the MZMLINDEXING step, when the head node crashes with the Java error "Failed to acquire stream chunk".

The log where the error happens:

| 2022-10-05T11:20:19.906-07:00 | [51/a3778a] Submitted process > NFCORE_QUANTMS:QUANTMS:FILE_PREPARATION:MZMLINDEXING (file_32)
  | 2022-10-05T11:20:23.240-07:00 | Failed to acquire stream chunk
  | 2022-10-05T11:20:23.240-07:00 | -- Check script '/root/.nextflow/assets/[users_name]/nf-quantms/./workflows/../subworkflows/local/file_preparation.nf' at line: 32 or see '.nextflow.log' file for more details
  | 2022-10-05T11:20:23.263-07:00 | -[nf-core/quantms] Pipeline completed with errors-
  | 2022-10-05T11:20:23.267-07:00 | WARN: Killing running tasks (39)
  | 2022-10-05T11:20:23.469-07:00 | WARN: Unable to get file attributes file: s3://[users_bucket]/_nextflow/runs/39/df0b79873b070c19eddd53c33b8288/versions.yml -- Cause: com.amazonaws.SdkClientException: Failed to sanitize XML document destined for handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
  | 2022-10-05T11:20:27.115-07:00 | Failed to acquire stream chunk
  | 2022-10-05T11:20:39.624-07:00 | === Running Cleanup ===

The code referenced:

https://github.com/nf-core/quantms/blob/a1bf7d4104ec424abff984512764ddecde79d21f/subworkflows/local/file_preparation.nf#L30-L34

The head node crashes at a different file number when I change the amount of memory assigned to it. Is all the data passing through the head node somewhere? I've never had this problem with any of my NGS pipelines, which use more and larger files, so I'm a little confused by this crash.

Command used and terminal output

No response

Relevant files

The log file where it crashes (it has a different date, but it is the same error that always appears):

Oct-10 23:03:07.744 [Actor Thread 15] ERROR nextflow.extension.DataflowHelper - @unknown
java.io.IOException: Failed to acquire stream chunk
    at com.upplication.s3fs.ng.FutureInputStream.nextBuffer(FutureInputStream.java:78)
    at com.upplication.s3fs.ng.FutureInputStream.read(FutureInputStream.java:63)
    at java.base/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:270)
    at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:313)
    at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:188)
    at java.base/java.io.InputStreamReader.read(InputStreamReader.java:177)
    at java.base/java.io.BufferedReader.fill(BufferedReader.java:162)
    at java.base/java.io.BufferedReader.readLine(BufferedReader.java:329)
    at java.base/java.io.BufferedReader.readLine(BufferedReader.java:396)
    at java_io_BufferedReader$readLine.call(Unknown Source)
    at Script_d4bc0d6a$_runScript_closure1$_closure2$_closure5$_closure8$_closure9$_closure11.doCall(Script_d4bc0d6a:32)
    at jdk.internal.reflect.GeneratedMethodAccessor276.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
    at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
    at groovy.lang.Closure.call(Closure.java:412)
    at groovy.lang.Closure.call(Closure.java:428)
    at org.codehaus.groovy.runtime.DefaultGroovyMethods.upto(DefaultGroovyMethods.java:16406)
    at org.codehaus.groovy.runtime.dgm$875.doMethodInvoke(Unknown Source)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1268)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
    at groovy.runtime.metaclass.NumberDelegatingMetaClass.invokeMethod(NumberDelegatingMetaClass.java:60)
    at org.codehaus.groovy.runtime.callsite.PojoMetaClassSite.call(PojoMetaClassSite.java:44)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:148)
    at Script_d4bc0d6a$_runScript_closure1$_closure2$_closure5$_closure8$_closure9.doCall(Script_d4bc0d6a:31)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
    at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
    at groovy.lang.Closure.call(Closure.java:412)
    at groovy.lang.Closure.call(Closure.java:428)
    at org.codehaus.groovy.runtime.IOGroovyMethods.withReader(IOGroovyMethods.java:1160)
    at org.apache.groovy.nio.extensions.NioExtensions.withReader(NioExtensions.java:1434)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.codehaus.groovy.runtime.metaclass.ReflectionMetaMethod.invoke(ReflectionMetaMethod.java:54)
    at org.codehaus.groovy.runtime.metaclass.NewInstanceMetaMethod.invoke(NewInstanceMetaMethod.java:54)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1268)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
    at groovy.runtime.metaclass.NextflowDelegatingMetaClass.invokeMethod(NextflowDelegatingMetaClass.java:66)
    at org.codehaus.groovy.runtime.callsite.PojoMetaClassSite.call(PojoMetaClassSite.java:44)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:139)
    at Script_d4bc0d6a$_runScript_closure1$_closure2$_closure5$_closure8.doCall(Script_d4bc0d6a:30)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
    at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
    at groovy.lang.Closure.call(Closure.java:412)
    at groovy.lang.Closure.call(Closure.java:428)
    at nextflow.extension.BranchOp.doNext(BranchOp.groovy:55)
    at jdk.internal.reflect.GeneratedMethodAccessor258.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1268)
    at groovy.lang.MetaClassImpl.invokeMethodClosure(MetaClassImpl.java:1048)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1142)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
    at groovy.lang.Closure.call(Closure.java:412)
    at groovy.lang.Closure.call(Closure.java:428)
    at groovy.lang.Closure$call.call(Unknown Source)
    at nextflow.extension.DataflowHelper$_subscribeImpl_closure2.doCall(DataflowHelper.groovy:285)
    at jdk.internal.reflect.GeneratedMethodAccessor202.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
    at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
    at groovy.lang.Closure.call(Closure.java:412)
    at groovyx.gpars.dataflow.operator.DataflowOperatorActor.startTask(DataflowOperatorActor.java:120)
    at groovyx.gpars.dataflow.operator.DataflowOperatorActor.onMessage(DataflowOperatorActor.java:108)
    at groovyx.gpars.actor.impl.SDAClosure$1.call(SDAClosure.java:43)
    at groovyx.gpars.actor.AbstractLoopingActor.runEnhancedWithoutRepliesOnMessages(AbstractLoopingActor.java:293)
    at groovyx.gpars.actor.AbstractLoopingActor.access$400(AbstractLoopingActor.java:30)
    at groovyx.gpars.actor.AbstractLoopingActor$1.handleMessage(AbstractLoopingActor.java:93)
    at groovyx.gpars.util.AsyncMessagingCore.run(AsyncMessagingCore.java:132)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Cannot reserve 10,485,760 bytes of direct buffer memory (allocated: 1070363393, limit: 1,073,741,824)
    at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
    at com.upplication.s3fs.ng.FutureInputStream.nextBuffer(FutureInputStream.java:75)
    ... 94 common frames omitted
Caused by: java.lang.OutOfMemoryError: Cannot reserve 10485760 bytes of direct buffer memory (allocated: 1070363393, limit: 1073741824)
    at java.base/java.nio.Bits.reserveMemory(Bits.java:178)
    at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:121)
    at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:332)
    at com.upplication.s3fs.ng.ChunkBuffer.<init>(ChunkBuffer.java:41)
    at com.upplication.s3fs.ng.ChunkBufferFactory.create(ChunkBufferFactory.java:65)
    at com.upplication.s3fs.ng.S3ParallelDownload.doDownload(S3ParallelDownload.java:136)
    at com.upplication.s3fs.ng.S3ParallelDownload.lambda$safeDownload$1(S3ParallelDownload.java:127)
    at dev.failsafe.Functions.lambda$toCtxSupplier$11(Functions.java:236)
    at dev.failsafe.Functions.lambda$get$0(Functions.java:46)
    at dev.failsafe.internal.RetryPolicyExecutor.lambda$apply$0(RetryPolicyExecutor.java:75)
    at dev.failsafe.SyncExecutionImpl.executeSync(SyncExecutionImpl.java:176)
    at dev.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:437)
    at dev.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:115)
    at com.upplication.s3fs.ng.S3ParallelDownload.safeDownload(S3ParallelDownload.java:127)
    at com.upplication.s3fs.ng.FutureIterator.lambda$init$0(FutureIterator.java:59)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    ... 3 common frames omitted
Oct-10 23:03:07.773 [Actor Thread 15] DEBUG nextflow.Session - Session aborted -- Cause: Failed to acquire stream chunk
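
The numbers in the final "Caused by" message can be checked by hand. The sketch below only redoes that arithmetic with the three values printed in the log; the class and method names are made up for illustration and are not part of Nextflow or the S3 filesystem library. It suggests that the direct-buffer limit (1 GiB here) was almost fully reserved, so one more 10 MiB chunk could not fit.

```java
public class DirectBufferBudget {
    // Values copied from the OutOfMemoryError message in the log above.
    static final long LIMIT_BYTES = 1_073_741_824L;     // direct-memory limit (1 GiB)
    static final long ALLOCATED_BYTES = 1_070_363_393L; // direct memory already reserved
    static final long CHUNK_BYTES = 10_485_760L;        // size of the requested chunk (10 MiB)

    /** Bytes of direct memory still available under the limit. */
    static long remaining(long limit, long allocated) {
        return limit - allocated;
    }

    /** True if one more buffer of the given size fits under the limit. */
    static boolean fits(long limit, long allocated, long chunk) {
        return remaining(limit, allocated) >= chunk;
    }

    public static void main(String[] args) {
        long free = remaining(LIMIT_BYTES, ALLOCATED_BYTES);
        System.out.println("free bytes: " + free);
        System.out.println("next chunk fits: " + fits(LIMIT_BYTES, ALLOCATED_BYTES, CHUNK_BYTES));
    }
}
```

Only about 3.4 MB of direct memory was left when the next 10 MiB chunk was requested. On HotSpot JVMs the direct-memory limit usually follows the maximum heap size unless `-XX:MaxDirectMemorySize` is set, which would be consistent with the crash moving to a different file when the head node's memory changes. That connection is an inference, though, not something confirmed in this thread.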

System information

Nextflow version (eg. 22.04.5)
Hardware: AWS
Executor: awsbatch
Container engine: default
OS: AWSLinux
Version of nf-core/quantms: v1.1dev

jpfeuffer commented 2 years ago

Hmm, looking at that line now, this check might actually be performed by Nextflow itself, and therefore on the head node. But I thought that reading just the first X lines should be doable.
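
For reference, "reading the first X lines" at the `java.io.BufferedReader` level (the call at the top of the stack trace) could look roughly like the sketch below. It is a minimal illustration on a local file, not the pipeline's actual code: the class name, the helper names, and the `indexedmzML` marker string are assumptions. Stopping after a few lines bounds what this code consumes, but it does not control what the underlying filesystem prefetches. The trace suggests the S3 layer downloads in parallel 10 MiB chunks regardless of how few lines the caller asks for.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class HeadLines {
    /** Reads at most maxLines lines from the reader, then stops without reading further. */
    static List<String> firstLines(BufferedReader reader, int maxLines) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        // The size check comes first, so no extra line is read once the cap is reached.
        while (lines.size() < maxLines && (line = reader.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    /** Hypothetical check: does any of the first maxLines lines contain the marker? */
    static boolean headContains(Path file, int maxLines, String marker) throws IOException {
        // try-with-resources closes the reader (and its stream) as soon as we are done.
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            for (String l : firstLines(reader, maxLines)) {
                if (l.contains(marker)) {
                    return true;
                }
            }
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Tiny local stand-in for an mzML file; the content is made up for the demo.
        Path tmp = Files.createTempFile("example", ".mzML");
        Files.write(tmp, List.of("<?xml version=\"1.0\"?>", "<indexedmzML>", "<mzML>"));
        System.out.println(headContains(tmp, 5, "indexedmzML"));
        Files.delete(tmp);
    }
}
```

On a local disk this is cheap. Against an object store, moving a check like this into a process that runs on a worker node would keep the download off the head node entirely.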

ypriverol commented 7 months ago

This has been open for more than a year now.

jpfeuffer commented 7 months ago

We no longer index by default, so the files are no longer checked. If you provide raw files or indexedMzml, this should not happen anymore.