My team keeps receiving "Failed to publish" warnings when running our custom Nextflow pipeline. For example:
Jun-17 20:16:08.765 [PublishDir-1982] DEBUG nextflow.processor.PublishDir - Failed to publish file: /home/khushalip/auto-demux/tmp/work/20231220_LH00181_0010_A22FMTGLT3/7b/02f7a9bdae36a94babec6e07e0fe55/bcl_output/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement1-Plate1-Col3_Media20_P3_R1_LibF18_R2_001.fastq.gz; to: gs://illumina-auto-demux/NovaSeqX/20231220_LH00181_0010_A22FMTGLT3/bcl-convert/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement1-Plate1-Col3_Media20_P3_R1_LibF18_R2_001.fastq.gz [copy] -- attempt: 1; reason: All 0 retries failed. Waited a total of 0 ms between attempts
Jun-17 20:16:08.765 [PublishDir-2110] DEBUG nextflow.processor.PublishDir - Failed to publish file: /home/khushalip/auto-demux/tmp/work/20231220_LH00181_0010_A22FMTGLT3/7b/02f7a9bdae36a94babec6e07e0fe55/bcl_output/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement2-Plate1-Col6_Media12_P3_R2_LibK18_R2_001.fastq.gz; to: gs://illumina-auto-demux/NovaSeqX/20231220_LH00181_0010_A22FMTGLT3/bcl-convert/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement2-Plate1-Col6_Media12_P3_R2_LibK18_R2_001.fastq.gz [copy] -- attempt: 1; reason: All 0 retries failed. Waited a total of 0 ms between attempts
Jun-17 20:16:08.765 [PublishDir-2136] DEBUG nextflow.processor.PublishDir - Failed to publish file: /home/khushalip/auto-demux/tmp/work/20231220_LH00181_0010_A22FMTGLT3/7b/02f7a9bdae36a94babec6e07e0fe55/bcl_output/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement2-Plate1-Col9_Media12_P3_R1_LibA19_R2_001.fastq.gz; to: gs://illumina-auto-demux/NovaSeqX/20231220_LH00181_0010_A22FMTGLT3/bcl-convert/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement2-Plate1-Col9_Media12_P3_R1_LibA19_R2_001.fastq.gz [copy] -- attempt: 1; reason: Broken pipe
With v23, the resulting files (published to Google Cloud Storage from a local Linux server) are corrupted: only partial files arrive in the bucket. With v24, Nextflow throws an error before the publishing process completes.
This issue only seems to occur when many large files are published in parallel (~2-4 TB in total).
Expected behavior and actual behavior
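Expected: every file is published to the bucket completely and intact. Actual: see above (partial files with v23, an error before publishing completes with v24).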
Steps to reproduce the problem
Use publishDir with >2 TB of files transferred from an Ubuntu server to Google Cloud Storage. This seems to be a bandwidth issue, although the program output also reports java.lang.OutOfMemoryError: Java heap space. Note that the server has plenty of resources (128 cores and 750 GB of memory dedicated to just this Nextflow pipeline).
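The steps above could be reduced to something like the following sketch. It is not our actual pipeline: the bucket path, process name, and parameters are all hypothetical placeholders, and the file count and size are only guesses chosen to reach roughly 2 TB. The one detail taken from the logs is that a single task publishes many files in copy mode to a gs:// path.

```groovy
// Hypothetical minimal reproduction, not the original pipeline.
// Assumptions: the bucket name, file count, and file size are placeholders.
params.outdir  = 'gs://my-bucket/publish-test'  // hypothetical bucket
params.n_files = 200                            // assumed number of files
params.size_gb = 10                             // assumed size of each file in GiB

process MAKE_LARGE_FILES {
    // Copy every output file to the bucket, matching the "[copy]" mode in the logs
    publishDir params.outdir, mode: 'copy'

    output:
    path "out/*.bin"

    script:
    """
    mkdir -p out
    for i in \$(seq 1 ${params.n_files}); do
        # Write a file of random bytes (GNU head accepts the G size suffix)
        head -c ${params.size_gb}G /dev/urandom > out/file_\${i}.bin
    done
    """
}

workflow {
    MAKE_LARGE_FILES()
}
```

Running this with the Google Cloud credentials already configured on the server should trigger the same parallel publish step, though we have not confirmed that this reduced version fails in the same way.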
Program output
Jun-17 20:16:08.769 [PublishDir-2140] ERROR nextflow.processor.PublishDir - Failed to publish file: /home/khushalip/auto-demux/tmp/work/20231220_LH00181_0010_A22FMTGLT3/7b/02f7a9bdae36a94babec6e07e0fe55/bcl_output/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement2-Plate1-Col9_Media19_P3_R2_LibA20_R2_001.fastq.gz; to: gs://illumina-auto-demux/NovaSeqX/20231220_LH00181_0010_A22FMTGLT3/bcl-convert/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement2-Plate1-Col9_Media19_P3_R2_LibA20_R2_001.fastq.gz [copy] -- See log file for details
java.lang.OutOfMemoryError: Java heap space
Jun-17 20:16:08.768 [PublishDir-1166] ERROR nextflow.processor.PublishDir - Failed to publish file: /home/khushalip/auto-demux/tmp/work/20231220_LH00181_0010_A22FMTGLT3/7b/02f7a9bdae36a94babec6e07e0fe55/bcl_output/CZBMI_NICHE_StrainDropout/StrainDropout_Empty-Plate2-E12_Media20_P3_R1_LibJ24_R1_001.fastq.gz; to: gs://illumina-auto-demux/NovaSeqX/20231220_LH00181_0010_A22FMTGLT3/bcl-convert/CZBMI_NICHE_StrainDropout/StrainDropout_Empty-Plate2-E12_Media20_P3_R1_LibJ24_R1_001.fastq.gz [copy] -- See log file for details
java.lang.OutOfMemoryError: Java heap space
Jun-17 20:16:08.772 [PublishDir-2101] ERROR nextflow.processor.PublishDir - Failed to publish file: /home/khushalip/auto-demux/tmp/work/20231220_LH00181_0010_A22FMTGLT3/7b/02f7a9bdae36a94babec6e07e0fe55/bcl_output/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement2-Plate1-Col5_Media12_P3_R2_LibI18_R2_001.fastq.gz; to: gs://illumina-auto-demux/NovaSeqX/20231220_LH00181_0010_A22FMTGLT3/bcl-convert/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement2-Plate1-Col5_Media12_P3_R2_LibI18_R2_001.fastq.gz [copy] -- See log file for details
java.lang.OutOfMemoryError: Java heap space
at java.base/sun.nio.ch.Net.localAddress(Net.java:625)
at java.base/sun.nio.ch.NioSocketImpl.endConnect(NioSocketImpl.java:529)
at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:604)
at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
at java.base/java.net.Socket.connect(Socket.java:751)
at java.base/sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:304)
at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:178)
at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:531)
at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:636)
at java.base/sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at java.base/sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:377)
at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:193)
at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1237)
at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1123)
at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:179)
at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:141)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:151)
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1012)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:525)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:466)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:576)
at com.google.cloud.storage.spi.v1.HttpStorageRpc.get(HttpStorageRpc.java:509)
at com.google.cloud.storage.StorageImpl.lambda$get$6(StorageImpl.java:285)
at com.google.cloud.storage.StorageImpl$$Lambda/0x00007f78f84925b0.call(Unknown Source)
at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:103)
at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
at com.google.cloud.storage.Retrying.run(Retrying.java:54)
at com.google.cloud.storage.StorageImpl.run(StorageImpl.java:1406)
at com.google.cloud.storage.StorageImpl.get(StorageImpl.java:284)
at com.google.cloud.storage.StorageImpl.get(StorageImpl.java:290)
Jun-17 20:16:08.768 [PublishDir-758] ERROR nextflow.processor.PublishDir - Failed to publish file: /home/khushalip/auto-demux/tmp/work/20231220_LH00181_0010_A22FMTGLT3/7b/02f7a9bdae36a94babec6e07e0fe55/bcl_output/CZBMI_NICHE_StrainDropout/StrainDropout_89-Desulfovibrio-piger-ATCC-29098_Media20_P3_R3_LibN9_R1_001.fastq.gz; to: gs://illumina-auto-demux/NovaSeqX/20231220_LH00181_0010_A22FMTGLT3/bcl-convert/CZBMI_NICHE_StrainDropout/StrainDropout_89-Desulfovibrio-piger-ATCC-29098_Media20_P3_R3_LibN9_R1_001.fastq.gz [copy] -- See log file for details
java.lang.OutOfMemoryError: Java heap space
Jun-17 20:16:08.768 [PublishDir-2080] ERROR nextflow.processor.PublishDir - Failed to publish file: /home/khushalip/auto-demux/tmp/work/20231220_LH00181_0010_A22FMTGLT3/7b/02f7a9bdae36a94babec6e07e0fe55/bcl_output/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement2-Plate1-Col2_Media20_P3_R2_LibC18_R2_001.fastq.gz; to: gs://illumina-auto-demux/NovaSeqX/20231220_LH00181_0010_A22FMTGLT3/bcl-convert/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement2-Plate1-Col2_Media20_P3_R2_LibC18_R2_001.fastq.gz [copy] -- See log file for details
java.lang.OutOfMemoryError: Java heap space
Jun-17 20:16:08.770 [PublishDir-1927] ERROR nextflow.processor.PublishDir - Failed to publish file: /home/khushalip/auto-demux/tmp/work/20231220_LH00181_0010_A22FMTGLT3/7b/02f7a9bdae36a94babec6e07e0fe55/bcl_output/CZBMI_NICHE_StrainDropout/StrainDropout_91-Akkermansia-muciniphila-ATCC-BAA-835_Media19_P3_R2_LibG6_R2_001.fastq.gz; to: gs://illumina-auto-demux/NovaSeqX/20231220_LH00181_0010_A22FMTGLT3/bcl-convert/CZBMI_NICHE_StrainDropout/StrainDropout_91-Akkermansia-muciniphila-ATCC-BAA-835_Media19_P3_R2_LibG6_R2_001.fastq.gz [copy] -- See log file for details
java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.Arrays.copyOf(Arrays.java:3541)
at com.google.cloud.BaseWriteChannel.write(BaseWriteChannel.java:135)
at com.google.cloud.storage.contrib.nio.CloudStorageWriteChannel.write(CloudStorageWriteChannel.java:68)
at java.base/sun.nio.ch.FileChannelImpl.transferToArbitraryChannel(FileChannelImpl.java:729)
at java.base/sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:787)
at java.base/sun.nio.ch.ChannelInputStream.transfer(ChannelInputStream.java:283)
at java.base/sun.nio.ch.ChannelInputStream.transferTo(ChannelInputStream.java:250)
at java.base/java.nio.file.Files.copy(Files.java:3151)
at nextflow.file.CopyMoveHelper.copyFile(CopyMoveHelper.java:91)
at nextflow.file.CopyMoveHelper.copyToForeignTarget(CopyMoveHelper.java:172)
at nextflow.file.FileHelper.copyPath(FileHelper.groovy:962)
at nextflow.processor.PublishDir.processFileImpl(PublishDir.groovy:508)
at nextflow.processor.PublishDir.processFile(PublishDir.groovy:421)
at java.base/java.lang.invoke.LambdaForm$DMH/0x00007f78f84a9000.invokeVirtual(LambdaForm$DMH)
at java.base/java.lang.invoke.LambdaForm$MH/0x00007f78f874cc00.invoke(LambdaForm$MH)
at java.base/java.lang.invoke.LambdaForm$MH/0x00007f78f8192400.invokeExact_MT(LambdaForm$MH)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invokeImpl(DirectMethodHandleAccessor.java:155)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:343)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:328)
at groovy.lang.MetaClassImpl.doInvokeMethod(MetaClassImpl.java:1333)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1088)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1007)
at org.codehaus.groovy.runtime.InvokerHelper.invokePogoMethod(InvokerHelper.java:645)
at org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:628)
at org.codehaus.groovy.runtime.InvokerHelper.invokeMethodSafe(InvokerHelper.java:82)
at nextflow.processor.PublishDir$_retryableProcessFile_closure2.doCall(PublishDir.groovy:398)
at java.base/java.lang.invoke.DirectMethodHandle$Holder.invokeSpecial(DirectMethodHandle$Holder)
at java.base/java.lang.invoke.LambdaForm$MH/0x00007f78f874c400.invoke(LambdaForm$MH)
at java.base/java.lang.invoke.Invokers$Holder.invokeExact_MT(Invokers$Holder)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invokeImpl(DirectMethodHandleAccessor.java:154)
Jun-17 20:16:08.791 [PublishDir-1905] DEBUG nextflow.processor.PublishDir - Failed to publish file: /home/khushalip/auto-demux/tmp/work/20231220_LH00181_0010_A22FMTGLT3/7b/02f7a9bdae36a94babec6e07e0fe55/bcl_output/CZBMI_NICHE_StrainDropout/StrainDropout_89-Desulfovibrio-piger-ATCC-29098_Media19_P3_R1_LibM9_R2_001.fastq.gz; to: gs://illumina-auto-demux/NovaSeqX/20231220_LH00181_0010_A22FMTGLT3/bcl-convert/CZBMI_NICHE_StrainDropout/StrainDropout_89-Desulfovibrio-piger-ATCC-29098_Media19_P3_R1_LibM9_R2_001.fastq.gz [copy] -- attempt: 1; reason: All 0 retries failed. Waited a total of 0 ms between attempts
Jun-17 20:16:08.808 [PublishDir-2158] DEBUG nextflow.processor.PublishDir - Failed to publish file: /home/khushalip/auto-demux/tmp/work/20231220_LH00181_0010_A22FMTGLT3/7b/02f7a9bdae36a94babec6e07e0fe55/bcl_output/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement2-Plate2-Row11_Media19_P3_R2_LibE24_R2_001.fastq.gz; to: gs://illumina-auto-demux/NovaSeqX/20231220_LH00181_0010_A22FMTGLT3/bcl-convert/CZBMI_NICHE_StrainDropout/StrainDropout_Arrangement2-Plate2-Row11_Media19_P3_R2_LibE24_R2_001.fastq.gz [copy] -- attempt: 1; reason: All 0 retries failed. Waited a total of 0 ms between attempts
Environment
Nextflow version: 23 and 24.04.2
Java version: openjdk 21.0.0
Operating system: Ubuntu
Bash version: 5.1.16
Additional context
I initially posted this on Slack but did not receive an answer.