seqeralabs / nf-aggregate

Pipeline to aggregate pertinent metrics across pipeline runs on the Seqera Platform (beta)
https://seqera.io/
Mozilla Public License 2.0

Connection timed out error when running at scale #59

Open markpanganiban opened 3 months ago


### Description of the bug

Running nf-aggregate with more than 20 run IDs sometimes fails with a connection timeout. I tried tracing the API calls, but the timeouts do not consistently hit one specific endpoint.

Aug-01 13:29:26.497 [TaskFinalizer-1] DEBUG nextflow.util.ThreadPoolBuilder - Creating thread pool 'PublishDir' minSize=10; maxSize=36; workQueue=LinkedBlockingQueue[10000]; allowCoreThreadTimeout=false
Aug-01 13:31:29.957 [Actor Thread 8] WARN  nextflow.Nextflow -
Could not get workflow details for workflow 1FJPW8FdT638Ek in workspace Mark/mark-private:
    ↳ Status code null returned from request to https://api.cloud.seqera.io/orgs (authentication headers excluded)

Aug-01 13:31:29.960 [Actor Thread 8] ERROR nextflow.Nextflow - Exception: Connection timed out
wslite.rest.RESTClientException: Connection timed out
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
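
Since the timeout looks transient rather than tied to one endpoint, retrying the failed request with a growing delay might be enough to get past it. Below is a minimal sketch of that idea in Python, purely illustrative: `call_with_retries` and its parameters are hypothetical names, and this is not how nf-aggregate or the Platform client currently handles requests.

```python
import time


def call_with_retries(fn, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn() and retry on timeout/connection errors with exponential backoff.

    Hypothetical helper for illustration only. `fn` stands in for any
    zero-argument callable that performs one API request.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except (TimeoutError, ConnectionError):
            # Give up after the last allowed attempt and surface the error.
            if attempt == attempts - 1:
                raise
            # Wait 1x, 2x, 4x, ... the base delay before trying again.
            sleep(base_delay * 2 ** attempt)
```

With the defaults, a request that times out twice and then succeeds would wait 1 second and then 2 seconds before returning normally; a request that fails four times in a row still raises the original error.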

### Command used and terminal output

1. Create a runids1.csv input file with 50 or more run IDs.
2. Run the following command locally or on Seqera Platform.

nextflow run seqeralabs/nf-aggregate --input runids1.csv --outdir ./results -profile docker -w ./work/
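
To narrow down whether the number of run IDs is what triggers the timeout, the input file can be split into smaller batches and each batch run separately. Here is a rough sketch, assuming the first line of the CSV is a header row and every following line is one run; `split_run_ids` is a hypothetical helper, not part of nf-aggregate.

```python
import csv
from pathlib import Path


def split_run_ids(input_csv, out_dir, chunk_size=20):
    """Split a run ID CSV into files of at most chunk_size runs each.

    Assumption: the first row is a header and is copied into every chunk.
    Returns the list of chunk file paths that were written.
    """
    src = Path(input_csv)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    with src.open(newline="") as handle:
        rows = list(csv.reader(handle))
    if not rows:
        return []  # empty input, nothing to split

    header, body = rows[0], rows[1:]
    paths = []
    for start in range(0, len(body), chunk_size):
        # Name chunks after the input file: runids1_part1.csv, runids1_part2.csv, ...
        part = out / f"{src.stem}_part{start // chunk_size + 1}.csv"
        with part.open("w", newline="") as handle:
            writer = csv.writer(handle, lineterminator="\n")
            writer.writerow(header)
            writer.writerows(body[start:start + chunk_size])
        paths.append(part)
    return paths
```

Calling `split_run_ids("runids1.csv", "chunks")` would write one file per batch of 20 runs into `chunks/`, and each file can then be passed to `--input` in its own run. This only helps isolate the problem; it does not fix the underlying timeout.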

### Relevant files

_No response_

### System information

Aug-01 13:29:15.059 [main] DEBUG nextflow.cli.CmdRun - Version: 24.04.2 build 5914 Created: 29-05-2024 06:19 UTC (02:19 EDT) System: Linux 5.15.153.1-microsoft-standard-WSL2 Runtime: Groovy 4.0.21 on OpenJDK 64-Bit Server VM 17.0.6+10 Encoding: UTF-8 (UTF-8) Process: 501537@LAPTOP-CURHNF2E [127.0.1.1] CPUs: 12 - Mem: 15.5 GB (13.6 GB) - Swap: 4 GB (4 GB)



This happens with both the local executor and AWS Batch.