data-yaml / vivos

Versioned Interoperability to Velocitize Open Science
Apache License 2.0

2024-01-10 demo #23

Closed drernie closed 8 months ago

drernie commented 8 months ago

End to End: Raw Data to Analysis

Sequence

  1. FastQ Files
  2. AWS Workspace
  3. Storage Gateway
  4. Auto-Packager
  5. NF Tower [Omics]
  6. Email URI [can go to Slack]

NOTE: public website, do not mention any customer names

drernie commented 8 months ago
  1. Build package
  2. Use package name
  3. Use package URIs
  4. Synthesize samplesheet
  5. Get completion emails
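
A minimal sketch of step 4 (synthesizing a samplesheet from package URIs), assuming the Quilt+ URI form `quilt+s3://bucket#package=...&path=...`. The column names, bucket, package name, and file names are placeholders for illustration, not what the demo actually used.

```python
import csv
import io
from urllib.parse import quote


def quilt_uri(bucket: str, package: str, path: str) -> str:
    """Build a Quilt+ URI pointing at one file inside a package."""
    return (
        f"quilt+s3://{bucket}"
        f"#package={quote(package, safe='/')}"
        f"&path={quote(path, safe='/')}"
    )


def synthesize_samplesheet(bucket: str, package: str, samples) -> str:
    """Return CSV text; `samples` yields (sample_name, fastq_1, fastq_2)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Assumed header: check the pipeline's own samplesheet schema.
    writer.writerow(["sample", "fastq_1", "fastq_2"])
    for name, fq1, fq2 in samples:
        writer.writerow(
            [name, quilt_uri(bucket, package, fq1), quilt_uri(bucket, package, fq2)]
        )
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical bucket, package, and file names.
    print(
        synthesize_samplesheet(
            "example-bucket",
            "demo/raw",
            [("sampleA", "sampleA_R1.fastq.gz", "sampleA_R2.fastq.gz")],
        )
    )
```
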
drernie commented 8 months ago
  1. Can run Nextflow + nf-quilt w/SALES credentials
  2. Can NOT read Quilt+ URIs for samplesheet or inputs
  3. FAIL: using vivo-staging as scratch disk
  4. FAIL: http input and s3 output/!?
ERROR ~ Unknown method invocation `multiply` on Integer type
-- Check 'nf-3VMrm7uIc5HIBT.log' file for details
ERROR ~ publish failed:
-- Check 'nf-3VMrm7uIc5HIBT.log' file for details
FAILED: QuiltPackage.vivos_production_tower_hlatyping

Tower reports an error and the publish fails, but the run itself succeeds. And there is no LOG.

drernie commented 8 months ago

Jan-10 00:35:26.879 [main] ERROR nextflow.script.WorkflowMetadata - Failed to invoke workflow.onComplete event handler java.io.IOException: Cannot run program "mail": error=2, No such file or directory

THIS was due to providing an email address (Forge probably lacked the correct permission)
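
The `error=2, No such file or directory` in that message is ENOENT, i.e. the `mail` executable was not found where the handler ran. A quick way to check for it up front, as a sketch (the function name is made up here):

```python
import errno
import shutil


def mail_available(program: str = "mail") -> bool:
    """Return True if `program` can be found on the current PATH."""
    return shutil.which(program) is not None


if __name__ == "__main__":
    # errno 2 is ENOENT, matching "error=2" in the Java IOException.
    print("errno 2 is", errno.errorcode[2])
    print("mail on PATH:", mail_available())
```

If this prints `False` on the compute environment, the completion email cannot be sent from there regardless of permissions.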

drernie commented 8 months ago

s3://quilt-demos WORKS: https://demo.quiltdata.com/b/quilt-demos/tree/direct/hlatyping/

When I try it with vivo-pipes, I insanely get: `Error executing process > 'NFCORE_HLATYPING:HLATYPING:INPUT_CHECK:SAMPLESHEET_CHECK (samplesheet_full.csv)'`

Which is probably due to: at nextflow.processor.PublishDir.createPublishDir(PublishDir.groovy:471)

Full log: [nf-4JZDgKLIZL7JQA.log](https://github.com/data-yaml/vivos/files/13882729/nf-4JZDgKLIZL7JQA.log)

drernie commented 8 months ago
Jan-10 03:39:45.881 [Task monitor] DEBUG nextflow.file.FileHelper - Creating a file system instance for provider: S3FileSystemProvider
Jan-10 03:39:45.882 [Task monitor] DEBUG nextflow.cloud.aws.config.AwsConfig - AWS S3 config properties: {upload_chunk_size=10485760, max_error_retry=10}
Jan-10 03:39:45.885 [Task monitor] DEBUG nextflow.cloud.aws.nio.S3Client - Setting S3 upload chunk size=10485760
Jan-10 03:39:45.885 [Task monitor] DEBUG nextflow.cloud.aws.nio.S3Client - Setting S3 glacierRetrievalTier=null
Jan-10 03:39:46.023 [Task monitor] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=NFCORE_HLATYPING:HLATYPING:INPUT_CHECK:SAMPLESHEET_CHECK (samplesheet_full.csv); work-dir=s3://quilt-demos/scratch/4JZDgKLIZL7JQA/73/def5d7fb0cdc50f01f9428cfcaccba
  error [com.amazonaws.services.s3.model.AmazonS3Exception]: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: YN19P6RK3AQ5G8YA; S3 Extended Request ID: gxIWFIIh2ouRqoa53HJrSeukFul07DmsSL+rmpJQzN+oH7L0ZA2YpSz6j+AhPreU93aJ9v2MRn+fcTXfQTlj319hrsGqdZJmFMboOkislYw=; Proxy: null)
Jan-10 03:39:46.053 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'NFCORE_HLATYPING:HLATYPING:INPUT_CHECK:SAMPLESHEET_CHECK (samplesheet_full.csv)'

Caused by:
  Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: YN19P6RK3AQ5G8YA; S3 Extended Request ID: gxIWFIIh2ouRqoa53HJrSeukFul07DmsSL+rmpJQzN+oH7L0ZA2YpSz6j+AhPreU93aJ9v2MRn+fcTXfQTlj319hrsGqdZJmFMboOkislYw=; Proxy: null)

com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: YN19P6RK3AQ5G8YA; S3 Extended Request ID: gxIWFIIh2ouRqoa53HJrSeukFul07DmsSL+rmpJQzN+oH7L0ZA2YpSz6j+AhPreU93aJ9v2MRn+fcTXfQTlj319hrsGqdZJmFMboOkislYw=; Proxy: null)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1879)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1418)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1387)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403)
    at com.amazonaws.services.s3.AmazonS3Client.access$300(AmazonS3Client.java:421)
    at com.amazonaws.services.s3.AmazonS3Client$PutObjectStrategy.invokeServiceCall(AmazonS3Client.java:6532)
    at com.amazonaws.services.s3.AmazonS3Client.uploadObject(AmazonS3Client.java:1861)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1821)
    at nextflow.cloud.aws.nio.S3Client.putObject(S3Client.java:210)
    at nextflow.cloud.aws.nio.S3FileSystemProvider.createDirectory(S3FileSystemProvider.java:488)
    at java.base/java.nio.file.Files.createDirectory(Files.java:700)
    at java.base/java.nio.file.Files.createAndCheckIsDirectory(Files.java:807)
    at java.base/java.nio.file.Files.createDirectories(Files.java:753)
    at nextflow.processor.PublishDir.makeDirs(PublishDir.groovy:480)
    at nextflow.processor.PublishDir.createPublishDir(PublishDir.groovy:471)
    at nextflow.processor.PublishDir.apply0(PublishDir.groovy:209)
    at nextflow.processor.PublishDir.apply(PublishDir.groovy:286)
    at nextflow.processor.PublishDir$apply.call(Unknown Source)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:148)
    at nextflow.processor.TaskProcessor.publishOutputs0(TaskProcessor.groovy:1362)
    at nextflow.processor.TaskProcessor.publishOutputs(TaskProcessor.groovy:1337)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:48)
    at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:189)
    at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:57)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:51)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:171)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:185)
    at nextflow.processor.TaskProcessor.finalizeTask0(TaskProcessor.groovy:2337)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:48)
    at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:189)
    at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:57)
    at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:51)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:171)
    at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:185)
    at nextflow.processor.TaskProcessor.finalizeTask(TaskProcessor.groovy:2308)
    at nextflow.processor.TaskPollingMonitor.checkTaskStatus(TaskPollingMonitor.groovy:631)
    at nextflow.processor.TaskPollingMonitor.checkAllTasks(TaskPollingMonitor.groovy:537)
    at nextflow.processor.TaskPollingMonitor.pollLoop(TaskPollingMonitor.groovy:412)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1254)
    at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1030)
    at org.codehaus.groovy.runtime.InvokerHelper.invokePogoMethod(InvokerHelper.java:1036)
    at org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:1019)
    at org.codehaus.groovy.runtime.InvokerHelper.invokeMethodSafe(InvokerHelper.java:97)
    at nextflow.processor.TaskPollingMonitor$_start_closure2.doCall(TaskPollingMonitor.groovy:293)
    at nextflow.processor.TaskPollingMonitor$_start_closure2.call(TaskPollingMonitor.groovy)
    at groovy.lang.Closure.run(Closure.java:498)
    at java.base/java.lang.Thread.run(Thread.java:833)
Jan-10 03:39:46.068 [Task monitor] DEBUG nextflow.Session - Session aborted -- Cause: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: YN19P6RK3AQ5G8YA; S3 Extended Request ID: gxIWFIIh2ouRqoa53HJrSeukFul07DmsSL+rmpJQzN+oH7L0ZA2YpSz6j+AhPreU93aJ9v2MRn+fcTXfQTlj319hrsGqdZJmFMboOkislYw=; Proxy: null)
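
To pull the useful fields out of a wall of log like the one above, a small sketch that extracts the status code, error code, and request ID from the `AmazonS3Exception` message. The regex is written against the message format shown in this log only; other SDK versions may format it differently.

```python
import re

# Matches the "(Service: ...; Status Code: ...; Error Code: ...; Request ID: ...;" part.
S3_ERROR = re.compile(
    r"\(Service: (?P<service>[^;]+); "
    r"Status Code: (?P<status>\d+); "
    r"Error Code: (?P<code>[^;]+); "
    r"Request ID: (?P<request_id>[^;]+);"
)


def parse_s3_error(line: str):
    """Return a dict of S3 error fields, or None if the line has none."""
    match = S3_ERROR.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    return fields


if __name__ == "__main__":
    sample = (
        "Access Denied (Service: Amazon S3; Status Code: 403; "
        "Error Code: AccessDenied; Request ID: YN19P6RK3AQ5G8YA; Proxy: null)"
    )
    print(parse_s3_error(sample))
```

In the trace above the denied call is `putObject`, reached from `PublishDir.makeDirs`, so the missing permission is on writing to the publish location.
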
drernie commented 8 months ago

AH. CDK BUCKETS are created with a "Deny ALL Principals" policy statement.
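
An explicit Deny in a bucket policy overrides any Allow, so it is worth listing exactly which Deny statements target every principal and under what condition. A sketch that scans a bucket policy document (as JSON text) for them; the sample policy below is invented for illustration and is not the actual policy on these buckets.

```python
import json


def wildcard_denies(policy_json: str):
    """List Deny statements whose Principal is '*' (or AWS: '*')."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    hits = []
    for stmt in statements:
        if stmt.get("Effect") != "Deny":
            continue
        principal = stmt.get("Principal")
        aws = principal.get("AWS") if isinstance(principal, dict) else None
        is_wildcard = (
            principal == "*"
            or aws == "*"
            or (isinstance(aws, list) and "*" in aws)
        )
        if is_wildcard:
            hits.append(
                {
                    "Sid": stmt.get("Sid"),
                    "Action": stmt.get("Action"),
                    "Condition": stmt.get("Condition"),
                }
            )
    return hits


if __name__ == "__main__":
    # Hypothetical policy: one conditional Deny for all principals, one Allow.
    sample = json.dumps(
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Sid": "DenyInsecure",
                    "Effect": "Deny",
                    "Principal": {"AWS": "*"},
                    "Action": "s3:*",
                    "Resource": "arn:aws:s3:::example-bucket/*",
                    "Condition": {"Bool": {"aws:SecureTransport": "false"}},
                },
                {
                    "Sid": "AllowRead",
                    "Effect": "Allow",
                    "Principal": "*",
                    "Action": "s3:GetObject",
                    "Resource": "arn:aws:s3:::example-bucket/*",
                },
            ],
        }
    )
    for hit in wildcard_denies(sample):
        print(hit)
```

A Deny with a `Condition` only applies when that condition matches, so the condition is printed alongside each hit.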

drernie commented 8 months ago

Making them public, AND granting hackathon-shared principal access does NOT help. Trying DRIFT: manually deleting 'Deny' to see if that solves the problem.

NOPE: Even manually deleting the DENY does NOT allow direct S3 access. WTF?

Plan B: Enable sales-production bucket?!?

ALSO: quilt-demos has proper permissions, but STILL fails on nf-plugin: "ERROR ~ Unknown method invocation multiply on Integer type"

drernie commented 8 months ago

AH. That's a ~known bug with task-attempt scale checks:

    cpus = { check_max( 1 * task.attempt, 'cpus' ) }
    memory = { check_max( 6.GB * task.attempt, 'memory' ) }
    time = { check_max( 4.h * task.attempt, 'time' ) }

versus

    cpus = 20
    memory = 72.GB
    time = 96.h

DOH. Is this because I used test_full (which uses base) versus [test](https://github.com/nf-core/hlatyping/blob/master/conf/test.config) profile (which overrides)?
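
For reference, the arithmetic behind the first (scaled) style, as a rough Python sketch. It assumes `check_max` simply clamps the request to a configured ceiling; the real nf-core helper is Groovy and works on memory and duration types, and the ceiling values below are made up. It does not reproduce the `multiply` error itself.

```python
def check_max(requested, limit):
    """Clamp a requested resource to the configured ceiling."""
    return min(requested, limit)


def scaled_resources(attempt, max_cpus, max_memory_gb, max_time_h):
    """Mirror the base config: each request grows with the retry attempt."""
    return {
        "cpus": check_max(1 * attempt, max_cpus),
        "memory_gb": check_max(6 * attempt, max_memory_gb),
        "time_h": check_max(4 * attempt, max_time_h),
    }


if __name__ == "__main__":
    # Hypothetical ceilings: 16 CPUs, 128 GB, 240 h.
    for attempt in (1, 2, 3):
        print(attempt, scaled_resources(attempt, 16, 128, 240))
```

The fixed style (`cpus = 20`, `memory = 72.GB`, `time = 96.h`) skips both the scaling and the clamp, so nothing is multiplied by `task.attempt`.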

drernie commented 8 months ago

So... getting rid of test_full removes the error. Now it runs without error, and then nothing happens:

Jan-10 04:49:23.196 [PublishDir-3] DEBUG nextflow.quilt.QuiltObserver - onFilePublish.Path[quilt-demos#package=direct%2fhlatyping&path=multiqc%2fmultiqc_plots]
Jan-10 04:49:23.196 [PublishDir-3] DEBUG nextflow.quilt.QuiltObserver - checkPath[quilt-demos#package=direct%2fhlatyping&path=multiqc%2fmultiqc_plots] published[true]
Jan-10 04:49:23.196 [main] DEBUG nextflow.util.ThreadPoolManager - Thread pool 'PublishDir' shutdown completed (hard=false)
Jan-10 04:49:23.198 [main] INFO  nextflow.Nextflow - -[nf-core/hlatyping] Pipeline completed successfully-
Jan-10 04:49:23.204 [main] DEBUG n.trace.WorkflowStatsObserver - Workflow completed > WorkflowStats[succeededCount=17; failedCount=0; ignoredCount=0; cachedCount=0; pendingCount=0; submittedCount=0; runningCount=0; retriesCount=0; abortedCount=0; succeedDuration=11m 52s; failedDuration=0ms; cachedDuration=0ms;loadCpus=0; loadMemory=0; peakRunning=4; peakCpus=20; peakMemory=120 GB; ]
Jan-10 04:49:23.205 [main] DEBUG nextflow.trace.TraceFileObserver - Workflow completed -- saving trace file
Jan-10 04:49:23.207 [main] DEBUG nextflow.trace.ReportObserver - Workflow completed -- rendering execution report
Jan-10 04:49:24.385 [main] DEBUG nextflow.trace.TimelineObserver - Workflow completed -- rendering execution timeline
Jan-10 04:50:26.840 [tower-logs-checkpoint] DEBUG n.cloud.aws.nio.S3FileSystemProvider - S3 upload file from=nf-4xbsNXg1LhGfdf.txt to=s3://quilt-demos/scratch/4xbsNXg1LhGfdf/nf-4xbsNXg1LhGfdf.txt
Jan-10 04:50:27.078 [tower-logs-checkpoint] DEBUG n.cloud.aws.nio.S3FileSystemProvider - S3 upload file from=nf-4xbsNXg1LhGfdf.log to=s3://quilt-demos/scratch/4xbsNXg1LhGfdf/nf-4xbsNXg1LhGfdf.log
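
To read the `QuiltObserver` lines above, a sketch that decodes the `bucket#package=...&path=...` form printed in the log into its parts. This parses the string as it appears here and is not nf-quilt's own parser.

```python
from urllib.parse import parse_qs


def parse_quilt_path(text: str) -> dict:
    """Split 'bucket#package=...&path=...' and percent-decode the values."""
    bucket, _, fragment = text.partition("#")
    params = parse_qs(fragment)  # parse_qs also decodes %2f to '/'
    return {
        "bucket": bucket,
        "package": params.get("package", [None])[0],
        "path": params.get("path", [None])[0],
    }


if __name__ == "__main__":
    logged = "quilt-demos#package=direct%2fhlatyping&path=multiqc%2fmultiqc_plots"
    print(parse_quilt_path(logged))
```

For the logged value this gives bucket `quilt-demos`, package `direct/hlatyping`, and path `multiqc/multiqc_plots`, which matches the package URL that works.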

I hate my life...