Closed wleepang closed 2 years ago
After a bit of testing, I believe the offending LOC is here: https://github.com/aws/amazon-genomics-cli/blob/a51f57ff1f825aaf06fe7c74409e91ff74dd9061/packages/wes_adapter/amazon_genomics/wes/adapters/NextflowWESAdapter.py#L139
Changing the value to the hard limit of 10000 would be an easy short-term fix, but a workflow with more than 10,000 tasks may come along and will need a better solution.
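One possible shape for that better solution is to page through results instead of relying on a single capped request. The following is a minimal sketch, not the adapter's actual code: it assumes the underlying log query returns pages shaped like CloudWatch Logs responses (an "events" list plus an optional "nextToken"), and `fetch_page` and `iter_all_events` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Dict, Iterator, Optional

# Per-request hard limit mentioned above.
MAX_PAGE_SIZE = 10_000

# Hypothetical callable: takes (limit, next_token) and returns a dict with
# an "events" list and, when more results remain, a "nextToken" string.
FetchPage = Callable[[int, Optional[str]], Dict]


def iter_all_events(fetch_page: FetchPage,
                    page_size: int = MAX_PAGE_SIZE) -> Iterator[dict]:
    """Yield every event by following nextToken until no token is returned."""
    token: Optional[str] = None
    while True:
        page = fetch_page(page_size, token)
        yield from page.get("events", [])
        token = page.get("nextToken")
        if not token:
            break
```

With boto3, `fetch_page` could wrap the CloudWatch Logs client's `filter_log_events` call, passing `nextToken` only when it is not None. Whether the adapter queries logs this way is an assumption here.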
Also worth pointing out that the log entries queried are generated after the workflow is complete. The Nextflow head job container dumps the contents of the .nextflow.log file as a cleanup step. Therefore, task activity is not available while a workflow is still running, and is potentially lost if the Nextflow head job fails and does not perform cleanup.
Attempting to retrieve the logs for a Nextflow workflow while it is still running produces an error:
$ agc logs workflow test-simple-1000 -r 1fe54a99-5431-493c-b7f6-f0af6cffd8fa
2022-03-17T23:43:00Z 𝒊 Showing the logs for 'test-simple-1000'
2022-03-17T23:43:00Z ✘ error="invalid character 'e' looking for beginning of value"
Error: an error occurred invoking 'logs workflow'
with variables: {logsSharedVars:{tail:false contextName: startString: endString: lookBack: filter:} workflowName:test-simple-1000 runId:1fe54a99-5431-493c-b7f6-f0af6cffd8fa taskId: allTasks:false failedTasks:false}
caused by: invalid character 'e' looking for beginning of value
Describe the Bug
The WES adapter and endpoint for contexts that use nextflow as an engine return details for only up to 100 tasks in the WES GetRunLog response. Consequently, agc logs workflow output shows at most 100 tasks.
Steps to Reproduce
I created a workflow that generates 1000 tasks in both wdl and nextflow.
wdl workflow
nextflow workflow
I created an agc project for these workflows with the following file structure:
The project configuration is:
Deploy contexts:
Running both workflows:
Retrieving the logs from the workflows after they complete and counting how many lines are returned in the output:
Accounting for the 5 lines that are part of the header in the agc logs workflow response, the WDL workflow has all 1000 tasks reported, while the nextflow workflow only has 100 tasks reported.
Using awscurl to retrieve the WES GetRunLog response directly for the onDemandCtxWdl context:
The above returns
Using awscurl to retrieve the WES GetRunLog response directly for the onDemandCtxNextflow context:
The above returns
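To compare the two contexts without reading the responses by eye, the number of entries in task_logs can be counted directly. This is a small sketch that assumes the response body has been saved as a JSON string and follows the WES RunLog shape with a task_logs array; `count_task_logs` is a hypothetical helper, not part of agc.

```python
import json


def count_task_logs(response_body: str) -> int:
    """Return the number of entries in the task_logs array of a RunLog."""
    run_log = json.loads(response_body)
    # Treat a missing or null task_logs field as zero tasks.
    return len(run_log.get("task_logs") or [])
```

For this issue, the expectation would be a count of 1000 for both contexts, rather than 100 for the nextflow context.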
Relevant Logs
Expected Behavior
The WES GetRunLog response for nextflow contexts includes all task instances in .task_logs if there are more than 100 tasks.
agc logs workflow output for nextflow workflows shows all task instances for the workflow if there are more than 100 tasks.
Actual Behavior
agc logs workflow for nextflow workflows returns only up to 100 task instances.
Screenshots
Additional Context
Operating System: Linux
AGC Version: 1.2.0
Was AGC setup with a custom bucket: No
Was AGC setup with a custom VPC: No