awslabs / genomics-secondary-analysis-using-aws-step-functions-and-aws-batch

This solution provides a framework for Next Generation Sequencing (NGS) genomics secondary-analysis pipelines using AWS Step Functions and AWS Batch.
https://aws.amazon.com/solutions/implementations/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch/
Apache License 2.0

Error building when modifying the solution #6

Open spitfiredd opened 3 years ago

spitfiredd commented 3 years ago

I can get this to run unmodified; however, I made a few modifications:

  1. Added additional Docker images (tested locally, and these build correctly). If I don't delete on stack failure, these images are present.
  2. Added additional Batch jobs for the Docker images.
  3. Removed the sections of the code that upload the sample data.

I updated the policy for the sample bucket to:

      Policies:
        - PolicyName: S3Access
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - s3:ListBucket
                  - s3:GetObject
                  - s3:GetObjectVersion
                  - s3:PutObject
                Resource:
                  - !Sub arn:aws:s3:::${JobResultsBucket}
                  - !Sub arn:aws:s3:::${JobResultsBucket}/*
              - Effect: Allow
                Action:
                  - s3:ListBucket
                  - s3:GetObject
                Resource:
                  - arn:aws:s3:::my_data_folder
                  - arn:aws:s3:::my_data_folder/*
                  # - !Sub arn:aws:s3:::${SamplesBucket}
                  # - !Sub arn:aws:s3:::${SamplesBucket}/*

I get the following error when building and I am unclear what it means or how to debug it.

2021-09-07T08:53:14.924-07:00   ++ get-repo-url GenomicsWorkflowPipe
2021-09-07T08:53:14.924-07:00   ++ local stack_name=GenomicsWorkflowPipe
2021-09-07T08:53:14.924-07:00   +++ aws cloudformation describe-stacks --stack-name GenomicsWorkflowPipe --query 'Stacks[].Outputs[?OutputKey==`RepoCloneUrl`].OutputValue' --output text
2021-09-07T08:53:14.924-07:00   ++ local url=https://git-codecommit.us-west-1.amazonaws.com/v1/repos/GenomicsWorkflowCode
2021-09-07T08:53:14.924-07:00   ++ echo https://git-codecommit.us-west-1.amazonaws.com/v1/repos/GenomicsWorkflowCode
2021-09-07T08:53:14.924-07:00   + git remote add origin https://git-codecommit.us-west-1.amazonaws.com/v1/repos/GenomicsWorkflowCode
2021-09-07T08:53:14.924-07:00   + git push -u origin master
2021-09-07T08:53:18.981-07:00   To https://git-codecommit.us-west-1.amazonaws.com/v1/repos/GenomicsWorkflowCode
2021-09-07T08:53:18.981-07:00   * [new branch] master -> master
2021-09-07T08:53:18.981-07:00   Branch 'master' set up to track remote branch 'master' from 'origin'.
2021-09-07T08:53:18.981-07:00   + wait-for-stack GenomicsWorkflowCode
2021-09-07T08:53:18.981-07:00   + local stack_name=GenomicsWorkflowCode
2021-09-07T08:53:18.981-07:00   + local exists_attempts=6
2021-09-07T08:53:18.981-07:00   + local status=0
2021-09-07T08:53:18.981-07:00   + set +e
2021-09-07T08:53:18.981-07:00   + echo 'Creating stack: GenomicsWorkflowCode'
2021-09-07T08:53:18.981-07:00   Creating stack: GenomicsWorkflowCode
2021-09-07T08:53:18.981-07:00   + (( attempt=1 ))
2021-09-07T08:53:18.981-07:00   + (( attempt<=6 ))
2021-09-07T08:53:18.981-07:00   + echo 'Waiting for stack creation - attempt: 1'
2021-09-07T08:53:18.981-07:00   Waiting for stack creation - attempt: 1
2021-09-07T08:53:18.981-07:00   + aws cloudformation wait stack-exists --stack-name GenomicsWorkflowCode
2021-09-07T08:54:55.034-07:00   
2021-09-07T08:54:55.034-07:00   Waiter StackExists failed: Max attempts exceeded. Previously accepted state: Matched expected service error code: ValidationError
2021-09-07T08:54:55.034-07:00   + status=255
2021-09-07T08:54:55.034-07:00   + '[' 255 -eq 0 ']'
2021-09-07T08:54:55.034-07:00   + (( attempt++ ))
2021-09-07T08:54:55.034-07:00   + (( attempt<=6 ))
2021-09-07T08:54:55.034-07:00   + echo 'Waiting for stack creation - attempt: 2'
2021-09-07T08:54:55.034-07:00   Waiting for stack creation - attempt: 2
2021-09-07T08:54:55.034-07:00   + aws cloudformation wait stack-exists --stack-name GenomicsWorkflowCode
2021-09-07T08:56:31.112-07:00   
2021-09-07T08:56:31.112-07:00   Waiter StackExists failed: Max attempts exceeded. Previously accepted state: Matched expected service error code: ValidationError
2021-09-07T08:56:31.112-07:00   + status=255
2021-09-07T08:56:31.112-07:00   + '[' 255 -eq 0 ']'
2021-09-07T08:56:31.112-07:00   + (( attempt++ ))
2021-09-07T08:56:31.112-07:00   + (( attempt<=6 ))
2021-09-07T08:56:31.112-07:00   + echo 'Waiting for stack creation - attempt: 3'
2021-09-07T08:56:31.112-07:00   Waiting for stack creation - attempt: 3
2021-09-07T08:56:31.112-07:00   + aws cloudformation wait stack-exists --stack-name GenomicsWorkflowCode
2021-09-07T08:58:07.196-07:00   
2021-09-07T08:58:07.196-07:00   Waiter StackExists failed: Max attempts exceeded. Previously accepted state: Matched expected service error code: ValidationError
2021-09-07T08:58:07.196-07:00   + status=255
2021-09-07T08:58:07.196-07:00   + '[' 255 -eq 0 ']'
2021-09-07T08:58:07.196-07:00   + (( attempt++ ))
2021-09-07T08:58:07.196-07:00   + (( attempt<=6 ))
2021-09-07T08:58:07.196-07:00   + echo 'Waiting for stack creation - attempt: 4'
2021-09-07T08:58:07.196-07:00   Waiting for stack creation - attempt: 4
2021-09-07T08:58:07.196-07:00   + aws cloudformation wait stack-exists --stack-name GenomicsWorkflowCode
2021-09-07T08:59:43.284-07:00   
2021-09-07T08:59:43.284-07:00   Waiter StackExists failed: Max attempts exceeded. Previously accepted state: Matched expected service error code: ValidationError
2021-09-07T08:59:43.284-07:00   + status=255
2021-09-07T08:59:43.284-07:00   + '[' 255 -eq 0 ']'
2021-09-07T08:59:43.284-07:00   + (( attempt++ ))
2021-09-07T08:59:43.284-07:00   + (( attempt<=6 ))
2021-09-07T08:59:43.284-07:00   + echo 'Waiting for stack creation - attempt: 5'
2021-09-07T08:59:43.284-07:00   Waiting for stack creation - attempt: 5
2021-09-07T08:59:43.284-07:00   + aws cloudformation wait stack-exists --stack-name GenomicsWorkflowCode
2021-09-07T09:01:21.383-07:00   
2021-09-07T09:01:21.383-07:00   Waiter StackExists failed: Max attempts exceeded. Previously accepted state: Matched expected service error code: ValidationError
2021-09-07T09:01:21.383-07:00   + status=255
2021-09-07T09:01:21.383-07:00   + '[' 255 -eq 0 ']'
2021-09-07T09:01:21.383-07:00   + (( attempt++ ))
2021-09-07T09:01:21.383-07:00   + (( attempt<=6 ))
2021-09-07T09:01:21.383-07:00   + echo 'Waiting for stack creation - attempt: 6'
2021-09-07T09:01:21.383-07:00   Waiting for stack creation - attempt: 6
2021-09-07T09:01:21.383-07:00   + aws cloudformation wait stack-exists --stack-name GenomicsWorkflowCode
2021-09-07T09:02:58.238-07:00   
2021-09-07T09:02:58.238-07:00   Waiter StackExists failed: Max attempts exceeded. Previously accepted state: Matched expected service error code: ValidationError
2021-09-07T09:02:58.238-07:00   + status=255
2021-09-07T09:02:58.238-07:00   + '[' 255 -eq 0 ']'
2021-09-07T09:02:58.238-07:00   + (( attempt++ ))
2021-09-07T09:02:58.238-07:00   + (( attempt<=6 ))
2021-09-07T09:02:58.238-07:00   + '[' '!' 255 -eq 0 ']'
2021-09-07T09:02:58.238-07:00   + echo '[ERROR] Stack creation could not be started.'
2021-09-07T09:02:58.238-07:00   [ERROR] Stack creation could not be started.
2021-09-07T09:02:58.238-07:00   + return 255
2021-09-07T09:02:58.238-07:00   + status=255
2021-09-07T09:02:58.238-07:00   + set -e
2021-09-07T09:02:58.238-07:00   + '[' '!' 255 -eq 0 ']'
2021-09-07T09:02:58.238-07:00   + echo '[ERROR] GenomicsWorkflowCode Stack FAILED'
2021-09-07T09:02:58.238-07:00   [ERROR] GenomicsWorkflowCode Stack FAILED
2021-09-07T09:02:58.238-07:00   + exit 255
2021-09-07T09:02:58.238-07:00   
2021-09-07T09:02:58.238-07:00   [Container] 2021/09/07 16:02:56 Command did not exit successfully ./setup/$SOLUTION_ACTION.sh exit status 255
2021-09-07T09:02:58.238-07:00   [Container] 2021/09/07 16:02:56 Phase complete: INSTALL State: FAILED
                               [Container] 2021/09/07 16:02:56 Phase context status code: COMMAND_EXECUTION_ERROR Message: Error while executing command: ./setup/$SOLUTION_ACTION.sh. 
                               Reason: exit status 255
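
The retry behaviour visible in the trace above can be sketched as follows. This is a reconstruction from the xtrace output only, not the repository's actual setup script; the function name `wait_for_stack_exists` is hypothetical. Each call to `aws cloudformation wait stack-exists` polls until the stack exists or the waiter gives up, and a `ValidationError` from that waiter means the stack does not exist yet.

```shell
# Sketch of the outer retry loop seen in the trace (names are illustrative).
wait_for_stack_exists() {
    local stack_name=$1
    local max_attempts=${2:-6}   # the trace shows 6 outer attempts
    local attempt=1

    while [ "$attempt" -le "$max_attempts" ]; do
        echo "Waiting for stack creation - attempt: $attempt"
        # The waiter exits non-zero when it runs out of polls.
        if aws cloudformation wait stack-exists --stack-name "$stack_name"; then
            return 0
        fi
        attempt=$((attempt + 1))
    done

    echo "[ERROR] Stack creation could not be started."
    return 255
}
```

With the AWS CLI on the PATH, `wait_for_stack_exists GenomicsWorkflowCode` would repeat the six attempts shown in the log. Because every attempt ended with the stack still missing, the cause is more likely in whatever was supposed to create the stack than in the wait loop itself.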
spitfiredd commented 3 years ago

Upon digging through the logs I found:

denied: User: arn:aws:sts::<ACCOUNTID>:assumed-role/DataQualityWorkflowsPipe-IamRoles-JC-CodeBuildRole-27UMBE2B38IO/AWSCodeBuild-5f5cca70-b5d1-4072-abac-ab48b3d387ed is not authorized to perform: ecr:CompleteLayerUpload on resource: arn:aws:ecr:us-west-1:<ACCOUNTID>:repository/dataqualityworkflows-spades

I've added 5 tools (fastp, fastqc, megahit, spades and bbtools). The others push to ECR, but spades does not, and I am not sure why. Any assistance would be greatly appreciated.
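
One way to investigate a denial like this is to ask IAM directly whether the role's policies allow the action. The sketch below is only an illustration: the helper names are hypothetical, the ARN conversion assumes the role has no IAM path (the assumed-role ARN does not carry one), and `simulate-principal-policy` evaluates the role's identity-based policies, so it will not reflect a repository policy on the ECR side.

```shell
# Turn the STS assumed-role ARN from the "denied" message into the IAM role ARN.
# Assumption: the role was created without an IAM path.
role_arn_from_assumed_role() {
    local assumed=$1
    local partition account role
    partition=$(printf '%s\n' "$assumed" | cut -d: -f2)
    account=$(printf '%s\n' "$assumed" | cut -d: -f5)
    role=$(printf '%s\n' "$assumed" | cut -d: -f6 | cut -d/ -f2)
    printf 'arn:%s:iam::%s:role/%s\n' "$partition" "$account" "$role"
}

# Print the simulated decision for one action on one repository.
check_ecr_action() {
    local role_arn=$1 action=$2 repo_arn=$3
    aws iam simulate-principal-policy \
        --policy-source-arn "$role_arn" \
        --action-names "$action" \
        --resource-arns "$repo_arn" \
        --query 'EvaluationResults[].[EvalActionName,EvalDecision]' \
        --output text
}
```

For example, passing the ARN from the error message to `role_arn_from_assumed_role`, and then calling `check_ecr_action` with `ecr:CompleteLayerUpload` and the spades repository ARN, should show whether the CodeBuild role is allowed on that specific repository. Running the same check against a repository that does push would make any difference visible.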

Here are the sections of the YAML files I created.

StackBuildContainerSpades:
  Type: "AWS::CloudFormation::Stack"
  Properties:
    Parameters:
      Project: !Ref ProjectLowerCase
      ImageName: spades
      ImageTag: "3.15.3"
      BuildSpec: ./containers/buildspec.yml
      ProjectPath: ./containers/spades
      CodeBuildRoleArn: !GetAtt IamRoles.Outputs.CodeBuildRoleArn
      UseProjectPrefix: "yes"
    TemplateURL:
      Fn::Sub:
        - ${TemplateRootUrl}/container-buildproject.cfn.yaml
        - TemplateRootUrl:
            Fn::Sub:
              - "https://${ZoneBucket}.s3.${AWS::Region}.amazonaws.com/zone"
              - ZoneBucket:
                  Fn::ImportValue:
                    !Sub ${ZoneStackName}-ZoneBucket
...
- Name: Spades
  ActionTypeId:
    Category: Build
    Owner: AWS
    Provider: CodeBuild
    Version: "1"
  Configuration:
    ProjectName: !GetAtt "StackBuildContainerSpades.Outputs.Name"
  InputArtifacts:
  - Name: SourceStageOutput
sachalau commented 3 years ago

Hey Daniel,

I'm not the developer of this solution, but I think the developers did not plan for it to be used that way.

See issue: https://github.com/awslabs/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch/issues/2

Am I right that you are trying to directly modify the files that are present in this repo?

Here is how I added my private ECR images, which I think is how the developers would rather it be done:

First, deploy the stacks using the files provided in this repo, without modification, which I think you have already managed.

You will then have two repos in CodeCommit: "Code" and "Pipe". You should clone these repos and make your own customizations there.

"Pipe" is the repo that deploys the resources that "Code" needs in order to operate. If you look in CodePipeline, you will see a pipeline that, for the moment, only builds the code and the Docker images defined in the vanilla project. You must modify it so that your new Docker images are built as well.

To do so, modify main.cfn.yaml in the "Pipe" CodeCommit repo; that is where you add your "StackBuildContainerSpades". Then, at the end of the same file, modify the CodePipeline definition so that the new stack is included in the build phase. Once pushed, you will see that the pipeline now has the not-yet-built Spades block in the build phase.

Now add a new folder to the "Code" repo, containers/spades/, and write the Dockerfile there. If everything is in order, the next time the "Code" pipeline runs, this file will be read and the spades container will be built and pushed to ECR. As this use case is already planned for in the vanilla project, you should not need to modify any IAM role (all ECR rights are already included in the CodeBuildSeviceRole of the "Pipe" repo). You will, however, have to define the Batch job definition based on the spades container in main.cfn.yaml.
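
Before pushing, the folder layout described above can be checked locally. This is a minimal sketch that assumes each tool lives at containers/<tool>/Dockerfile inside a checkout of the "Code" repo, as described for spades; the function name `missing_dockerfiles` is hypothetical.

```shell
# Print the name of every tool that has no containers/<tool>/Dockerfile
# under the given checkout. Prints nothing when all of them are present.
missing_dockerfiles() {
    local repo_root=$1
    shift
    local tool
    for tool in "$@"; do
        if [ ! -f "$repo_root/containers/$tool/Dockerfile" ]; then
            echo "$tool"
        fi
    done
}
```

Calling `missing_dockerfiles . fastp fastqc megahit spades bbtools` from the root of the cloned "Code" repo lists any tool whose Dockerfile has not been added yet. Once the list is empty, the usual `git add`, `git commit` and `git push` to CodeCommit should trigger the pipeline.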

At least, that's how I managed to build my own customized solution, and I think it was the intended use. I hope this is more or less clear.

spitfiredd commented 3 years ago

@sachalau

First off thank you so much, I believe I am now on the right path!

However, I am now running into an issue where the new Docker containers are not being built, and if I trigger them manually by clicking Start Build in the web UI, I get the following error:

Build failed to start. The following error occurred: ArtifactsOverride must be set when using artifacts type CodePipelines


sachalau commented 3 years ago

I think you can't build the images directly from CodeBuild because the build projects define an artifact that must come from CodePipeline. Can you push a change to your "Code" CodeCommit repo, or release a change in the "Pipe" CodePipeline?

spitfiredd commented 3 years ago

@sachalau - I don't think I am following. I followed the PDF guide and first updated the GenomicsWorkflowPipe repo. I modified main.cfn.yaml as shown above by adding StackBuildContainerSpades, and then added a new section for Spades under the CodePipeline section.

I am not sure what to do next.

sachalau commented 3 years ago

Yep. Now if you go to the "Pipe" CodePipeline, you should see in the build stage the steps for building the Docker images you added. However, as you have not run the pipeline since you added them, they should appear grey, as "did not run". To run this pipeline, you must either push a change to the "Code" repo or click Release change in the UI.
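
The same release can also be started from the AWS CLI instead of the UI. The helper names below are hypothetical, and the pipeline name has to be looked up first because it is generated by the deployment.

```shell
# List the pipeline names in the current account and region.
list_pipelines() {
    aws codepipeline list-pipelines --query 'pipelines[].name' --output text
}

# Start a new execution of the named pipeline (the CLI equivalent of
# clicking "Release change") and print the execution ID.
release_change() {
    local pipeline_name=$1
    aws codepipeline start-pipeline-execution \
        --name "$pipeline_name" \
        --query 'pipelineExecutionId' \
        --output text
}
```

Run `list_pipelines` to find the name, then `release_change <name>`. The build actions that were grey should then run as part of that execution.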
