awslabs / genomics-secondary-analysis-using-aws-step-functions-and-aws-batch

This solution provides a framework for Next Generation Sequencing (NGS) genomics secondary-analysis pipelines using AWS Step Functions and AWS Batch.
https://aws.amazon.com/solutions/implementations/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch/
Apache License 2.0
38 stars, 22 forks

Issues when executing build-s3-dist.sh #1

Closed ury closed 4 years ago

ury commented 4 years ago

Hi, I ran into a couple of issues when running build-s3-dist.sh (on Ubuntu in AWS Cloud9):

pip install -t . crhelper returns an error: DistutilsOptionError: can't combine user with prefix, exec_prefix/home, or install_(plat)base. Following answers on Stack Overflow, I changed the command to pip install --user --install-option="--prefix=" crhelper, which resolved it.

Also, some of the commands in the script produce output which I'm not sure is expected:

copy yaml templates and rename
cp: cannot stat '/home/ubuntu/environment/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch/deployment/*.yaml': No such file or directory
mv: cannot stat '*.yaml': No such file or directory
Updating code source bucket in template with XXXX-pipeline-distribution
sed -i '' -e s/%%BUCKET_NAME%%/XXXX-pipeline-distribution/g /home/ubuntu/environment/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch/deployment/global-s3-assets/*.template
sed: can't read : No such file or directory
sed -i '' -e s/%%SOLUTION_NAME%%/XXXX-Pipeline/g /home/ubuntu/environment/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch/deployment/global-s3-assets/*.template
sed: can't read : No such file or directory
sed -i '' -e s/%%VERSION%%/1.0/g /home/ubuntu/environment/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch/deployment/global-s3-assets/*.template
sed: can't read : No such file or directory

Finally (and maybe related to the above), there's no ./dist folder to upload to S3 when the script is done.

Thanks in advance, Ury

wleepang commented 4 years ago

@ury -

These are a couple of variances caused by differences between Linux distributions and macOS.

The crhelper install doesn't throw a traceback when using Amazon Linux 2 for a Cloud9 instance. Even with the traceback under Ubuntu, the crhelper package still gets installed into the desired target path:

./source/setup/lambda

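You can safely ignore the sed errors. They are caused by the -i '' option in the script, which is required by the version of sed on macOS but not by sed on Linux. GNU sed on Linux treats the empty '' as an input file name, which is why it reports "can't read : No such file or directory" while still editing the template files.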
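For anyone who wants to avoid the sed errors altogether, here is a minimal sketch of an in-place substitution that both GNU sed and the BSD sed on macOS accept. The function name replace_in_place and the sample values are hypothetical and not part of build-s3-dist.sh. It relies on the fact that -i with an attached suffix (-i.bak) is understood by both implementations.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from build-s3-dist.sh.
# Replaces every occurrence of a placeholder in a file, in place.
# Note: pattern and replacement must not contain "/" (the sed delimiter).
replace_in_place() {
    local pattern="$1" replacement="$2" file="$3"
    # -i.bak (suffix attached, no space) works with GNU and BSD sed alike;
    # the backup file is removed once the edit succeeds.
    sed -i.bak -e "s/${pattern}/${replacement}/g" "$file" && rm -f "${file}.bak"
}

# Small demonstration on a temporary file (placeholder values).
tmp="$(mktemp)"
echo 'bucket: %%BUCKET_NAME%%' > "$tmp"
replace_in_place '%%BUCKET_NAME%%' 'my-pipeline-distribution' "$tmp"
cat "$tmp"
rm -f "$tmp"
```

The trade-off is a short-lived .bak file next to each template, in exchange for not having to branch on the operating system.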

All assets get built into two folders:

  1. ./deployment/global-s3-assets: contains the main solution installation template and required asset bundles.
  2. ./deployment/regional-s3-assets: contains only the installation asset bundles and sample data used for the solution.

These match the process that AWS Solutions uses to enable global deployment of solutions.

ury commented 4 years ago

Thanks Lee, I proceeded per your clarifications. There are a couple of instructions in README.md that I wasn't sure about:

aws s3 cp ./dist/ s3://my-bucket-name-/$SOLUTION_NAME/$VERSION/ --recursive --acl bucket-owner-full-control --profile aws-cred-profile-name

./dist - which folder is that? I assumed it refers to the deployment folder and used it when uploading to S3.

And then:

Deploy the solution to your account by launching a new AWS CloudFormation stack using the link of the solution template in Amazon S3.

You clarified that the template resides in the ./deployment/global-s3-assets folder, so I used the link for the template in that folder.

When I create the stack, I get the following error: Error occurred while GetObject. S3 Error Code: NoSuchKey. S3 Error Message: The specified key does not exist. (Service: AWSLambdaInternal; Status Code: 400; Error Code: InvalidParameterValueException; Request ID: 41067856-0c25-4845-99ed-4516f64a0b1d)

In the template, I understand that the following file is expected: [bucket_name]-[region_name]/[solution_name]/[version]/SetupLambdaBundle.zip, and in my case the file resides in [bucket_name]-[region_name]/[solution_name]/[version]/global-s3-assets/SetupLambdaBundle.zip.
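The key mismatch described above comes from uploading the parent deployment folder, which keeps the global-s3-assets folder name in every object key. A minimal dry-run sketch of an upload that avoids this follows. It only prints the aws s3 cp command rather than running it. The function name print_upload_command and all values are placeholders, and the thread does not confirm that this upload alone is enough for a complete customized deployment.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints an upload command, does not execute it.
# Bucket, region, solution name and version are placeholder values.
print_upload_command() {
    local asset_dir="$1" bucket="$2" region="$3" solution="$4" version="$5"
    # The source is the asset folder itself, not its parent, so a file such as
    # SetupLambdaBundle.zip ends up directly under <solution>/<version>/.
    printf 'aws s3 cp %s/ s3://%s-%s/%s/%s/ --recursive\n' \
        "${asset_dir%/}" "$bucket" "$region" "$solution" "$version"
}

print_upload_command ./deployment/global-s3-assets my-bucket-name us-east-1 my-solution 1.0
```

Printing the command first makes it easy to check the destination prefix against the [bucket_name]-[region_name]/[solution_name]/[version]/ layout before anything is copied.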

To match the above, I uploaded only the global-s3-assets folder to S3 and launched the stack creation process again. This time it lasted longer, but failed during the Setup resource creation:

Failed to create resource. Code Build job 'GenomicsWorkflow-Setup:c82b0974-a7a4-4031-a4be-1b343e176964' in project 'GenomicsWorkflow-Setup' exited with a build status of 'FAILED'.

At this point, I think I should wait for your insights regarding the above, as I did at least one thing wrong.

wleepang commented 4 years ago

@ury -

I'm curious about your use case here. Are you attempting to install the solution as is, or have you customized it and want to deploy the customized assets?

ury commented 4 years ago

Hi Lee, At this point, I'm attempting to install it as-is.

wleepang commented 4 years ago

Ok, in that case it's better to use the "Launch" button found on this page: https://aws.amazon.com/solutions/genomics-secondary-analysis-using-aws-step-functions-and-aws-batch/?did=sl_card&trk=sl_card

The assets are already deployed globally, allowing you to install the solution in any supported region.

The README instructions are for deploying a customized version.

wleepang commented 4 years ago

Closing this issue for now. Feel free to re-open if needed.