Closed IvantheDugtrio closed 4 years ago
Hi Ivan,
For the S3 permissions issue, are you talking about the bucket provided as the GenomicsS3Bucket parameter?
If yes, we give the following permissions to the AWS Batch compute environment and job definition. Was there a specific permission you found was missing?
Action:
- "s3:GetBucketLocation"
- "s3:ListBucket"
- "s3:ListBucketVersions"
- "s3:GetObject"
- "s3:GetObjectVersion"
- "s3:PutObject"
- "s3:ListMultipartUploadParts"
- "s3:AbortMultipartUpload"
Resource:
- !Sub "arn:aws:s3:::${GenomicsS3Bucket}"
- !Sub "arn:aws:s3:::${GenomicsS3Bucket}/*"
@partha-edico - can you look at this error when executing d_haul and suggest what might be causing it? The json input file is attached in the first comment.
The Logs Insights output (timestamped 2020-04-09 01:35:52.865, shown newest-first) reconstructs to:

Executing python /root/quickstart/d_haul --mode download --url https://genomic-compute.s3-us-west-2.amazonaws.com/dragen-reference/GRCh37/targeted-regions/DHS-3501Z.GRCh37.roi.bed --path /ephemeral/inputs/
Traceback (most recent call last):
  File "/root/quickstart/dragen_qs.py", line 509, in <module>
    main()
  File "/root/quickstart/dragen_qs.py", line 502, in main
    dragen_job.download_inputs()
  File "/root/quickstart/dragen_qs.py", line 288, in download_inputs
    self.new_args[self.fastq_list_index] = self.input_dir + filename
TypeError: cannot concatenate 'str' and 'list' objects
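The TypeError above comes from adding a Python list to a string: `self.input_dir` is a `str`, but `filename` arrives as a `list`. A minimal sketch of the failure and one possible fix follows; the helper name `resolve_paths` is illustrative, not from the quickstart code.

```python
import os

input_dir = "/ephemeral/inputs/"
filename = ["DHS-3501Z.GRCh37.roi.bed"]  # a list, not a single string

try:
    path = input_dir + filename  # raises TypeError: str + list
except TypeError as exc:
    print("reproduced:", exc)

def resolve_paths(input_dir, filename):
    """Return full path(s) under input_dir for either a str or a list of names."""
    if isinstance(filename, list):
        return [os.path.join(input_dir, name) for name in filename]
    return os.path.join(input_dir, filename)

print(resolve_paths(input_dir, filename))
```

Whether the real fix should join each element or pick one entry depends on how `fastq_list_index` is used downstream; this only shows why the concatenation itself fails.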
@vsnyc Not sure how this was running before, but it looks like a couple of recent changes were missing from the aws master repo.
@partha-edico thanks for the pull request, I'll merge it and run through our tests. Looking at the history, it looks like you may have committed your change to your fork but not issued a pull request against this repo.
As to why it has been working, I had updated the policy in January in commit: https://github.com/aws-quickstart/quickstart-illumina-dragen/commit/e5ad3acbbb2c806554bde4bc39265aca74d0d44c based on an email thread.
Would the code update for the package solve the above issue when using the d_haul script?
@vsnyc sorry for the late reply. Thanks for the S3 permissions help, I was able to resolve that. It looks like I'm still getting that Python error when running:
Executing python /root/quickstart/d_haul --mode download --url https://genomic-compute.s3-us-west-2.amazonaws.com/dragen-reference/GRCh37/targeted-regions/DHS-3501Z.GRCh37.roi.bed --path /ephemeral/inputs/
@IvantheDugtrio The download_s3_object method changed in the last merge commit I made on April 3, commit 36f53ffb33a93b03daf5062eb011fbaee7962748. Could you confirm you're using that version? If not, would it be possible for you to retry with the latest template?
@IvantheDugtrio @vsnyc I think the latest changes could help. But also, please do not put a trailing slash when specifying an S3 path or bucket, i.e. please remove it from "s3://genomic-compute/dragen-reference/GRCh37/" and "s3://genomic-compute/dragen-scratch/Phase-ins/Q275/AWS-F1-040720/". I don't think the QS script currently strips the trailing '/', which could cause problems when running.
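One way the QS script could normalize such inputs is to strip any trailing slash before building paths; a minimal sketch (the function name is illustrative, not from the quickstart code):

```python
def strip_trailing_slash(s3_path):
    """Remove any trailing '/' from an S3 path, e.g. 's3://bucket/prefix/'."""
    return s3_path.rstrip("/")

print(strip_trailing_slash("s3://genomic-compute/dragen-reference/GRCh37/"))
```

With this, "s3://genomic-compute/dragen-reference/GRCh37/" and "s3://genomic-compute/dragen-reference/GRCh37" would resolve identically, so users would not have to remember the convention.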
@partha-edico @vsnyc Thanks for all the help! Do I need to make a new stack in CloudFormation to pull the latest version?
I believe you have to delete the current CF stack and rebuild a new one to be sure, but perhaps Vinod can confirm.
So I made a new stack, but now I can't figure out why none of my jobs move from "Runnable" to "Starting". I made the new stack using the same config as before, with the same VPC and subnets, so I don't understand what changed.
@IvantheDugtrio I wouldn't know specifically what is happening in your case, but in my experience the console sometimes doesn't show enough information when jobs stall. In those cases it helps to query the Batch status with the CLI (https://docs.aws.amazon.com/cli/latest/reference/batch/index.html):
- describe-compute-environments
- describe-job-definitions
- describe-job-queues
- list-jobs
- describe-jobs
I'm going to close this for now as I have to take on other projects.
I'm running into issues running my json in AWS Batch based on the user guide. My first issue was with S3 permissions, and I'm still not sure what is causing it, since I'm using the same AWS account for both S3 and Batch. Setting the bucket and all child objects to public gets it working, but I don't want to run it that way beyond initial testing. I'm hoping this is just a formatting issue with my json.
Thanks, Ivan De Dios
Attachments: logs-insights-results.txt, SS-10436446-MY-022820.json.txt