Illumina / Cyrius

A tool to genotype CYP2D6 with WGS data

Using Cyrius within other tools/workflow management #37

Closed: timsanford closed this issue 9 months ago

timsanford commented 11 months ago

Hello, I am fairly new to coding and I need some help implementing Cyrius in my project. I am currently trying to run Cyrius on the All of Us cloud-based platform, but I am having trouble. All of Us stores all of its WGS CRAM files in a Google bucket. When I copy the CRAMs to my active environment and pass Cyrius the paths to those local copies, I am able to run Cyrius. However, copying CRAMs from the platform takes about 15 minutes per CRAM, so this is not feasible for a larger analysis. When I instead use the paths to the CRAM files in the Google bucket (rather than copying them to the active environment), I receive a message that I do not have permissions, and Cyrius does not run. I reached out to the All of Us data science team, and they recommended tools such as SAMtools and GATK, which work seamlessly with Google bucket files. They also recommended using workflow tools such as dsub, Cromwell, or Nextflow to interact with the Google bucket files. Do you know whether it would be possible to incorporate Cyrius into these other tools or workflows, or whether there is another way to make Cyrius able to interact with the Google bucket files?

Thank you in advance! Tim Sanford

xiao-chen-xc commented 11 months ago

Hi Tim, perhaps you could try using samtools or GATK to extract a small bamlet from each WGS CRAM in the Google bucket to your local environment and then run Cyrius on the bamlets. This might save you some time. This BED file has all the regions needed by Cyrius: https://github.com/Illumina/Cyrius/blob/master/data/CYP2D6_region_38.bed (this one is for hg38; files for other genome builds are also available).
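
As a rough illustration of the bamlet idea, here is a minimal Python sketch that turns the BED regions into samtools region strings and asks `samtools view` to pull only those regions from a remote CRAM. It is not part of Cyrius, and it rests on several assumptions: the helper names (`bed_to_regions`, `build_view_command`, `extract_bamlet`) and the example paths are hypothetical; the samtools build has libcurl/GCS support so that `gs://` URLs work; an access token is exported as `GCS_OAUTH_TOKEN` (requester-pays buckets may also need `GCS_REQUESTER_PAYS_PROJECT`); the `.crai` index sits next to the CRAM; and the BED file uses standard 0-based, half-open coordinates.

```python
import subprocess
from pathlib import Path


def bed_to_regions(bed_text):
    """Convert BED lines to samtools region strings.

    Assumes standard BED coordinates (0-based, half-open), so the start
    is shifted by one to get samtools' 1-based, inclusive regions.
    """
    regions = []
    for line in bed_text.splitlines():
        line = line.strip()
        # Skip blank lines and BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append(f"{chrom}:{int(start) + 1}-{int(end)}")
    return regions


def build_view_command(cram_url, reference_fasta, out_bam, regions):
    """Build a `samtools view` command that writes only the given regions as BAM."""
    return [
        "samtools", "view",
        "-b",                   # write BAM output
        "-T", reference_fasta,  # reference FASTA used to decode the CRAM
        "-o", out_bam,
        cram_url,
        *regions,               # region arguments let samtools use the index
    ]


def extract_bamlet(cram_url, reference_fasta, out_bam, bed_path):
    """Extract the BED regions from a (possibly remote) CRAM into an indexed local BAM."""
    regions = bed_to_regions(Path(bed_path).read_text())
    command = build_view_command(cram_url, reference_fasta, out_bam, regions)
    subprocess.run(command, check=True)
    subprocess.run(["samtools", "index", out_bam], check=True)


# Hypothetical usage (bucket, sample, and reference paths are placeholders):
# extract_bamlet(
#     "gs://some-bucket/sample1.cram",
#     "reference.fa",
#     "sample1.cyp2d6.bam",
#     "CYP2D6_region_38.bed",
# )
```

The regions are passed as command-line arguments rather than through `-L`, since region arguments make samtools use the index for random access instead of streaming the whole file. The resulting local bamlets could then be listed in the manifest given to Cyrius in place of the original CRAM paths.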