Open lgruen opened 3 years ago
To reproduce, ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/illumina_platinum_pedigree/data/CEU/NA12878/alignment/NA12878.alt_bwamem_GRCh38DH.20150706.CEU.illumina_platinum_ped.cram should work.
It may be helpful to mention in the pipelines/broad/reprocessing folder we have a collection of WDLs that take in aligned CRAMs, revert, and then call the single sample genome/exome pipeline. CramToUnmappedBam is available as a top level workflow as well: https://github.com/populationgenomics/warp/tree/main/pipelines/broad/reprocessing/cram_to_unmapped_bams
Also hi, I'm Kylee Degatano, I'm the product manager for the WARP repository and I'm happy to be of help if you ever have questions! My email is kdegatano@broadinstitute.org. We are working on formalizing our contributions guidelines, but we welcome contributions you think may be helpful to the larger community and issues if you find anything wrong or have feature requests!
@kbergin, that's awesome, thanks so much for the tips and pointing to those WDLs! Going to try it out.
And nice to meet you! :)
pipelines/broad/dna_seq/germline/single_sample/wgs/WGSFromBam.wdl
has aninput_cram
parameter, but trying to use a CRAM input results in this error:As a workaround, converting to BAM with samtools first works, but it adds yet another somewhat redundant step before sorting and then converting to fastq.
It's also likely that
Processing.SortSamByName
would currently run out of disk space with a CRAM input:(E.g. a CRAM of 37 GB resulted in a sorted BAM of 117 GB, which wouldn't fit.)