brentp / smoove

structural variant calling and genotyping with existing tools, but, smoothly.
Apache License 2.0

Error reading index file from S3 bucket #196

Open james-guevara opened 2 years ago

james-guevara commented 2 years ago

Hi all,

I'm trying to run this command (from inside a docker container):

smoove call --outdir /work/home/results-smoove-chr1/ --exclude /work/home/exclude.cnvnator_100bp.GRCh38.20170403.bed --name SP0184750 --fasta /work/home/GRCh38_full_analysis_set_plus_decoy_hla.fa -p 1 --excludechroms '~^GL,~^HLA,~_random,~^chrUn,~alt,~decoy,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY,chrM' --genotype s3://<folder>/<filename.cram>

But I get this error:

[smoove] 2022/04/26 23:08:22 [E::idx_test_and_fetch] 
[smoove] 2022/04/26 23:08:22 Error reading "s3://<folder>/<filename.cram>"
[smoove] 2022/04/26 23:08:22 
panic: signal: segmentation fault (core dumped)

goroutine 1 [running]:
github.com/brentp/smoove/svtyper.check(...)
    /home/brentp/go/go/src/github.com/brentp/smoove/svtyper/svtyper.go:33
github.com/brentp/smoove/svtyper.Svtyper(0xbd6ec0, 0xc0000920f0, 0x7fffc62cec07, 0x35, 0xc00002d810, 0x1, 0x1, 0x7fffc62ceb8e, 0x1f, 0x7fffc62cebf5, ...)
    /home/brentp/go/go/src/github.com/brentp/smoove/svtyper/svtyper.go:159 +0x1818
github.com/brentp/smoove/lumpy.Main()
    /home/brentp/go/go/src/github.com/brentp/smoove/lumpy/lumpy.go:347 +0x44f
main.main()
    /home/brentp/go/go/src/github.com/brentp/smoove/cmd/smoove/smoove.go:121 +0x1ce

So it appears that smoove doesn't have a problem reading the alignment file directly, but it can't fetch the corresponding index file (which is also inside the S3 bucket). I'm wondering if that is the problem. If so, is there a practical solution to handle this? (I'm thinking I can download the index file for this CRAM, and I should be able to specify the location of this index file. It looks like this feature was added in the most recent version of mosdepth.)
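A rough sketch of that workaround: work out where the index would conventionally live next to the CRAM, so that both files can be copied into the same local directory before running smoove. The bucket and file names in the usage note are placeholders. The two naming patterns (`sample.cram.crai` and `sample.crai`) are the usual htslib conventions. Whether smoove and svtyper then pick up the local index is an assumption that has not been tested here.

```python
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlparse

# Index suffix conventionally paired with each alignment format.
INDEX_SUFFIX = {".cram": ".crai", ".bam": ".bai"}


def candidate_index_uris(alignment_uri: str) -> List[str]:
    """Return the two conventional index locations for a CRAM/BAM URI."""
    ext = PurePosixPath(urlparse(alignment_uri).path).suffix
    idx = INDEX_SUFFIX.get(ext.lower())
    if idx is None:
        raise ValueError(f"not a CRAM or BAM path: {alignment_uri}")
    stem = alignment_uri[: -len(ext)]
    # Convention 1: append the index suffix (sample.cram.crai).
    # Convention 2: replace the extension (sample.crai).
    return [alignment_uri + idx, stem + idx]


def local_copy_path(uri: str, workdir: str) -> str:
    """Hypothetical helper: where a downloaded object would sit in workdir."""
    return str(PurePosixPath(workdir) / PurePosixPath(urlparse(uri).path).name)
```

For example, `candidate_index_uris("s3://my-bucket/folder/sample.cram")` gives the two index URIs to try downloading, and `local_copy_path(..., "/work/home")` gives the path to save each file under, so the CRAM and its index end up side by side.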

Sincerely, James

james-guevara commented 2 years ago

BTW, if I remove the --genotype parameter, it works. (Presumably this has to do with svtyper, and I'm guessing svtyper can't read directly from the S3 bucket because it doesn't use pysam/samtools.)

brentp commented 2 years ago

Hi, I pushed a change to the Dockerfile for this. If you can build and test it, that would be appreciated. Otherwise, I'll tag a new release in the next week or so.