quinlan-lab / STRling

Detect novel (and reference) STR expansions from short-read data
MIT License

Checksum mismatch in cram decode #76

Closed by nmmsv 3 years ago

nmmsv commented 3 years ago

Hi STRling team, I'm trying to run STRling on some cram files, and I get the following error (this is the last part of the log):

[strling] found 3438 STR-like regions in the genome
[strling] got STR repeats from genome into an interval tree
[strling] collecting str-like reads
[strling] extracting chromosome:chr1
[E::cram_decode_slice] MD5 checksum reference mismatch for ref 0 pos 248714585..248753335
[E::cram_decode_slice] CRAM: 75e67a2b43990fd5c419b4180857f756
[E::cram_decode_slice] Ref : 91fd29daa2e0a9ab4422bfed5a28e7e5
[E::cram_next_slice] Failure to decode slice
bam.nim(439)             extract_main
Error: unhandled exception: hts/bam:error in iteration [ValueError]

and this is the command that I run:

strling extract  \
    sample.cram \
    outs/sample/out \
    -f $REF_GENOME

I checked that I'm using the correct reference genome (samtools view cram -T ref worked fine). Please let me know if I've missed something in my run. Thanks!

hdashnow commented 3 years ago

@brentp have you come across this error?

brentp commented 3 years ago

yes, it usually means the reference genome does not match what was used to create the CRAM file.

brentp commented 3 years ago

can also mean that the CRAM is corrupt, I think.
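One way to test the first possibility (a reference mismatch) is to compare the per-sequence MD5 checksums recorded in the CRAM header's `@SQ` lines (the `M5` tag) against checksums computed from the FASTA. The sketch below is illustrative only and is not part of STRling; the function names are made up for this example. It assumes the header text has already been dumped (for instance with `samtools view -H sample.cram`) and that the header actually carries `M5` tags. Note that the error in the log is a per-slice checksum, while this compares whole-sequence checksums, so it is a coarse first check: a whole-sequence mismatch would be enough to explain the slice failure.

```python
import hashlib


def fasta_m5(lines):
    """Yield (name, md5_hex) for each sequence in an iterable of FASTA lines.

    Follows the M5 convention from the SAM specification: bases are
    upper-cased and any character outside '!'..'~' (whitespace, newlines)
    is dropped before hashing.
    """
    name, digest = None, None
    for line in lines:
        if line.startswith(">"):
            if name is not None:
                yield name, digest.hexdigest()
            parts = line[1:].split()
            name = parts[0] if parts else ""  # sequence name is the first token
            digest = hashlib.md5()
        elif name is not None:
            bases = "".join(c for c in line if "!" <= c <= "~").upper()
            digest.update(bases.encode("ascii"))
    if name is not None:
        yield name, digest.hexdigest()


def header_m5(header_lines):
    """Return {SN: M5} for every @SQ header line that has both tags."""
    out = {}
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        fields = dict(
            f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:] if ":" in f
        )
        if "SN" in fields and "M5" in fields:
            out[fields["SN"]] = fields["M5"]
    return out


def mismatches(fasta_lines, header_lines):
    """List sequence names whose header M5 is absent from or differs in the FASTA."""
    ref = dict(fasta_m5(fasta_lines))
    hdr = header_m5(header_lines)
    return [sn for sn, m5 in hdr.items() if ref.get(sn) != m5]
```

Passing an open FASTA file handle and the header lines to `mismatches` returns the names of any sequences that disagree. An empty list means the whole-sequence checksums match, which would point away from a wrong reference build and toward the other explanation.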

nmmsv commented 3 years ago

Thanks for your response! Hmm, that's odd because I can samtools view the cram files with the same reference fasta file that I gave to strling. I'm trying to run on the 1000 genomes project cram files. Have you had success running on those? If so, perhaps I could just use the reference genome build that you used. Thanks again for your help. Best, Nima

hdashnow commented 3 years ago

Hi Nima, I've run it on the 2504 high depth 1000 genomes files without issue. Warm regards, Harriet

nmmsv commented 3 years ago

I'll try a few more things on my end and reach out again if I still have issues. Thanks!