mrvollger / StainedGlass

Make colorful identity heatmaps of genomic sequence
https://mrvollger.github.io/StainedGlass/
MIT License

samtools sort: failed to change sort order header to 'SO:coordinate' #30

Closed yeeus closed 1 year ago

yeeus commented 1 year ago

When I used StainedGlass yesterday, I got an error in rule aln:

    Error in rule aln:
        jobid: 13
        input: temp/chr2.1000000.10000.ref_0.fasta.mmi, temp/chr2.1000000.1.query.fasta
        output: temp/chr2.1000000.10000.1.ref_0.bam
        log: logs/aln.chr2.1000000.10000.1.ref_0.log (check log file(s) for error details)
        conda-env: /path/stainedglass/chr2/.snakemake/conda/21b3f45a1a39e25e7d843461e7a52e15_
        shell:
            ( minimap2 -t 30 -f 10000 -s 200000 -ax ava-ont --dual=yes --eqx temp/chr2.1000000.10000.ref_0.fasta.mmi temp/chr2.1000000.1.query.fasta | samtools sort -m 4G -o temp/chr2.1000000.10000.1.ref_0.bam ) 2> logs/aln.chr2.1000000.10000.1.ref_0.log
            (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

    Shutting down, this might take some time.
    Exiting because a job execution failed. Look above for error message
    Complete log: .snakemake/log/2023-09-12T003335.513240.snakemake.log

When I looked at logs/aln.chr2.1000000.10000.1.ref_0.log, I found this:

    [M::main::2.058*0.96] loaded/built the index for 243 target sequence(s)
    [M::mm_mapopt_update::2.058*0.96] mid_occ = 10000
    [M::mm_idx_stat] kmer size: 15; skip: 5; is_hpc: 0; #seq: 243
    [M::mm_idx_stat::2.464*0.97] distinct minimizers: 45318463 (70.65% are singletons); average occurrences: 1.832; average spacing: 2.923; total length: 242652660
    [W::sam_hdr_create] Ignored @SQ SN:chr2:195000000-196000000 : bad or missing LN tag
    [E::sam_hrecs_update_hashes] Header includes @SQ line "chr2:195000000-196000000" with no LN: tag
    [E::sam_hrecs_update_hashes] Header includes @SQ line "chr2:195000000-196000000" with no LN: tag
    samtools sort: failed to change sort order header to 'SO:coordinate'
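The log points at an `@SQ` header line with no `LN:` field. A minimal sketch for listing such lines from SAM text, assuming a POSIX shell with awk; `sq_missing_ln` is a hypothetical helper name, not part of StainedGlass or samtools:

```shell
# Hypothetical helper (not part of StainedGlass or samtools):
# print every @SQ header line that has no LN:<integer> field.
# Reads SAM text on stdin and stops at the first non-header line.
sq_missing_ln() {
    awk -F'\t' '
        !/^@/ { exit }                      # header ends at the first alignment record
        $1 == "@SQ" {
            has_ln = 0
            for (i = 2; i <= NF; i++)
                if ($i ~ /^LN:[0-9]+$/) has_ln = 1
            if (!has_ln) print              # report the incomplete @SQ line
        }
    '
}
```

Piping the minimap2 command from the failing job into `sq_missing_ln` should print any affected `@SQ` lines before samtools reads them, which may help tell a malformed header apart from a problem in the sort step.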

My guess is that the failure comes from samtools in this command:

    minimap2 -t 30 -f 10000 -s 200000 -ax ava-ont --dual=yes --eqx temp/chr2.1000000.10000.ref_0.fasta.mmi temp/chr2.1000000.1.query.fasta | samtools sort -m 4G -o temp/chr2.1000000.10000.1.ref_0.bam

I thought samtools sort needs a BAM file, while the input it receives here is SAM.

Here is my code:

    seqkit grep -f <(echo ${chromosome}) -j50 /path/rawdata/CN1/ref/MF2_mat.v0.8.fasta > CN1_mat.${chromosome}.fasta
    samtools faidx CN1_mat.${chromosome}.fasta

    snakemake --use-conda \
        -s /path/software/StainedGlass-main/workflow/Snakefile \
        --configfile=/path/software/StainedGlass-main/config/config.yaml \
        --config sample=${chromosome} \
                 fasta=CN1_mat.${chromosome}.fasta \
                 alnthreads=50 \
        --cores all \
        make_figures

The run used window=1000000.

Best wishes!

laramiemckenna commented 9 months ago

@yeeus -- how did you end up solving this? I am experiencing a similar issue.

yeeus commented 9 months ago

@laramiemckenna I edited the command to pipe through `samtools view -Sb | samtools sort`, and after many attempts it worked. I think you should give it another try.
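A minimal sketch of that change, assuming it is applied to the shell command shown in the error report. `aln_via_bam` and its three positional arguments are hypothetical names for illustration, not identifiers from the StainedGlass Snakefile. In current samtools releases `-S` is accepted but ignored because the input format is detected automatically, so only `-b` is used here. Note that `samtools sort` also accepts SAM on stdin, so this is the workaround reported in this thread rather than a confirmed fix.

```shell
# Hypothetical wrapper around the aln command from the error report.
# $1: minimap2 index (.mmi), $2: query FASTA, $3: output BAM path.
aln_via_bam() {
    # Convert the SAM stream to BAM first, then coordinate-sort it.
    minimap2 -t 30 -f 10000 -s 200000 -ax ava-ont --dual=yes --eqx "$1" "$2" \
        | samtools view -b - \
        | samtools sort -m 4G -o "$3" -
}
```

It could be called as `aln_via_bam temp/chr2.1000000.10000.ref_0.fasta.mmi temp/chr2.1000000.1.query.fasta temp/chr2.1000000.10000.1.ref_0.bam`, reusing the paths from the failing job.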