Closed by yaximik 6 months ago
It turned out that this problem is a long-standing issue:
This broken output is BTW a long-standing mummer bug: see [mummer4/mummer#24](https://github.com/mummer4/mummer/issues/24) and https://github.com/mummer4/mummer/blob/master/src/umd/nucmer_main.cc#L107-L108
Many thanks to John Marshall for helping to find the non-compliant header formatting in the out.sam produced by nucmer. The fix turned out to be easy:
in code
mummer-4.0.0rc1/src/umd/nucmer_main.cc
change
106 if(args.sam_short_given || args.sam_long_given) {
107 os << "@HD VN1.0 SO:unsorted\n"
108 << "@PG ID:nucmer PN:nucmer VN:4.0 CL:\"" << cmdline << "\"\n";
109 } else {
to
106 if(args.sam_short_given || args.sam_long_given) {
107 os << "@HD\tVN:1.0\tSO:unsorted\n"
108 << "@PG\tID:nucmer\tPN:nucmer\tVN:4.0\tCL:" << cmdline << "\n";
109 } else {
Then recompile as recommended. Now out.sam is compatible with all samtools versions.
$ nucmer --maxmatch --threads=4 --sam-short=./Sgor-hum.sam gor-mt.fa rCRS.fa
$ samtools view -H Sgor-hum.sam
@HD VN:1.0 SO:unsorted
@PG ID:nucmer PN:nucmer VN:4.0 CL:"nucmer --maxmatch --threads=4 --sam-short=./Sgor-hum.sam gor-mt.fa rCRS.fa"
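For a SAM file that was already produced by an unpatched nucmer, rerunning a long alignment just to fix two header lines is unattractive, so the header could probably be repaired directly instead. The following is only a sketch, not something from nucmer or samtools. It assumes the broken header is exactly the two space-separated lines written by the original code above, and `demo.sam` and `fixed.sam` are hypothetical file names used for illustration.

```shell
# Build a tiny stand-in for an unpatched nucmer SAM file (hypothetical data).
printf '%s\n' \
  '@HD VN1.0 SO:unsorted' \
  '@PG ID:nucmer PN:nucmer VN:4.0 CL:"nucmer --sam-short=demo.sam ref.fa qry.fa"' \
  > demo.sam
printf 'read1\t0\tchr1\t1\t255\t4M\t*\t0\t0\tACGT\t*\n' >> demo.sam

# Rewrite only the two header lines; alignment records pass through untouched.
awk '
  # Line 1: replace the known-bad @HD line with the TAB-delimited form.
  NR == 1 && $0 == "@HD VN1.0 SO:unsorted" {
    print "@HD\tVN:1.0\tSO:unsorted"
    next
  }
  # Line 2: turn the separators before ID/PN/VN/CL into TABs, but keep
  # the spaces inside the CL: command line as they are.
  NR == 2 && /^@PG / && (cl = index($0, " CL:")) > 0 {
    head = substr($0, 1, cl - 1)
    tail = substr($0, cl + 1)
    gsub(/ /, "\t", head)
    print head "\t" tail
    next
  }
  { print }
' demo.sam > fixed.sam
```

On a real file, replace `demo.sam` with the nucmer output and confirm the result with `samtools view -H` before doing anything else with it.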
I successfully installed mummer-4.0.0rc1 on a Scientific Linux 7.9 box in a conda environment, then ran an alignment of hg38 against the gorGor6 reference assembly:
$ nucmer -t 16 --sam-short=./Gor6-hg38_short.sam gorGor6.fa hg38.fa
After ~8 hours, using ~59 GB of memory, I got a 761.8 MB SAM file. However, further steps with samtools 1.20 failed:
$ samtools view -bT gorGor6.fa Gor6-hg38_short.sam
[main_samview] fail to read the header from "Gor6-hg38_short.sam"
$ samtools view -H Gor6-hg38_short.sam
[main_samview] fail to read the header from "Gor6-hg38_short.sam"
What could be the problem?
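One way to narrow this down is to check whether the header lines themselves are well formed, since samtools reports the failure while reading the header. The sketch below is loosely based on the SAM rule that every header line other than `@CO` consists of TAB-separated `TAG:VALUE` fields. The function name `check_sam_header` and the two sample files are hypothetical, and the pattern is deliberately simpler than the full specification.

```shell
# Two one-line sample headers (hypothetical files): one broken, one valid.
printf '%s\n' '@HD VN1.0 SO:unsorted' > broken_header.sam
printf '@HD\tVN:1.0\tSO:unsorted\n' > good_header.sam

# Report header lines that are not TAB-delimited TAG:VALUE fields.
# Exit status is 0 if the header looks fine and 1 otherwise.
check_sam_header() {
  LC_ALL=C awk '
    /^@CO/ { next }                    # free-text comment lines are exempt
    /^@/ && $0 !~ /^@[A-Z][A-Z](\t[A-Za-z][A-Za-z0-9]:[^\t]+)+$/ {
      printf "line %d is malformed: %s\n", NR, $0
      bad = 1
    }
    !/^@/ { exit }                     # header ends at the first alignment
    END { exit bad }
  ' "$1"
}

check_sam_header good_header.sam && echo "good_header.sam: header looks OK"
check_sam_header broken_header.sam || echo "broken_header.sam: header needs fixing"
```

Because the scan stops at the first alignment record, it should stay fast even on a large file such as Gor6-hg38_short.sam.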