cbg-ethz / haploclique

Viral quasispecies assembly via maximal clique finding. A method to reconstruct viral haplotypes and detect large insertions and deletions from NGS data.
GNU General Public License v3.0

SAMTools command changes cause [bam_sort] failures? #23

Closed daveuu closed 6 years ago

daveuu commented 7 years ago

haplo-clique-assemble seems to complete, generating:

consensus.fasta
statistics.txt
deletions.txt
mean-sd
data_cliques_paired_R1.fastq
data_cliques_paired_R2.fastq
data_cliques_single.fastq
data_clique_to_reads.tsv
singles.prior
alignment.prior
quasispecies.fasta

...but I get the error below during the run, so perhaps some results output is missing? I know that SAMTools command usage changed recently ("The obsolete samtools sort in.bam out.prefix usage has been removed. If you are still using ‑f, ‑o, or out.prefix, convert to use -T PREFIX and/or -o FILE instead. (#295, #349, #356, #418, PR #441; see also discussions in #171, #213.)").

[E::hts_open_format] fail to open file 'single_sort.bam'
samtools view: failed to open "single_sort.bam" for reading: No such file or directory
rm: cannot remove 'single_sort.bam': No such file or directory
parallel: Warning: $HOME not set. Using /tmp
When using programs that use GNU Parallel to process data for publication please cite:

  O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
  ;login: The USENIX Magazine, February 2011:42-47.

This helps funding further development; and it won't cost you a cent.
Or you can get GNU Parallel without this requirement by paying 10000 EUR.

To silence this citation notice run 'parallel --bibtex' once or use '--no-notice'.

STATUS: 200
Cliques/Uniques/CPU time:       3/1/0
...
...
Cliques/Uniques/CPU time:       0/0/0
STATUS: 200
Cliques/Uniques/CPU time:       0/0/1
Max read length:                287 bp
[bam_sort] Use -T PREFIX / -o FILE to specify temporary and final output files
Usage: samtools sort [options...] [in.bam]
Options:
  -l INT     Set compression level, from 0 (uncompressed) to 9 (best)
  -m INT     Set maximum memory per thread; suffix K/M/G recognized [768M]
  -n         Sort by read name
  -o FILE    Write final output to FILE rather than standard output
  -T PREFIX  Write temporary files to PREFIX.nnnn.bam
  -@, --threads INT
             Set number of sorting and compression threads [1]
      --input-fmt-option OPT[=VAL]
               Specify a single input file format option in the form
               of OPTION or OPTION=VALUE
  -O, --output-fmt FORMAT[,OPT[=VAL]]...
               Specify output format (SAM, BAM, CRAM)
      --output-fmt-option OPT[=VAL]
               Specify a single output file format option in the form
               of OPTION or OPTION=VALUE
      --reference FILE
               Reference sequence FASTA FILE [null]
rm: cannot remove 'reads.bam': No such file or directory
[bam_sort] Use -T PREFIX / -o FILE to specify temporary and final output files
Usage: samtools sort [options...] [in.bam]
Options: 
  -l INT     Set compression level, from 0 (uncompressed) to 9 (best)
  -m INT     Set maximum memory per thread; suffix K/M/G recognized [768M]
  -n         Sort by read name
  -o FILE    Write final output to FILE rather than standard output
  -T PREFIX  Write temporary files to PREFIX.nnnn.bam
  -@, --threads INT
             Set number of sorting and compression threads [1]
      --input-fmt-option OPT[=VAL]
               Specify a single input file format option in the form
               of OPTION or OPTION=VALUE
  -O, --output-fmt FORMAT[,OPT[=VAL]]...
               Specify output format (SAM, BAM, CRAM)
      --output-fmt-option OPT[=VAL]
               Specify a single output file format option in the form
               of OPTION or OPTION=VALUE
      --reference FILE
               Reference sequence FASTA FILE [null]
mv: cannot stat 'single_2.bam': No such file or directory
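For reference, the removed syntax maps onto the new flags like this. The helper function below is purely illustrative (it only prints the command it would run), and the file names single.bam / single_sort are hypothetical stand-ins mirroring the single_sort.bam from the log:

```shell
# Hypothetical helper just to show the old-to-new flag mapping.
build_sort_cmd() {
    in_bam=$1; prefix=$2
    # Old (removed): samtools sort "$in_bam" "$prefix"   -> wrote $prefix.bam
    # New: -T names the temporary-file prefix, -o names the final output file.
    echo "samtools sort -T $prefix.tmp -o $prefix.bam $in_bam"
}
build_sort_cmd single.bam single_sort
```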
daveuu commented 7 years ago

Installing the htslib dependencies as shown below seems to fix the problem, but it would be better if haploclique supported an up-to-date htslib.

git clone --branch=develop https://github.com/samtools/htslib.git
cd htslib
git checkout 1.2.1
make
cd ..
git clone --branch=develop https://github.com/samtools/samtools.git
cd samtools
git checkout 1.2
make
cd ..
git clone --branch=develop https://github.com/samtools/bcftools.git
cd bcftools
git checkout 1.2
make
cd ..
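After pinning the versions as above, it may be worth confirming that the pinned build is the one first on PATH before re-running haplo-clique-assemble. A minimal sketch, assuming `samtools --version` reports a first line of the form "samtools 1.2":

```shell
# Returns success only for the pinned release; argument is the first line
# of `samtools --version` output (format is an assumption).
is_pinned_samtools() {
    [ "$1" = "samtools 1.2" ]
}

is_pinned_samtools "$(samtools --version 2>/dev/null | head -n 1)" \
    || echo "warning: samtools 1.2 not first on PATH" >&2
```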
daveuu commented 7 years ago

...but this causes the exception below:

Max read length:                287 bp
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.simontuffs.onejar.Boot.run(Boot.java:340)
        at com.simontuffs.onejar.Boot.main(Boot.java:166)
Caused by: java.lang.IllegalArgumentException: Program record with group id bwa-5057A760 already exists in SAMFileHeader!
        at net.sf.samtools.SAMFileHeader.addProgramRecord(SAMFileHeader.java:197)
        at net.sf.samtools.SAMTextHeaderCodec.parsePGLine(SAMTextHeaderCodec.java:154)
        at net.sf.samtools.SAMTextHeaderCodec.decode(SAMTextHeaderCodec.java:94)
        at net.sf.samtools.BAMFileReader.readHeader(BAMFileReader.java:428)
        at net.sf.samtools.BAMFileReader.<init>(BAMFileReader.java:158)
        at net.sf.samtools.BAMFileReader.<init>(BAMFileReader.java:117)
        at net.sf.samtools.SAMFileReader.init(SAMFileReader.java:545)
        at net.sf.samtools.SAMFileReader.<init>(SAMFileReader.java:170)
        at net.sf.samtools.SAMFileReader.<init>(SAMFileReader.java:125)
        at ch.ethz.bsse.cf.utils.Utils.parseBAM(Utils.java:89)
        at ch.ethz.bsse.cf.Startup.parse(Startup.java:125)
        at ch.ethz.bsse.cf.Startup.doMain(Startup.java:159)
        at ch.ethz.bsse.cf.Startup.main(Startup.java:43)
        ... 6 more
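The exception comes from the duplicate @PG records (two with id bwa-5057A760) in the BAM header. One possible stopgap, not confirmed against haploclique, is to make the IDs unique and re-apply the header with samtools reheader. The sample header below imitates the duplicate records from the stack trace; with a real BAM you would extract it via `samtools view -H reads.bam > header.sam` (file names hypothetical):

```shell
# Sample header imitating the duplicate @PG records from the stack trace.
printf '@PG\tID:bwa-5057A760\tPN:bwa\n@PG\tID:bwa-5057A760\tPN:bwa\n' > header.sam

# Suffix every @PG ID with a running counter; non-@PG lines pass through.
awk 'BEGIN{OFS="\t"} /^@PG/ {for(i=1;i<=NF;i++) if($i ~ /^ID:/) $i=$i"."(++n)} {print}' \
    header.sam > header.fixed.sam

# With a real BAM, re-apply the fixed header:
#   samtools reheader header.fixed.sam reads.bam > reads.fixed.bam
# Caveat: renaming IDs breaks any PP: links between @PG records, so this is
# only a workaround to get past the duplicate-ID check.
cat header.fixed.sam
```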
armintoepfer commented 7 years ago

Hi, we switched to a new version that simplifies everything. Please give it a try; we are still in the process of testing its correctness.