jts / nanopore-paper-analysis

Code for nanopore paper

Makefile target not being generated #6

Open ParsaAkbari opened 8 years ago

ParsaAkbari commented 8 years ago

The following lines in step 5 of the makefile:

# index the draft assembly for bwa
draft_genome.fasta.bwt: raw.reads.np.fasta
        bwa index $<

# index the draft assembly for faidx
draft_genome.fasta.fai: draft_genome.fasta.bwt
       samtools faidx $<

bwa index $< is supposed to produce draft_genome.fasta.bwt, but instead this command produces raw.reads.np.fasta.bwt. This then causes an error because draft_genome.fasta.bwt is not present for the following target.

jts commented 8 years ago

Hi @ParsaAkbari,

It looks OK to me? https://github.com/jts/nanopore-paper-analysis/blob/master/full-pipeline.make#L193

Jared

ParsaAkbari commented 8 years ago

I am not an expert on makefiles, so please correct me if I am wrong. This is my understanding so far:

The rule in the makefile that I am referring to is shown below. It is important because, according to the layout of a makefile rule [3], it is what creates the file draft_genome.fasta.bwt:

draft_genome.fasta.bwt: raw.reads.np.fasta
     bwa index $<

In a makefile, $< expands to the 'first prerequisite' of the target, which in this case is 'raw.reads.np.fasta'. Therefore bwa index $< expands to the following command, which is what make actually runs: bwa index raw.reads.np.fasta [1].
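The expansion of $< can be checked in isolation with a throwaway makefile. The sketch below is only an illustration and is not part of the pipeline: the function name demo_first_prereq and the file names out.bwt, first.fasta and second.fasta are hypothetical, and it assumes make is installed.

```shell
# Build a throwaway makefile whose recipe only echoes its automatic variables.
# All file names here are hypothetical and unrelated to the real pipeline.
demo_first_prereq() {
    tmpdir=$(mktemp -d) || return 1
    # printf turns \t into the literal tab that make requires before a recipe line.
    printf 'out.bwt: first.fasta second.fasta\n\t@echo "target=$@ first_prereq=$<"\n' > "$tmpdir/Makefile"
    # The prerequisites must exist, otherwise make has no way to build them.
    touch "$tmpdir/first.fasta" "$tmpdir/second.fasta"
    # -s keeps make from printing anything except the recipe's own output.
    (cd "$tmpdir" && make -s out.bwt)
    rm -rf "$tmpdir"
}
```

Calling demo_first_prereq should print target=out.bwt first_prereq=first.fasta: $< names only the first prerequisite, whatever the target is called.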

The issue here is that the command 'bwa index raw.reads.np.fasta' will produce the following files [2]:

raw.reads.np.fasta.bwt
raw.reads.np.fasta.pac
raw.reads.np.fasta.ann
raw.reads.np.fasta.amb
raw.reads.np.fasta.sa
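To see which prefix the index files were actually written under, a small check along these lines may help. It is a sketch that assumes the five suffixes listed above; the function name missing_bwa_index is hypothetical.

```shell
# Print every bwa index file that is missing for the given FASTA prefix.
# Prints nothing when all five index files are present.
missing_bwa_index() {
    prefix=$1
    for ext in bwt pac ann amb sa; do
        [ -e "$prefix.$ext" ] || echo "$prefix.$ext"
    done
}
```

For example, missing_bwa_index draft_genome.fasta lists the index files that make would still consider absent, while missing_bwa_index raw.reads.np.fasta shows whether the index ended up under the reads prefix instead.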

So the file draft_genome.fasta.bwt has not been created, and the next rule in the makefile, which builds draft_genome.fasta.fai and requires draft_genome.fasta.bwt, is not able to run:

draft_genome.fasta.fai: draft_genome.fasta.bwt
        samtools faidx $<

Therefore a link in the pipeline is broken, the following steps cannot run, and the polished genome is not produced. Perhaps something is behaving differently on my system? I can't quite figure out what the issue would be, as I have been diligent about using the same versions of bwa and other software as those specified in the makefile.

[1] http://www.gnu.org/software/make/manual/make.html#Recipes_002fSearch
[2] If you want to test this, run bwa index test.fa with a downsampled fasta file in a test directory, you will see the same general pattern of files produced.
[3] http://www.gnu.org/software/make/manual/make.html#Rule-Introduction

jts commented 8 years ago

What I meant is that the code that I linked to in step 5 looks correct (the prerequisite is draft_genome.fasta):

# index the draft assembly for bwa
draft_genome.fasta.bwt: draft_genome.fasta
    bwa index $<

Can you link me to the exact line on the github version of the file that you think is causing problems?

ParsaAkbari commented 8 years ago

Sure. Actually, line 193 seems to have been changed in my local version, so I probably caused this error myself somehow (how embarrassing, haha). But I am still confused about an error on line 203, and about a few other bits and pieces that I noticed but managed to fix. Perhaps these could be patched on the GitHub version. Let me know if I am missing something (rather likely).

https://github.com/jts/nanopore-paper-analysis/blob/master/full-pipeline.make#L104 needs the --no-check-certificate flag.

https://github.com/jts/nanopore-paper-analysis/blob/master/full-pipeline.make#L166 and https://github.com/jts/nanopore-paper-analysis/blob/master/full-pipeline.make#L167 should use raw.reads.corrected.fasta rather than raw.reads.corrected.corrected.fasta.

https://github.com/jts/nanopore-paper-analysis/blob/master/full-pipeline.make#L170 The file path to the fastqToCA binary is missing; it should be ./wgs-8.2/Linux-amd64/bin/fastqToCA. The same applies to runCA: https://github.com/jts/nanopore-paper-analysis/blob/master/full-pipeline.make#L170

https://github.com/jts/nanopore-paper-analysis/blob/master/full-pipeline.make#L203 I get the following error:

  bwa mem -t 4 -x ont2d draft_genome.fasta raw.reads.np.fasta | samtools view -Sb - | samtools sort -f - reads_to_draft.sorted.bam
  sort: invalid option -- 'f'
  Usage: samtools sort [options...] [in.bam]
  Options:
    -l INT     Set compression level, from 0 (uncompressed) to 9 (best)
    -m INT     Set maximum memory per thread; suffix K/M/G recognized [768M]
    -n         Sort by read name
    -o FILE    Write final output to FILE rather than standard output
    -T PREFIX  Write temporary files to PREFIX.nnnn.bam
    -@, --threads INT
               Set number of sorting and compression threads [1]
        --input-fmt-option OPT[=VAL]
                 Specify a single input file format option in the form
                 of OPTION or OPTION=VALUE
    -O, --output-fmt FORMAT[,OPT[=VAL]]...
                 Specify output format (SAM, BAM, CRAM)
        --output-fmt-option OPT[=VAL]
                 Specify a single output file format option in the form
                 of OPTION or OPTION=VALUE
        --reference FILE
                 Reference sequence FASTA FILE [null]

jts commented 8 years ago

Regarding reads.corrected.corrected.fasta: the Makefile is correct; we perform two rounds of error correction.

Regarding the path to CA: we set the PATH environment variable so the full path does not need to be given:

https://github.com/jts/nanopore-paper-analysis/blob/master/full-pipeline.make#L13

The samtools sort error is probably also related to the path: if you run the command outside of the Makefile, your PATH will not be updated to include the latest samtools (which we download), so the -f flag is not recognized.
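When running a pipeline command by hand, it can help to check which executable a tool name resolves to before and after adjusting PATH. The following is a minimal sketch; the helper names prepend_path and resolve_tool are hypothetical, and the directory passed to prepend_path is an assumption that has to match wherever the pipeline actually unpacked its tools.

```shell
# Put a directory at the front of PATH for the current shell session,
# so executables found there take precedence over system-wide ones.
prepend_path() {
    PATH="$1:$PATH"
    export PATH
}

# Print the full path that a tool name currently resolves to.
# Returns non-zero (and prints nothing) if the tool is not on PATH.
resolve_tool() {
    command -v "$1"
}
```

For instance, resolve_tool samtools shows which samtools a manually typed command would use; after prepend_path with the directory containing the downloaded copy, the same call should report that copy instead.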