vpc-ccg / pamir

Discovery and Genotyping of Novel Sequence Insertions in Many Sequenced Individuals
BSD 3-Clause "New" or "Revised" License

Pamir hanging at mrsfast_anchor_wg_map #50

Closed Krannich479 closed 4 years ago

Krannich479 commented 4 years ago

Hi, I am using your latest pamir commit 148e2cffbdc9d241129830950466d234ab727735, and the Snakemake pipeline failed with an error in rule mrsfast_anchor_wg_map:

Error: Cannot Open the file <path/to/reference>/chr21_ins.fa.index

The folder where I provided the reference (chr21) contains the chromosome's FASTA, and it seems that pamir created a FAI (FASTA index), but there is no ".index" file. Can you see what is going on here?

Also, since the fix for issue https://github.com/vpc-ccg/pamir/issues/47, I have noticed many warnings about inconsistent paths in the Snakemake rules, such as:

path /<workdir>//analysis/<project-name>/001-pamir-remove-concordants/samplename17/samplename17.stat contains double '/'. This is likely unintended. It can also lead to inconsistent results of the file-matching approach used by Snakemake.

I am not sure whether this is related to the downstream failure of mrsfast_anchor_wg_map.

My config.yaml looks like:

path:
    /<workdir>/
raw-data:
    raw-data-links
reference:
    /<path/to/reference>/chr21_ins.fa
population:
    project-name
input:
 "samplename1":
  - S0001.bam
[...]
 "samplename50":
  - S0050.bam

f0t1h commented 4 years ago

Hi, early in development we decided not to index the provided reference in the Snakemake pipeline, because doing so was likely to overwrite an existing index. Can you try indexing the reference by running

mrsfast --index genome.fa

I thought I had resolved the double "/" errors. I will look into it.

Krannich479 commented 4 years ago

Okay, I sent a pull request https://github.com/vpc-ccg/pamir/pull/53 that successfully handled the missing reference.fa.index.
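
For readers who hit the same error, here is a minimal sketch of a guard that builds the mrsFAST index only when it is missing. It is not the content of the pull request. The function name ensure_mrsfast_index and its run parameter are hypothetical; the only things taken from this thread are the mrsfast --index command and the ".index" suffix from the error message.

```python
import os
import subprocess


def ensure_mrsfast_index(reference, run=subprocess.run):
    """Build the mrsFAST index for `reference` unless it already exists.

    Returns True if indexing was started, False if an index was found.
    The index location (<reference>.index) is assumed from the error
    message in this issue.
    """
    index_path = reference + ".index"
    if os.path.exists(index_path):
        # Leave an existing index untouched so it is never overwritten.
        return False
    # Same command as suggested above: mrsfast --index genome.fa
    run(["mrsfast", "--index", reference], check=True)
    return True
```

Calling ensure_mrsfast_index("/<path/to/reference>/chr21_ins.fa") before the mapping step would leave an existing index alone, which is in line with the concern about overwriting one.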


The other matter was the relative paths: if path in config.yaml does not end with a "/", the Snakemake pipeline concatenates paths such as path+reference without the "/" delimiter, which puts the output in the wrong directories. The quick solution is to end path in config.yaml with a "/".
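
A more robust alternative to relying on a trailing "/" is to join path components explicitly. The sketch below is an assumption about how this could be done, not pamir's actual code; join_under is a hypothetical helper. It treats every later component as relative to the base, so it yields the same result whether or not path ends with "/", and it also avoids the double "/" that Snakemake warns about.

```python
import posixpath


def join_under(base, *parts):
    """Join `parts` below `base` with exactly one "/" between components."""
    # Strip leading slashes so each part is treated as relative to base;
    # posixpath.join would otherwise discard base when a part is absolute.
    relative = [p.lstrip("/") for p in parts]
    # normpath collapses repeated slashes and drops a trailing one.
    return posixpath.normpath(posixpath.join(base, *relative))


# Both spellings of the base give the same joined path.
for base in ("/workdir", "/workdir/"):
    print(join_under(base, "analysis", "project-name"))
```

Note the design choice: an absolute value (such as the reference in the config above) should not be passed through this helper, because it would be re-rooted under the base.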