luntergroup / octopus

Bayesian haplotype-based mutation calling
MIT License

Index file is not recognized #91

Closed maxulysse closed 4 years ago

maxulysse commented 4 years ago

Describe the bug: When using symbolic links, the index file is not recognized. Symbolic links to the fasta and index files are in the same folder, but the files they point to live in different folders, and octopus seems to read the link targets and not the links themselves.

Error message:

[2019-10-29 15:58:07] <INFO> ------------------------------------------------------------------------
[2019-10-29 15:58:07] <INFO> octopus v0.5.2-beta
[2019-10-29 15:58:07] <INFO> Copyright (c) 2015-2018 University of Oxford
[2019-10-29 15:58:07] <INFO> ------------------------------------------------------------------------
[2019-10-29 15:58:07] <EROR> A user error has occurred:
[2019-10-29 15:58:07] <EROR> 
[2019-10-29 15:58:07] <EROR>     No associated index file could be found for the fasta file
[2019-10-29 15:58:07] <EROR>     "/home/maxime/workspace/nf-core_sarek/work/stage/72/dba06837df24af3a54c7e8c1e6fa69/human_g1k_v37_decoy.small.fasta".
[2019-10-29 15:58:07] <EROR> 
[2019-10-29 15:58:07] <EROR> To help resolve this error ensure that a valid fasta index (.fai) exists
[2019-10-29 15:58:07] <EROR> in the same directory as the given fasta file. You can make one with the
[2019-10-29 15:58:07] <EROR> 'samtools faidx' command.
[2019-10-29 15:58:07] <INFO> ------------------------------------------------------------------------

Command line used to run octopus:

$ octopus -R human_g1k_v37_decoy.small.fasta -I 9876T.recal.bam -C cancer

Additional context

$ pwd
/home/maxime/workspace/nf-core_sarek/work/f8/b00dc4a60a7d90a46fd65187590df9

$ ls -l
lrwxrwxrwx 1 maxime maxime  91 Oct 29 16:58 9876T.recal.bai -> /home/maxime/workspace/nf-core_sarek/work/82/debcec5fbc4ccc0fbb92986cf3a137/9876T.recal.bai
lrwxrwxrwx 1 maxime maxime  91 Oct 29 16:58 9876T.recal.bam -> /home/maxime/workspace/nf-core_sarek/work/82/debcec5fbc4ccc0fbb92986cf3a137/9876T.recal.bam
lrwxrwxrwx 1 maxime maxime 113 Oct 29 16:58 human_g1k_v37_decoy.small.fasta -> /home/maxime/workspace/nf-core_sarek/work/stage/72/dba06837df24af3a54c7e8c1e6fa69/human_g1k_v37_decoy.small.fasta
lrwxrwxrwx 1 maxime maxime 111 Oct 29 16:58 human_g1k_v37_decoy.small.fasta.fai -> /home/maxime/workspace/nf-core_sarek/work/39/f1e261725c40756c0bcb17e4f241a6/human_g1k_v37_decoy.small.fasta.fai

dancooke commented 4 years ago

Octopus resolves symbolic links, and since the resolved index file lives in a different directory to the reference file, you get the error. I'm unsure whether or not to change this behaviour. I've never seen a reference index file appear in a different directory to the reference file. What is your reason for this setup?
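To make the behaviour described above concrete, here is a minimal Python sketch of the lookup rule. It is not octopus's actual code (octopus is written in C++); `find_fasta_index`, `make_demo_layout`, and the `ref.fasta` file names are hypothetical and exist only for illustration. The sketch assumes the index is expected at `<fasta>.fai` in the same directory as the fasta path being used, as the error message suggests.

```python
import os
import tempfile
from pathlib import Path


def find_fasta_index(fasta, resolve_symlinks):
    """Return the .fai path expected next to the fasta, or None if absent.

    Hypothetical helper: illustrates the lookup rule, not octopus's real code.
    """
    fasta = Path(fasta)
    if resolve_symlinks:
        # Follow the link first, so the index is searched next to the target.
        fasta = fasta.resolve()
    index = fasta.with_name(fasta.name + ".fai")
    return index if index.exists() else None


def make_demo_layout(root):
    """Recreate the reported layout: links side by side, targets apart."""
    root = Path(root)
    stage, idx, work = root / "stage", root / "idx", root / "work"
    for directory in (stage, idx, work):
        directory.mkdir()
    (stage / "ref.fasta").write_text(">chr1\nACGT\n")
    (idx / "ref.fasta.fai").write_text("chr1\t4\t6\t4\t5\n")
    os.symlink(stage / "ref.fasta", work / "ref.fasta")
    os.symlink(idx / "ref.fasta.fai", work / "ref.fasta.fai")
    return work / "ref.fasta"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        link = make_demo_layout(tmp)
        print("resolving links:", find_fasta_index(link, resolve_symlinks=True))
        print("keeping links:  ", find_fasta_index(link, resolve_symlinks=False))
```

With links resolved, the search happens in the fasta target's directory, where no index exists. With links kept, the index link sitting next to the fasta link is found.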

maxulysse commented 4 years ago

I do have a totally legitimate reason for that, actually. Most of the time the fasta and its index are in the same place, but sometimes I only have the fasta, so I automatically generate the index. As that is done within a Nextflow process, where the source file is a link, the index is created in a different folder.
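A pipeline could detect this situation before launching a tool that resolves symlinks. The following is a rough Python sketch of such a pre-flight check, not something Nextflow or octopus provides: `index_target_is_colocated` is a hypothetical helper, and it assumes the index is named `<fasta>.fai` and is linked next to the fasta link.

```python
import os
import tempfile
from pathlib import Path


def index_target_is_colocated(fasta_link):
    """True if the resolved .fai shares a directory with the resolved fasta.

    Hypothetical pre-flight check; assumes the index is named <fasta>.fai
    and sits (possibly as a link) next to the fasta link.
    """
    fasta_link = Path(fasta_link)
    index_link = fasta_link.with_name(fasta_link.name + ".fai")
    if not index_link.exists():
        return False
    return fasta_link.resolve().parent == index_link.resolve().parent


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        stage, idx, work = root / "stage", root / "idx", root / "work"
        for directory in (stage, idx, work):
            directory.mkdir()
        (stage / "ref.fasta").write_text(">chr1\nACGT\n")
        # Index generated by a separate process, so it lands elsewhere.
        (idx / "ref.fasta.fai").write_text("chr1\t4\t6\t4\t5\n")
        os.symlink(stage / "ref.fasta", work / "ref.fasta")
        os.symlink(idx / "ref.fasta.fai", work / "ref.fasta.fai")
        if not index_target_is_colocated(work / "ref.fasta"):
            print("index target is in a different directory than the fasta target")
```

If the check fails, the index would not be found by any tool that resolves the fasta link before looking for `<fasta>.fai`.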

dancooke commented 4 years ago

Fair enough. This is a simple enough change. I'm just wondering if there are any reasons against not resolving symlinks (e.g. any performance implications). I can add a command line option either way.

maxulysse commented 4 years ago

Any solution would be fantastic, as long as I can use it with an install from bioconda of course ;-) But I can probably help with that issue.

dancooke commented 4 years ago

The default behaviour for resolving symlinks was changed in 5f1990882c7da31b9c3b664e265aad0e0f501153. Use the new option --resolve-symlinks to get the previous behaviour. I believe this resolves the issue, but please re-open if not.