NBISweden / GenErode

GitHub repository for GenErode, a Snakemake pipeline for the analysis of whole-genome sequencing data from historical and modern samples to study patterns of genome erosion.
GNU General Public License v3.0

Repeatmasker fails when attempting to zip the *cat file #45

Closed by verku 1 year ago

verku commented 1 year ago

Submitted via email:

The main issue seems to be the following error: gzip: reference_genomic.upper.fasta.cat: No such file or directory

...

For example, when I run repeatmasker, it creates a folder with the path ../GenErode/reference_genomes/repeatmasker/reference_genomic/reference_genomic.upper.fasta.preSatMar42206472023.RMoutput, and the reference_genomic.upper.fasta.cat file is found in that folder. The job then fails saying the ".cat" file does not exist.

verku commented 1 year ago

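This is probably caused by using a relative path to the reference genome in config.yaml. With a relative path, the variables inserted into the shell command produce the following code, which cannot be executed. Note that gzip receives only the bare file name, while the existence check uses the full relative path: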

# Check if *.cat file is compressed or uncompressed
if [ ! -f reference_genomes/repeatmasker/reference_genomic/reference_genomic.upper.fasta.cat.gz ]
then
  gzip reference_genomic.upper.fasta.cat
fi

Users can work around this by providing the full path to the reference genome in config.yaml. A better long-term fix would be to update the code so that relative paths no longer trigger this error.
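
One possible shape for such a code update is sketched below. It is not the actual GenErode fix. The function name `compress_cat` and the variables `cat_dir` and `cat_name` are hypothetical. The idea is only that the existence check and the gzip call should use the same directory prefix, so the result does not depend on the working directory:

```shell
# Hypothetical helper, not taken from the GenErode code base.
# $1: directory that holds the RepeatMasker output (assumed name: cat_dir)
# $2: name of the *.cat file, e.g. reference_genomic.upper.fasta.cat (assumed name: cat_name)
compress_cat() {
  cat_dir="$1"
  cat_name="$2"
  # Check if *.cat file is compressed or uncompressed,
  # using the same path for the test and for gzip
  if [ ! -f "${cat_dir}/${cat_name}.gz" ] && [ -f "${cat_dir}/${cat_name}" ]
  then
    gzip "${cat_dir}/${cat_name}"
  fi
}
```

Called as `compress_cat reference_genomes/repeatmasker/reference_genomic reference_genomic.upper.fasta.cat`, this would compress the file only if it exists uncompressed in that directory, and would do nothing if the `.gz` file is already there.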

verku commented 1 year ago

The issue was solved by providing the full path to the reference genome.