hoelzer-lab / ribap

A comprehensive bacterial core gene-set annotation pipeline based on Roary and pairwise ILPs
GNU General Public License v3.0

Prokka fails with input genomes containing a dot ('.') in their file names #44

Closed klamkiew closed 1 year ago

klamkiew commented 1 year ago
Chlamydia_muridarum_strain_Nigg3_full_genome.fna
Chlamydia_muridarum_str._Nigg3_CMUT3-5_strain_Nigg3_CMUT3-5_full_genome.fna
Chlamydia_muridarum_str._Nigg_strain_Nigg_full_genome.fna
Chlamydia_muridarum_str._Nigg_2_MCR_strain_Nigg_2_MCR_full_genome.fna
Chlamydia_muridarum_str._Nigg_CM972_strain_CM972_full_genome.fna

Renaming the files and FASTA records works fine. However, the Prokka output directory name is cut off at 'str.', the '.' is removed, and we end up with Chlamydia_muridarum_str_RENAMED/ several times, which leads to the following error:

Error executing process > 'roary (5)'

Caused by:
  Process `roary` input file name collision -- There are multiple input files for each of the following file names: Chlamydia_muridarum_str_RENAMED.gff

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

klamkiew commented 1 year ago

Already working on it, btw: I changed the rename.sh script slightly so that it removes dots from file names. Currently testing whether it works :D
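
The idea of stripping dots from file names can be sketched as follows. This is only an illustration in Python, not the actual change: rename.sh is a shell script whose contents are not shown in this thread, and the helper name `strip_inner_dots` is hypothetical. The sketch assumes that the final extension (here `.fna`) should be kept and that every other dot is simply dropped.

```python
from pathlib import PurePosixPath


def strip_inner_dots(filename):
    """Remove every dot except the one that starts the final extension.

    Hypothetical helper for illustration only; the real rename.sh
    may handle dots differently.
    """
    path = PurePosixPath(filename)
    # path.stem is the name without the last extension, path.suffix is ".fna"
    return path.stem.replace(".", "") + path.suffix


cleaned = strip_inner_dots(
    "Chlamydia_muridarum_str._Nigg_strain_Nigg_full_genome.fna"
)
print(cleaned)
```

With this approach the strain-specific part of the name survives, so names that differ only after 'str.' stay distinct.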

klamkiew commented 1 year ago

Ah, the error comes from this part of ribap.nf:

 else if (params.fasta) { fasta_input_ch = Channel
    .fromPath( params.fasta, checkIfExists: true)
    .map { file -> tuple(file.simpleName, file) }

because simpleName cuts the file name at the first '.' ... jeez.

klamkiew commented 1 year ago

@hoelzer I changed the line above to:

.map { file -> tuple(file.baseName, file) }

If someone inputs ugly file names, the output names will be ugly as well, but it works :)
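
To see why the switch from simpleName to baseName resolves the collision, the two behaviours can be emulated on the file names from this issue. The sketch below is plain Python, not Nextflow: `simple_name` and `base_name` are hypothetical stand-ins that assume simpleName keeps the text before the first dot and baseName drops only the last extension, as described in this thread.

```python
from collections import Counter
from pathlib import PurePosixPath


def simple_name(path):
    """Stand-in for simpleName: keep the text before the first dot."""
    return PurePosixPath(path).name.split(".", 1)[0]


def base_name(path):
    """Stand-in for baseName: drop only the last extension."""
    return PurePosixPath(path).stem


def collisions(names):
    """Return the names that occur more than once, sorted."""
    return sorted(name for name, count in Counter(names).items() if count > 1)


genomes = [
    "Chlamydia_muridarum_strain_Nigg3_full_genome.fna",
    "Chlamydia_muridarum_str._Nigg3_CMUT3-5_strain_Nigg3_CMUT3-5_full_genome.fna",
    "Chlamydia_muridarum_str._Nigg_strain_Nigg_full_genome.fna",
    "Chlamydia_muridarum_str._Nigg_2_MCR_strain_Nigg_2_MCR_full_genome.fna",
    "Chlamydia_muridarum_str._Nigg_CM972_strain_CM972_full_genome.fna",
]

simple = [simple_name(g) for g in genomes]
base = [base_name(g) for g in genomes]

print("colliding simple names:", collisions(simple))
print("colliding base names:", collisions(base))
```

Under these assumptions, four of the five genomes collapse to the same simple name, which matches the input file name collision reported by roary, while the base names stay unique but still contain the dot from 'str.'.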