KatyBrown / CIAlign

MIT License
File naming for the output #49

Open dgslos opened 1 year ago

dgslos commented 1 year ago

The output file naming is currently based on a 'stem'. I suggest that, by default, it would be better to add a suffix to the name of the input file, and to use a different name only if the user provides one. This has several advantages: 1) The user can run CIAlign in a loop with the same config file and in the same directory for multiple alignments. 2) There is less chance of overwriting previous results. 3) It's easier to see which alignment was processed with CIAlign.

KatyBrown commented 1 year ago

Hi, Thanks very much for using CIAlign.

I'm not sure I completely understand your request. Currently the outfile_stem parameter (which defaults to CIAlign) provides a prefix for all output files. The command line parameters take precedence over the config file, so by adding an outfile_stem in the command line it is possible to run CIAlign in a loop. _cleaned.fasta should also be appended to the name of all output files. Do you mean a random suffix on each iteration?

Thanks, sorry for not understanding! Katy
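
For reference, the loop described in this reply could be sketched roughly as below. This only builds one command per alignment, passing a per-file outfile_stem on the command line so that it overrides the shared config file. The helper name build_commands and the file names are hypothetical, and the --infile and --inifile flag spellings are assumptions that should be checked against the CIAlign documentation; only the outfile_stem parameter is named in the thread.

```python
from pathlib import Path


def build_commands(alignments, config="config.ini"):
    """Build one CIAlign command (as an argv list) per input alignment.

    Hypothetical helper: the flag spellings for the input file and the
    config file are assumptions, not confirmed by this thread.
    """
    commands = []
    for path in alignments:
        # Path.stem drops only the final extension: "b.x.fasta" -> "b.x"
        stem = Path(path).stem
        commands.append([
            "CIAlign",
            "--infile", str(path),
            "--inifile", config,
            "--outfile_stem", stem,  # command line overrides the config file
        ])
    return commands


cmds = build_commands(["a.fasta", "b.x.fasta"])
for cmd in cmds:
    print(" ".join(cmd))
```

Each list could then be passed to subprocess.run, so the same config file is reused while every alignment gets its own output prefix.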

dgslos commented 1 year ago

No, this is not what I meant. It's much easier to put it in a loop as follows:

  1. Take the input filename as the basis for the stem, for example file.x.fasta
    Default behavior without specifying stem:
    infile = "file.x.fasta"
    stem = infile.rsplit('.', 1)[0] + '_CIAlign'  # strip only the last extension

    The stem would become "file.x_CIAlign", which is meaningful since you would know which file was processed with CIAlign. This is easier in a loop because otherwise you need to pass the stem as an argument for each file in the loop.
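
A minimal sketch of this proposed default, assuming the stem is derived by dropping only the last extension of the input file name and appending a suffix. The function name default_stem and its suffix parameter are hypothetical and not part of CIAlign. Note that splitting on the first dot would give "file" rather than "file.x", so the sketch uses pathlib's stem instead.

```python
from pathlib import Path


def default_stem(infile, suffix="_CIAlign"):
    """Derive an output stem from the input file name (hypothetical helper).

    Only the final extension is removed, so "file.x.fasta" keeps its ".x".
    Any directory part of the path is preserved.
    """
    path = Path(infile.strip())  # tolerate stray whitespace around the name
    return str(path.with_name(path.stem + suffix))


stem = default_stem("file.x.fasta")
print(stem)

# Splitting on every dot and taking the first piece loses the ".x" part:
first_piece = "file.x.fasta".split(".", -1)[0]
print(first_piece)
```

With a default like this, a loop over many alignments would need no per-file stem argument, while an explicitly supplied outfile_stem could still take precedence.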