I think this belongs in the Python wrapper rather than the Snakemake routine: it generates a number of files that isn't known in advance, and dynamic rules can be enough of a pain in Snakemake that I don't think it's worth it.
Basically, take the top hits from the microcins CSV file and write them to a file named "microcin.{basename(sample)}.pep", with each header looking like ">contig|start:stop:strand".
The sequences can be sorted by bit score (highest first) so that users can see which is the best hit. Also, if any of these are not included in the three signal matches upstream or downstream of CvaB, those can be put in a separate file, "signalMatch_nearCvaB.pep" or something.
I'm imagining this would require a bit of merging with the nr_csv file and the all_hits file as well as the signalMatch...
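
A minimal sketch of the file-writing step, to make the target format concrete. The column names (`contig`, `start`, `stop`, `strand`, `bitscore`, `sequence`), the function names, and the `top_n` parameter are all assumptions for illustration; the real microcins CSV may be laid out differently.

```python
import csv
import os


def format_header(row):
    """Build a '>contig|start:stop:strand' FASTA header from one CSV row."""
    return f">{row['contig']}|{row['start']}:{row['stop']}:{row['strand']}"


def write_top_hits(csv_path, sample, out_dir=".", top_n=None):
    """Write hits to microcin.<basename(sample)>.pep, best bit score first.

    Column names are hypothetical; adjust them to the actual CSV.
    """
    with open(csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))

    # Highest bit score first, so the best hit is the first record in the file.
    rows.sort(key=lambda r: float(r["bitscore"]), reverse=True)
    if top_n is not None:
        rows = rows[:top_n]

    # Literal basename(sample): directories are stripped, any extension is kept.
    out_path = os.path.join(out_dir, f"microcin.{os.path.basename(sample)}.pep")
    with open(out_path, "w") as out:
        for row in rows:
            out.write(f"{format_header(row)}\n{row['sequence']}\n")
    return out_path
```

Calling `write_top_hits("hits.csv", "data/sampleA")` would write `microcin.sampleA.pep` in the current directory. The merging with the nr_csv, all_hits, and signalMatch files is left out here, since their layouts aren't described.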