If the traits file contains only a subset of the genomes in the Roary file, Scoary currently exits with a KeyError.
If you want to run Scoary on just a subset of the genomes that you ran Roary on (for example, because you are missing phenotypic data for some isolates), there are currently two ways of handling this:
Using --restrict_to, pointing to a CSV file which lists only the genomes you want to include.
Editing the Roary file by column-wise deletion of the genomes you don't have in your traits file. (Scoary doesn't use the summary statistics in the first columns of the Roary file, so this will not affect the analysis.)
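The column-wise deletion in the second workaround can be scripted rather than done by hand. The following is a minimal sketch, not part of Scoary: the function name `drop_missing_genomes` and the parameter `n_meta_cols` are hypothetical, and the default of 14 leading non-genome columns is an assumption about the Roary gene_presence_absence.csv layout that you should check against your own file.

```python
import csv
import io


def drop_missing_genomes(roary_csv_text, traits_genomes, n_meta_cols=14):
    """Return Roary-style CSV text without genome columns absent from the traits file.

    n_meta_cols is the number of leading non-genome columns to always keep
    (assumed to be 14 here; verify against the header of your Roary file).
    """
    traits_genomes = set(traits_genomes)
    reader = csv.reader(io.StringIO(roary_csv_text))
    header = next(reader)

    # Keep every leading metadata column, plus genome columns named in the traits file.
    keep = [
        i for i, name in enumerate(header)
        if i < n_meta_cols or name in traits_genomes
    ]

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([header[i] for i in keep])
    for row in reader:
        writer.writerow([row[i] for i in keep])
    return out.getvalue()
```

To use it, read the Roary file into a string, pass the genome names from your traits file, and write the returned text to a new file that you then give to Scoary.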
Plan:
Scoary should not throw a KeyError. Implement a formal check that the genome names in the two files are identical. Then, if one file contains fewer genomes, analyze only the shared subset, but issue a warning.
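One way the planned check could look is sketched below. This is an illustration under assumptions, not Scoary's actual implementation: the function name `shared_genomes` is hypothetical, and it assumes the genome names have already been read from the Roary header and the traits file into two sequences.

```python
import warnings


def shared_genomes(roary_genomes, traits_genomes):
    """Return genomes present in both files, in Roary order, warning on any mismatch."""
    roary = list(roary_genomes)
    traits = set(traits_genomes)

    shared = [g for g in roary if g in traits]
    if not shared:
        # Nothing to analyze: fail with a clear message rather than a KeyError later.
        raise ValueError("No genome names are shared between the Roary and traits files.")

    only_roary = sorted(set(roary) - traits)
    only_traits = sorted(traits - set(roary))
    if only_roary:
        warnings.warn(
            "Genomes in the Roary file but not in the traits file will be ignored: "
            + ", ".join(only_roary)
        )
    if only_traits:
        warnings.warn(
            "Genomes in the traits file but not in the Roary file will be ignored: "
            + ", ".join(only_traits)
        )
    return shared
```

For example, `shared_genomes(["g1", "g2", "g3"], ["g3", "g1"])` returns `["g1", "g3"]` and warns that `g2` will be ignored. The analysis would then be restricted to the returned names.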