phac-nml / staramr

Scans genome contigs against the ResFinder, PlasmidFinder, and PointFinder databases.

Add support for multiple types of biological data files #52

Open apetkau opened 5 years ago

apetkau commented 5 years ago

It would be nice if staramr could support multiple types of input files (such as GenBank) and also compressed versions of each of these files (e.g., gzipped FASTA). As an example, see the description of input for Abricate.

Conversion between different formats can likely use BioPython's SeqIO functionality.

Detection of file formats should also not depend on the extension (e.g., .fasta for fasta, .gz for gzipped) since this tool is integrated into Galaxy, which internally names all input files as .dat. Ideally, the file contents should be used to detect the type of file passed to staramr instead of the extension.
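
To make the content-based detection idea concrete, here is a rough sketch using only the Python standard library. It is not staramr code: `open_maybe_gzipped` and `sniff_format` are hypothetical names, and the approach assumes that checking the gzip magic bytes plus the first non-blank line (`>` for FASTA, `LOCUS` for GenBank) is enough to tell the formats apart.

```python
import gzip

# The first two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def open_maybe_gzipped(path):
    """Open `path` as text, decompressing if the content (not the name) is gzip."""
    with open(path, "rb") as raw:
        magic = raw.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rt")
    return open(path, "rt")


def sniff_format(path):
    """Return "fasta", "genbank", or None, judged from the first non-blank line.

    Hypothetical helper: the file extension is never consulted, so a Galaxy
    input named `something.dat` is handled the same as `something.fasta`.
    """
    with open_maybe_gzipped(path) as handle:
        for line in handle:
            if not line.strip():
                continue  # skip leading blank lines
            if line.startswith(">"):
                return "fasta"
            if line.startswith("LOCUS"):
                return "genbank"
            return None  # first real line matches neither format
    return None  # empty file
```

The returned strings match the format names BioPython's SeqIO uses, so the result could presumably be handed to SeqIO for the conversion step mentioned above; that wiring is left out here.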

apetkau commented 5 years ago

We can leave this for a later release (it's not as high a priority), so I'm switching this back to unassigned.