andrewjpage / tiptoft

Predict plasmids from uncorrected long read data
GNU General Public License v3.0

Usage question #25

Closed conte1 closed 5 years ago

conte1 commented 5 years ago

I realize this tool is designed for detecting plasmids, but I'm wondering if it could be modified for more general purposes, such as detecting which samples contain particular distinct sequences. There is a surprising lack of tools that do this using a k-mer approach.

I'm assuming it would not be as simple as providing these distinct sequences to the "--plasmid_data" parameter?

andrewjpage commented 5 years ago

Yes, it should work with any FASTA file passed via the --plasmid_data parameter. It parses each sequence header, which looks something like "rep1.1_repE(pAMbeta1)_AF007787", and extracts information to make the output pretty, but you can of course repurpose it for your own needs. Let me know how you get along. I've used the same fundamental method in Krocus (https://github.com/andrewjpage/krocus) for calling 7 MLST genes.
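
For anyone preparing a custom FASTA file for this, here is a minimal sketch of building headers that follow the same shape as the example above. It is not tiptoft's own parser: the regular expression, the field names (`name`, `gene`, `label`, `accession`) and the helper functions are all assumptions made for illustration, inferred only from the single example header in this thread. Which of these fields tiptoft actually requires would need to be checked against its source.

```python
import re

# Assumed shape, inferred from "rep1.1_repE(pAMbeta1)_AF007787":
#   <name>_<gene>(<label>)_<accession>
# The field names are hypothetical labels, not tiptoft terminology.
HEADER_PATTERN = re.compile(
    r"^(?P<name>[^_]+)_(?P<gene>[^()]+)\((?P<label>[^()]*)\)_(?P<accession>.+)$"
)


def split_header(header):
    """Split a header of the assumed shape into its parts, or return None."""
    match = HEADER_PATTERN.match(header.lstrip(">").strip())
    return match.groupdict() if match else None


def make_header(name, gene, label, accession):
    """Build a header of the assumed shape for a custom sequence."""
    return f"{name}_{gene}({label})_{accession}"


def to_fasta(records):
    """Render (header, sequence) pairs as FASTA text."""
    return "".join(f">{header}\n{sequence}\n" for header, sequence in records)


# Hypothetical custom marker sequence, for illustration only.
custom_header = make_header("marker1", "geneX", "sampleA", "ACC0001")
fasta_text = to_fasta([(custom_header, "ACGTACGTACGT")])
```

The resulting text could then be written to a file and supplied through `--plasmid_data`. Round-tripping each new header through `split_header` before running is a cheap way to catch stray underscores or parentheses in custom names.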