Closed Liqueurdefehling closed 1 year ago
Hi @Liqueurdefehling ,
Though this might sound confusing at first, it is actually the expected behavior. Platon was designed to classify draft contigs and thus extract plasmid-borne contigs. To do so, one can adjust sensitivity/specificity by running Platon in either sensitivity, accuracy or specificity mode via the --mode parameter.
In addition to the normal operation described above, one can also use Platon to characterize (NOT classify) all plasmids via --characterize.
In that context, the above behavior is expected since in characterization mode, Platon executes the full characterization pipeline which is why all contigs are handled as plasmid-borne. I agree that in this case the output might be misleading and this might deserve a little bit of improvement.
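To make the difference between the two ways of running Platon concrete, here is a minimal Python sketch that only assembles the command line as an argument list (it does not run Platon). It uses the options mentioned in this thread (--db, --output, --mode, --characterize); the function name build_platon_command and the example paths are hypothetical, and leaving out --mode when --characterize is set is an assumption made for clarity, not documented behavior.

```python
from typing import List

# The three classification modes named in this thread.
ALLOWED_MODES = {"sensitivity", "accuracy", "specificity"}


def build_platon_command(genome: str, db: str, output: str,
                         mode: str = "accuracy",
                         characterize: bool = False) -> List[str]:
    """Assemble a Platon command line as an argument list (nothing is executed).

    Hypothetical helper: paths and defaults are placeholders for illustration.
    """
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = ["platon", "--db", db, "--output", output]
    if characterize:
        # Characterization only: every contig is described, none is classified.
        cmd.append("--characterize")
    else:
        # Normal operation: classify contigs as plasmid-borne or chromosomal.
        cmd += ["--mode", mode]
    cmd.append(genome)
    return cmd


classify_cmd = build_platon_command("genome.fasta", "db/", "results/")
characterize_cmd = build_platon_command("genome.fasta", "db/", "results/",
                                        characterize=True)
print(" ".join(classify_cmd))
print(" ".join(characterize_cmd))
```

The resulting list could be passed to subprocess.run if Platon is installed; only the first form yields an actual plasmid/chromosome prediction.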
Thank you for the explanation, it makes sense now. I had similar results using the --characterize option. All the contigs were written into the <prefix>.plasmid.fasta file while the <prefix>.chromosome.fasta file was empty. Can this be changed so that the plasmids that had hits are automatically written into a file for further use?
Great tool anyway, thanks a lot!
G
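Until something like this is built in, one possible workaround is to post-filter the characterization table. The sketch below assumes the table is tab-separated with a header row containing the columns ID and # Plasmid Hits (the field names used in this thread); the function name and the example rows are made up for illustration and are not real Platon output.

```python
import csv
import io
from typing import List, TextIO


def contigs_with_plasmid_hits(tsv: TextIO, min_hits: int = 1) -> List[str]:
    """Return the IDs of contigs with at least `min_hits` plasmid hits.

    Assumes a tab-separated table with 'ID' and '# Plasmid Hits' columns.
    """
    reader = csv.DictReader(tsv, delimiter="\t")
    ids = []
    for row in reader:
        if int(row["# Plasmid Hits"]) >= min_hits:
            ids.append(row["ID"])
    return ids


# Made-up example table, only to show the expected shape of the input.
example = (
    "ID\tRDS\t# Plasmid Hits\n"
    "contig_a\t-3.2\t0\n"
    "contig_b\t1.4\t1\n"
    "contig_c\t0.9\t2\n"
)
hits = contigs_with_plasmid_hits(io.StringIO(example))
print(hits)
```

The returned IDs could then be used to pull the matching sequences out of the FASTA file. Note that a hit against a reference plasmid is not a plasmid prediction in itself.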
I have tried to compare the two outputs with (bottom, second cat) and without (top, first cat) the --characterize option, and I am not sure how to interpret the result. Why are the 2 contigs NODE_5 and NODE_11, which were included in the <prefix>.plasmid.fasta, not marked as having any plasmid hits, even when using the --characterize option? Thanks a lot. G
Hi @Gian77 ,
the --characterize option simply conducts all characterization tasks without filtering contigs or making any plasmid/chromosome prediction. It's just a convenience option to characterize all contigs.
If you'd like to predict plasmid-borne contigs, then you should use Platon in the default mode w/o --characterize. In your example, NODE_5 and NODE_11 are predicted to be plasmid-borne.
Hey @oschwengers,
thanks for the explanation, very useful. I am still confused, though, about what the # Plasmid Hits field means in the --characterize mode of Platon. I have several contigs that have 1 in that field in characterize mode; shouldn't they match what is predicted in the default mode?
Thanks much! Gian
Hi @Gian77 , well, it depends. Sure, a small contig can have a BLAST+ hit against a reference plasmid. But this might also be a small part of a mobile element or a fragment thereof, for example an IS, a transposon or even just a transposase. To filter out these potential false positives, Platon screens for contigs with a sufficiently high RDS. Only after this initial screening step are the remaining contigs characterized. This also speeds up the entire process significantly.
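The screen-then-characterize order described above can be sketched as follows. This is a simplified illustration, not Platon's actual implementation: the function name, the threshold value and the RDS values are hypothetical placeholders.

```python
from typing import Dict, List


def contigs_to_characterize(rds_by_contig: Dict[str, float],
                            rds_threshold: float,
                            characterize_all: bool = False) -> List[str]:
    """Select the contigs that go on to the (expensive) characterization step.

    Default mode: only contigs whose RDS reaches the threshold are kept.
    Characterization mode (characterize_all=True): screening is skipped.
    """
    if characterize_all:
        return list(rds_by_contig)
    return [contig for contig, rds in rds_by_contig.items()
            if rds >= rds_threshold]


# Made-up RDS values and threshold, for illustration only.
rds = {"contig_a": -5.0, "contig_b": 0.8, "contig_c": -1.2}
screened = contigs_to_characterize(rds, rds_threshold=0.0)
everything = contigs_to_characterize(rds, rds_threshold=0.0,
                                     characterize_all=True)
print(screened)
print(everything)
```

In this toy example, a low-RDS contig such as contig_a is still characterized when screening is skipped, so it can show a plasmid hit there while being dropped before characterization in the default mode.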
Hello @oschwengers , thanks for the explanation. So this means that the characterization may not be 100% correct for the reasons you mention above, while the default mode is correct since it is performed after the screening for possible false positives. In the end I should trust the default mode results, correct? Thanks a lot, Gian
Well, not exactly. The characterization is correct in terms of the descriptions. This step does not classify by any means, it merely provides all information on all contigs.
For an actual classification (chromosome/plasmid), you should use Platon in the default (accuracy) mode.
OK @oschwengers, I will look into the manual. I think I did not specify accuracy mode when I ran it. Thanks a lot,
Gian
Hi, I am testing Platon 1.6 on the E. coli chromosome (accession number CP027572.1) as well as on the bacterial chromosomes CP045233.1 and CP011509.1.
platon [-c] --db /env/ig/biobank/by-soft/platon/1.6/db/ --output …/test_ecoli_c/ --verbose …/ecoli.fasta
There is no output when running in accuracy mode. When launched in -c mode, I get a table with one row, the ID being the sequence ID and the RDS being negative, and the chromosome.fasta file is empty whereas the sequence is in the plasmid.fasta file. The same thing happens when I try an input file containing both chromosome and plasmid sequences: all sequences end up in the plasmid.fasta file. Any idea what I might be missing? Best regards