smaegol / PlasFlow

Software for prediction of plasmid sequences in metagenomic assemblies
GNU General Public License v3.0

Viruses? #19

Open tenguzame opened 5 years ago

tenguzame commented 5 years ago

Hello, I did some test runs with the software and noticed that it cannot discriminate between prokaryotic and viral sequences. As a result, it often classifies viral sequences as plasmids. It would be nice to be able to train it to exclude viruses.

smaegol commented 5 years ago

Hi, this is likely to occur, as phages were not filtered from the training dataset and they can share some features with plasmids. We comment on that in the article. I am currently working on a version that includes viral sequences in the training data, but due to other projects I'm involved in, it may take a long time.

If you want to classify viral and plasmid sequences simultaneously, maybe PPR-Meta is an option: https://github.com/zhenchengfang/PPR-Meta