Closed asierFernandezP closed 1 year ago
Thanks for the awesome questions!
For annotating phages, my recommendation would be to use a low identity threshold (e.g., --id 30), as phage proteins often show poor conservation at the amino acid level. Doing this, you should be able to find contigs with more than one phage hit. You could additionally filter contigs to remove those with hits to other element types as described above. The metadata linked here might be of help in this. See the pipeline homepage for more details.
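As a rough illustration of that filtering idea, here is a minimal Python sketch. It assumes you have already reduced the annotation output to (contig, element type) pairs. The function name, the "phage" and "plasmid" labels, and the input layout are hypothetical and are not taken from the mobileOG-db scripts, so adapt them to the actual column names in your output.

```python
from collections import defaultdict


def select_phage_contigs(hits, min_phage_hits=2, phage_label="phage"):
    """Return contigs with enough phage hits and no hits to other element types.

    hits: iterable of (contig_id, element_type) pairs (hypothetical layout).
    """
    phage_counts = defaultdict(int)
    contigs_with_other_hits = set()

    for contig_id, element_type in hits:
        if element_type == phage_label:
            phage_counts[contig_id] += 1
        else:
            # Any hit that is not labeled as phage flags the contig for removal.
            contigs_with_other_hits.add(contig_id)

    return sorted(
        contig_id
        for contig_id, count in phage_counts.items()
        if count >= min_phage_hits and contig_id not in contigs_with_other_hits
    )


# Made-up example input, for illustration only.
example_hits = [
    ("contig_1", "phage"),
    ("contig_1", "phage"),
    ("contig_2", "phage"),
    ("contig_3", "phage"),
    ("contig_3", "phage"),
    ("contig_3", "plasmid"),
]
kept = select_phage_contigs(example_hits)
print(kept)
```

With this made-up input, only contig_1 is kept: contig_2 has a single phage hit and contig_3 also carries a non-phage hit. The threshold of two phage hits simply mirrors the "more than one phage hit" suggestion above and is not a validated cutoff.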
Please do let me know if you have follow-up questions, or if we can provide any additional scripting. This will help us make these kinds of analyses more accessible.
Connor
Thanks for your help!
Hello, thanks for this amazing and useful tool! I just wanted to say that I wish these links were on the main page, the UsageGuidance.md.
More info on the headers can be found here: https://fralinlifesci.vt.edu/content/dam/fralinlifesci_vt_edu/ciwars/mobileOG-db_UserGuidance_v1.6.pdf
Thanks again for your help!
Hi,
Thanks for the great database and the scripts provided.
I have a few general questions about the output files and recommendations on how to interpret them:
First, would you recommend using the full database or only the version containing manually curated + homologue sequences? Is the classification of the remaining proteins (keyword data) reliable? I have tried your tool on a few contigs using the curated + homology DB, and some proteins are given NA as the mobileOG Category (even some manually curated sequences). Why does this happen?
Also, I would like a more detailed explanation of the output files, e.g. contig_file_summary.csv: I do not completely understand the output of this file. I would expect each row to correspond to one contig (although contig names are not displayed), but in my case there are fewer rows than contigs.
From a list of potential phage/viral contigs, I am interested in determining which of these contigs could be plasmids or other mobile elements so that I can discard them, as I want to keep only phage sequences. Which annotations (and how many) should be present in a contig to confidently classify it as a mobile element or as a plasmid?
Thank you, Asier