Open ShailNair opened 2 years ago
Hi Shail
Can you send me a snapshot of your clusters.tsv file? And how many samples do you have? I suspect there might be a problem with the naming of your contigs.
The framework is designed to use the Sample-IDs from the header of contigs to keep track of where each viral-bin is from.
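To illustrate the idea, here is a minimal sketch of pulling a sample ID out of a contig header. It assumes headers of the form `<sample><separator><contig>`; the separator `C`, the example headers, and the function name `split_sample_id` are illustrative assumptions, not PHAMB's actual code.

```python
def split_sample_id(header, sep="C"):
    """Split a contig header into (sample_id, contig_id) at the first `sep`.

    Returns (None, header) when no sample prefix can be found, which is
    what happens with headers that carry no sample ID.
    """
    sample, found, contig = header.partition(sep)
    if not found or not sample or not contig:
        return None, header
    return sample, contig


# Hypothetical multi-sample header: sample "S1", contig "123".
print(split_sample_id("S1C123"))

# Header without a sample prefix: no sample ID is recovered.
print(split_sample_id("c_000000123"))
```

Note that splitting at the first separator breaks if the sample ID itself contains the separator character, so the separator has to be chosen with the naming scheme in mind.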
best, Joachim
@joacjo I used a single co-assembled contigs file for binning. Here is a snapshot of the clusters.tsv file and the PHAMB-generated .fna file.
The clusters.tsv file has 1101140 records.
I followed the How to Run - not in parallel - quick and dirty tutorial.
Thank you.
Hi Shail
Ah, I see. The names of the entries in vamb_bins.fna match the VAMB cluster names. Remember that the bins in the .fna file are concatenations of the VAMB cluster sequences.
Example: In your clusters.tsv you might have a cluster with multiple contigs:
cluster	contig
99999	c_000000123
99999	c_000000321
If this cluster is predicted to be putatively viral, the resulting name in the .fna file will be: 99999
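As a rough sketch of that naming behaviour (not the actual PHAMB code), the snippet below groups contigs by cluster and writes one concatenated record per cluster, named after the cluster ID. The function names `group_contigs` and `concat_bins`, the toy sequences, the header row, and the plain end-to-end concatenation are all assumptions made for illustration.

```python
from collections import defaultdict


def group_contigs(cluster_lines):
    """Map each cluster ID to the list of its contig IDs, in file order."""
    clusters = defaultdict(list)
    for line in cluster_lines:
        fields = line.split()
        # Skip blank or malformed lines and an optional header row.
        if len(fields) != 2 or fields[0] == "cluster":
            continue
        cluster_id, contig_id = fields
        clusters[cluster_id].append(contig_id)
    return dict(clusters)


def concat_bins(clusters, contig_seqs):
    """Join the sequences of each cluster into one record per cluster."""
    return {
        cluster_id: "".join(contig_seqs[c] for c in contigs)
        for cluster_id, contigs in clusters.items()
    }


# Rows as in the clusters.tsv example above.
rows = ["cluster\tcontig", "99999\tc_000000123", "99999\tc_000000321"]

# Toy sequences, invented purely for this illustration.
seqs = {"c_000000123": "ACGT", "c_000000321": "GGCC"}

clusters = group_contigs(rows)
bins = concat_bins(clusters, seqs)

# The FASTA header is the cluster ID, not any of the contig names.
for name, seq in bins.items():
    print(f">{name}\n{seq}")
```

The `clusters` mapping is also what lets you go back from a bin name such as 99999 to the original contig names.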
Does this make sense?
Best, Joachim
Thanks, that makes sense, and thank you for this very helpful tool. With PHAMB we could extract three times as many complete viral contigs (as per CheckV's rule) as with VirSorter2, DeepVirFinder and viralVerify.
Hi,
My assembled contigs have headers as
which matches with the VAMB bin headers. But when I run PHAMB, I get bin headers as :
How can I get the PHAMB contig headers in the initial VAMB bin header format?