Closed nick-youngblut closed 3 years ago
Thank you for your suggestion. I will make PStrain support metaphlan3 in the future. I will collect comments and update a new version as soon as possible.
Thank you for making this tool open source. Just checking in any update to support metaphlan3 database yet?
Additionaly, my query is related to prior database that is provided with this tool. I assume (as per the supplement section 1.2) it is specifically generated from the marker sequences from metaphlan2 database right?
I have reformatted metaphlan3 database (for species of my interest) to make it compatible with PStrain, but can I use default prior database with the output? If not can you provide some guidance/ instruction on how to prepare it (I am only focusing on very few bacteria species so could be easy to prepare on my end).
Thank you
Hi,
Thank you for your concern.
Up to now, I have not updated PStrain to support metaphlan3 yet. Sorry for the delay.
For the prior database, it is indeed generated for the marker genes from metaphlan2 database. However, you can use it directly. It will assume the prior beta is 0.5 if not finding information from the prior database. The performance is good enough without the prior database except for the gene with too low depth.
Also, you can generate the prior database by yourself. You can follow these steps:
locus A T C G gi|378760316|ref|NZ_AGDG01000038.1|:8694-9629|Bacteroides_faecis_517 1 0 0 0 gi|410936685|ref|NZ_AJRF02000012.1|:54814-55320|Enterococcus_faecium_299 0 1 0 0 gi|139439993|ref|NZ_AAVN02000016.1|:7311-8633|Collinsella_aerofaciens_783 0.0102827763496144 0.0205655526992288 0 0.969151670951157 gi|239633764|ref|NZ_GG670286.1|:165060-166319|Enterococcus_gallinarum_626 0 0 0 1 gi|345651640|ref|NZ_JH114383.1|:276898-279936|Bacteroides_vulgatus_2725 0 0 0 1 GeneID:16385027|Lactococcus_phage_jm2_95 1 0 0 0 gi|488673303|ref|NZ_KB850950.1|:c879018-878179|Clostridium_hathewayi_511 0 0 0 1
import pickle popu={} f=open('prior_beta.txt','r') for line in f: break for line in f: line=line.strip() array=line.split() atcg=array[1:] popu[array[0]]=atcg with open('prior_beta.pickle', 'wb') as f: pickle.dump(popu, f, pickle.HIGHEST_PROTOCOL)
Please let me know if it helps.
Best, Shuai
Thank you for your prompt reply and useful instruction. That was helpful. Best,
Are there any plans to support metaphlan3 databases?
Sorry for the delay, now PStrain supports metaphlan3. You can use PStrain_V30.py to do this.
Are there any plans to support metaphlan3 databases?