HIT-ImmunologyLab / PHISDetector


Custom microbial genomes #1

Open Thexiyang opened 3 years ago

Thexiyang commented 3 years ago

Thanks for the nice tool! Is it possible to run this tool against custom microbial genomes?

gancao commented 3 years ago

@Thexiyang Thanks for using the tool! Do you mean predicting hosts among microbial bacterial genomes? In PHISDetector, the database is built in. It is composed of complete bacterial genomes and shotgun-sequenced genomes, including microbial bacterial genomes. We should consider providing documentation so that users can build a personal database later. Thanks for your advice!

Thexiyang commented 3 years ago

Thanks for the reply. That is exactly what I am looking for. I have both virome and host datasets sequenced from the same environment, and I think using them together will increase the accuracy of matching hosts and viruses. I hope this can be done very soon!

gancao commented 3 years ago

Building the bacterial database requires many dependencies and software packages. The main process can be summarized as follows:

(1) CRISPRCasFinder, PILER-CR, and CRT for CRISPR detection
(2) DBSCAN-SWA and Phage_Finder for prophage detection
(3) diamond and blast to compare phage sequences with bacterial sequences
(4) WiSH and VirHostMatcher to calculate oligonucleotide similarity
(5) optionally, annotate protein domains using rpsblast with the CDD database downloaded from NCBI

After the above predictions, the CRISPR and prophage sequences are used to build protein or nucleotide databases. Because of this complexity, we provide a pre-built database that was constructed in our Linux environment. If you are only interested in a few phages and hosts, you can try our web server (http://www.microbiome-bigdata.com/PHISDetector/index/predict).
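As a rough illustration of step (3) only, the sketch below assembles the NCBI BLAST+ commands for building a nucleotide database from local host genomes and searching phage contigs against it. It is not PHISDetector's own build pipeline: the function names, file names, and the e-value cutoff are hypothetical, and treating the host genomes as the database with the phage contigs as the query is an assumption.

```python
import shutil
import subprocess


def blast_commands(host_fasta, phage_fasta, db_prefix, hits_tsv, evalue=1e-5):
    """Return two BLAST+ command lines as argv lists. Nothing is executed here.

    All paths and the e-value cutoff are illustrative assumptions.
    """
    # makeblastdb: build a nucleotide database from the local host genomes.
    make_db = [
        "makeblastdb",
        "-in", str(host_fasta),
        "-dbtype", "nucl",
        "-out", str(db_prefix),
    ]
    # blastn: search phage contigs against that database, tabular output.
    search = [
        "blastn",
        "-query", str(phage_fasta),
        "-db", str(db_prefix),
        "-outfmt", "6",
        "-evalue", str(evalue),
        "-out", str(hits_tsv),
    ]
    return [make_db, search]


def run_if_available(commands):
    """Run the commands in order, or skip all of them if a tool is missing."""
    missing = sorted({cmd[0] for cmd in commands if shutil.which(cmd[0]) is None})
    if missing:
        print("Not found on PATH, skipping:", ", ".join(missing))
        return False
    for cmd in commands:
        subprocess.run(cmd, check=True)
    return True


if __name__ == "__main__":
    # Hypothetical file names; replace with your own data.
    for cmd in blast_commands("my_hosts.fna", "my_phages.fna",
                              "my_hosts_db", "phage_vs_hosts.tsv"):
        print(" ".join(cmd))
```

Calling `run_if_available(blast_commands(...))` would execute both steps when BLAST+ is installed. The other steps (CRISPR detection, prophage detection, oligonucleotide similarity, domain annotation) each need their own tools and are not covered by this sketch.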

Thexiyang commented 3 years ago

Thanks! I saw that '3.Evaluate the interaction between the query phage and bacterium' fits this aim, but it does not scale to large datasets. Do you have an established pipeline that combines the steps above for easy database construction? I think NCBI does not contain all host genomes, so it would also be helpful to check against local data.

gancao commented 3 years ago

I am sorry that we cannot yet provide a well-established pipeline. For now, users need to install the software listed in my previous reply. I will improve this and let you know.


Thexiyang commented 3 years ago

Any update on how to build a personal database? Thanks!