linnabrown / run_dbcan

Run_dbcan V4, using genomes/metagenomes/proteomes of any assembled organisms (prokaryotes, fungi, plants, animals, viruses) to search for CAZymes.
http://bcb.unl.edu/dbCAN2
GNU General Public License v3.0

How to summarize dbCAN3 results #172

Open NishatTamana51 opened 2 months ago

NishatTamana51 commented 2 months ago

I have run dbCAN3 on whole-proteome data for my fungal species (and for some other strains of the same species) using the HMMER, dbCAN_sub, and DIAMOND tools. Following the dbCAN3 recommendation, I kept only the CAZymes predicted by at least two of these tools. I want to compare the strains by reporting the number of CAZyme families found in each one and interpreting the differences. How can I do that? Specifically, how do I count the CAZyme families?

linnabrown commented 2 months ago

I have not calculated the abundance of CAZyme families before, so this is my first guess. Please correct me if I am wrong, @yinlabniu. Our overview results provide family domain annotations for each protein sequence. If you provide multiple sequences from one genome of a species, just count the annotations that share the same family name. If you care about subfamilies, count at the subfamily level. If not, you can just count at the family level.
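
The counting described above could be sketched roughly as follows. This is only an illustration, not part of run_dbcan. It assumes the overview file is tab-separated, has a tool count column named `#ofTools`, and writes annotations in a form like `GH5_2(23-300)+CBM1(400-430)`, where an underscore suffix marks a subfamily. The function names, the column names, and the `SAMPLE` rows are all hypothetical, so check them against your own output before using this.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical rows that mimic the assumed layout of an overview table.
SAMPLE = (
    "Gene ID\tEC#\tHMMER\tdbCAN_sub\tDIAMOND\t#ofTools\n"
    "g1\t-\tGH5_2(23-300)+CBM1(400-430)\tGH5_e1\tGH5_2+CBM1\t3\n"
    "g2\t-\tGH5(10-280)\t-\tGH5\t2\n"
    "g3\t-\tAA9(1-200)\t-\t-\t1\n"
)


def parse_families(cell, keep_subfamily=False):
    """Split one annotation cell into a list of family (or subfamily) names."""
    families = []
    for token in cell.split("+"):
        # Drop the '(start-end)' position range, if present.
        token = re.sub(r"\(.*?\)", "", token).strip()
        if not token or token == "-":
            continue  # no annotation from this tool
        if not keep_subfamily:
            token = token.split("_")[0]  # 'GH5_2' -> 'GH5'
        families.append(token)
    return families


def count_families(handle, column="HMMER", tools_column="#ofTools",
                   min_tools=2, keep_subfamily=False):
    """Count family names in one column, keeping rows with >= min_tools tools."""
    counts = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        if int(row[tools_column]) < min_tools:
            continue
        counts.update(parse_families(row[column], keep_subfamily))
    return counts


if __name__ == "__main__":
    family_counts = count_families(io.StringIO(SAMPLE))
    for family, n in sorted(family_counts.items()):
        print(family, n)
```

To compare strains, you could run `count_families` on each strain's overview file (opened with `open(path, newline="")`) and put the resulting counts side by side. Note that this sketch counts every annotated domain, so a protein carrying two domains of the same family adds two to that family. If you would rather count proteins per family, wrap the parsed list in `set()` before updating the counter.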