linnabrown / run_dbcan

Run_dbcan V4, using genomes/metagenomes/proteomes of any assembled organisms (prokaryotes, fungi, plants, animals, viruses) to search for CAZymes.
http://bcb.unl.edu/dbCAN2
GNU General Public License v3.0

More explanations of the output files #61

Closed iaunicorn closed 3 years ago

iaunicorn commented 3 years ago

Hi, thanks for developing such a great pipeline. Could you provide more explanation of the output files? I got the following results: cgc.gff, cgc.out, diamond.out, hmmer.out, Hotpep.out, overview.txt, stp.out, tf-1.out (?), tf-2.out (?), tp.out (?), uniInput

I could not find explanations for the output files marked with (?). Thanks in advance.

linnabrown commented 3 years ago

tf-1.out, tf-2.out, and tp.out are intermediate files produced by hmmscan when searching the tf-1, tf-2, and tp databases. You can ignore them because they are only intermediate files.
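
If you want to list only the result files in a script, a minimal sketch could look like the following. The function name `result_files` and the constant `INTERMEDIATE_FILES` are hypothetical helpers written for this example and are not part of run_dbcan; the three file names come from the list above.

```python
from pathlib import Path

# Intermediate hmmscan outputs named in this thread; safe to skip when reading results.
INTERMEDIATE_FILES = {"tf-1.out", "tf-2.out", "tp.out"}


def result_files(output_dir):
    """Return the sorted names of regular files in output_dir, minus the intermediate files."""
    return sorted(
        path.name
        for path in Path(output_dir).iterdir()
        if path.is_file() and path.name not in INTERMEDIATE_FILES
    )
```

Calling `result_files("output")` on a run_dbcan output folder (the folder name here is an assumption) would then return everything except the three intermediate files.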

iaunicorn commented 3 years ago

Thanks so much.

maracashay commented 3 years ago

I'm going to jump in on this thread because I also would like to have more information about the overview.txt file. Enzymes are presented as "GH23(3-117)" for HMMER and "GH23(208)" for Hotpep. What do the values in the () for HMMER and Hotpep refer to?

Thanks!

yinlabniu commented 3 years ago

GH23(3-117) means the GH23 domain starts at position 3 and ends at position 117, according to the HMMER search against the dbCAN HMMdb.

Hotpep predicts CAZymes using a different algorithm. GH23(208) means Hotpep assigns the query to family GH23, subfamily 208. Note that this subfamily 208 has nothing to do with CAZy's subfamilies. CAZy provides subfamily information for only about 30 CAZyme families or fewer, whereas PPR (Hotpep's database) has classified all CAZyme families into subfamilies, and these subfamily IDs do not match those used by CAZy. More information can be found in the Hotpep.out file.
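
To make the two notations concrete, here is a minimal sketch of how a single entry from each column could be parsed. It assumes each entry has exactly the form shown in this thread, `FAMILY(start-end)` for HMMER and `FAMILY(subfamily)` for Hotpep. The names `HmmerHit`, `HotpepHit`, `parse_hmmer`, and `parse_hotpep` are hypothetical and not part of run_dbcan, and cells holding more than one entry are not handled.

```python
import re
from typing import NamedTuple, Optional


class HmmerHit(NamedTuple):
    family: str  # e.g. "GH23"
    start: int   # position where the domain starts
    end: int     # position where the domain ends


class HotpepHit(NamedTuple):
    family: str         # e.g. "GH23"
    ppr_subfamily: int  # PPR subfamily ID, unrelated to CAZy subfamilies


# FAMILY(start-end), as in the HMMER column
_HMMER_RE = re.compile(r"^(?P<family>[^()]+)\((?P<start>\d+)-(?P<end>\d+)\)$")
# FAMILY(subfamily), as in the Hotpep column
_HOTPEP_RE = re.compile(r"^(?P<family>[^()]+)\((?P<subfam>\d+)\)$")


def parse_hmmer(entry: str) -> Optional[HmmerHit]:
    """Parse one HMMER-style entry; return None if it does not match."""
    m = _HMMER_RE.match(entry.strip())
    if m is None:
        return None
    return HmmerHit(m["family"], int(m["start"]), int(m["end"]))


def parse_hotpep(entry: str) -> Optional[HotpepHit]:
    """Parse one Hotpep-style entry; return None if it does not match."""
    m = _HOTPEP_RE.match(entry.strip())
    if m is None:
        return None
    return HotpepHit(m["family"], int(m["subfam"]))
```

With these helpers, `parse_hmmer("GH23(3-117)")` yields family GH23 with start 3 and end 117, while `parse_hotpep("GH23(208)")` yields family GH23 with PPR subfamily 208. Keeping the two parsers separate avoids mistaking a subfamily ID for a coordinate.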

Yanbin

maracashay commented 3 years ago

Great. Thank you so much!