linnabrown / run_dbcan

Run_dbcan V4, using genomes/metagenomes/proteomes of any assembled organisms (prokaryotes, fungi, plants, animals, viruses) to search for CAZymes.
http://bcb.unl.edu/dbCAN2
GNU General Public License v3.0

PUL annotation question #116

Open LuckyFeiX opened 1 year ago

LuckyFeiX commented 1 year ago

Dear Developers,

This software is helpful, but its new V4 parameters are too complex. I want to find the PULs in my genome, but I don't know whether the following shell command is right. It runs, but I cannot find any PUL results in the output.

run_dbcan bin.13.faa protein -c bin.13.gff --dia_cpu 25 --hmm_cpu 25 --tf_cpu 25 --stp_cpu 25 --out_pre bin.13 --out_dir bin.13 --use_signalP=TRUE -sp /mnt/sdb/software/interproscan-5.57-90.0/bin/signalp/4.1/signalp --db_dir /mnt/sdb/database/cazydb/db/ --pul /mnt/sdb/database/cazydb/db/PUL.faa

Could you tell me which file contains the results, or how I should modify the parameters? Thank you~

zhengzhengzhj commented 1 year ago

Dear user, the basic command to get the PULs and their potential substrates is "run_dbcan EscheriaColiK12MG1655.fna meta --out_dir samples --cluster 1 --cgc_substrate". This assumes all the databases are in the "db" folder of the working directory. You will find all the result files in the "samples" folder.

For your specific inputs, you can use this command: "run_dbcan bin.13.faa protein -c bin.13.gff --dia_cpu 25 --hmm_cpu 25 --tf_cpu 25 --stp_cpu 25 --out_pre bin.13 --out_dir bin.13 --use_signalP=TRUE -sp /mnt/sdb/software/interproscan-5.57-90.0/bin/signalp/4.1/signalp --db_dir /mnt/sdb/database/cazydb/db/ --cgc_substrate". The --cgc_substrate option enables substrate prediction for all the potential PULs predicted by cgc_finder.
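For readability, the same command can be assembled from a few variables. The sketch below is a dry run: it only prints the command and does not execute run_dbcan. The helper name build_dbcan_cmd and its argument order are hypothetical; every flag is copied from the command quoted above. Compared with the command in the original question, the --pul option is dropped and --cgc_substrate is added.

```shell
# Hypothetical dry-run helper: prints the run_dbcan command suggested above.
# It does not execute run_dbcan; all flags are copied from the thread.
build_dbcan_cmd() {
    prefix="$1"    # sample prefix, e.g. bin.13 (expects bin.13.faa and bin.13.gff)
    db_dir="$2"    # dbCAN database directory
    signalp="$3"   # path to the signalp executable
    cpus="$4"      # CPU count reused for every *_cpu option
    # printf repeats the '%s ' format once per argument
    printf '%s ' \
        run_dbcan "${prefix}.faa" protein \
        -c "${prefix}.gff" \
        --dia_cpu "$cpus" --hmm_cpu "$cpus" --tf_cpu "$cpus" --stp_cpu "$cpus" \
        --out_pre "$prefix" --out_dir "$prefix" \
        --use_signalP=TRUE -sp "$signalp" \
        --db_dir "$db_dir" \
        --cgc_substrate
    printf '\n'
}

# Example (prints the command for inspection):
# build_dbcan_cmd bin.13 /mnt/sdb/database/cazydb/db/ \
#     /mnt/sdb/software/interproscan-5.57-90.0/bin/signalp/4.1/signalp 25
```

Printing the command first makes it easy to check the paths before starting a long run.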

yinlabniu commented 1 year ago

The substrate prediction is in sub.prediction.out.
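If --out_pre is set, the file name in the output directory may carry that prefix (this is an assumption, not something confirmed in this thread), so searching by suffix is a safe way to locate it. A minimal sketch, where find_substrate_out is a hypothetical helper name:

```shell
# Hypothetical helper: list substrate prediction files under a run_dbcan
# output directory. Matching on the suffix also catches a prefixed name
# (assumed behaviour of --out_pre).
find_substrate_out() {
    out_dir="$1"
    find "$out_dir" -type f -name '*sub.prediction.out'
}

# Example: show the first lines of each match in the bin.13 output folder.
# find_substrate_out bin.13 | while IFS= read -r f; do head "$f"; done
```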

