Open hyphaltip opened 2 years ago
We can simply update the URLs. They are fetched at runtime when there is an internet connection, so users don't need to update the codebase to pick up new URLs. It might still be good to do this on a new branch in case there are any differences in file format, parsing, etc. Also, feel free to add any extra processing if it improves results. Would this require more dependencies?
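As a rough illustration of the runtime fetch described above, here is a minimal sketch of "use the remote URL table when online, otherwise fall back to the bundled copy". It is not funannotate's actual code: the function name `load_download_urls`, its parameters, and the injectable `fetch` hook are all hypothetical, and the layout of the JSON file is not assumed beyond it being valid JSON.

```python
import json
from urllib.request import urlopen


def load_download_urls(remote_url, local_path, fetch=None, timeout=10):
    """Return (url_table, source), preferring the remote copy.

    Hypothetical helper: `remote_url` would point at the hosted
    downloads.json and `local_path` at the copy shipped with the code.
    """
    if fetch is None:
        # Default fetcher: plain HTTP GET using the standard library.
        def fetch(url):
            with urlopen(url, timeout=timeout) as resp:
                return resp.read().decode("utf-8")

    try:
        # Network failures raise OSError (URLError is a subclass);
        # malformed JSON raises ValueError (JSONDecodeError is a subclass).
        return json.loads(fetch(remote_url)), "remote"
    except (OSError, ValueError):
        # Offline or bad response: use the bundled file instead.
        with open(local_path) as handle:
            return json.load(handle), "local"
```

With this shape, updating a URL in the hosted file reaches users on their next run, while offline users keep working from whatever copy shipped with their install.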
Doing the full run_dbcan run would mean installing their software (a Python script), which would be a dependency. It is run like this, using the install from @linnabrown: https://github.com/linnabrown/run_dbcan
run_dbcan.py --db_dir $CAZY_FOLDER --out_dir $OUT.run_dbcan --tools all \
--stp_cpu $CPUS --hotpep_cpu $CPUS --hmm_cpu $CPUS --dia_cpu $CPUS --tf_cpu $CPUS \
$INFILE protein
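If this were wrapped as an optional step, one way to keep run_dbcan a soft dependency is to check for the script on `PATH` and only then build the command. The sketch below is an assumption about how that could look, not existing funannotate code: `run_dbcan_available` and `build_run_dbcan_cmd` are hypothetical names, and the flags are copied verbatim from the command above rather than checked against any particular run_dbcan release.

```python
import shutil


def run_dbcan_available(script="run_dbcan.py"):
    """True if the run_dbcan script is on PATH (hypothetical helper)."""
    return shutil.which(script) is not None


def build_run_dbcan_cmd(infile, db_dir, out_dir, cpus):
    """Build the argument list for the command shown above.

    Flags are taken as-is from the comment; they are not verified
    against a specific run_dbcan version.
    """
    cpus = str(cpus)
    cmd = ["run_dbcan.py", "--db_dir", db_dir,
           "--out_dir", out_dir, "--tools", "all"]
    # Every per-tool CPU flag receives the same CPU count.
    for flag in ("--stp_cpu", "--hotpep_cpu", "--hmm_cpu",
                 "--dia_cpu", "--tf_cpu"):
        cmd += [flag, cpus]
    # Input FASTA followed by the sequence type.
    cmd += [infile, "protein"]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` when `run_dbcan_available()` is true, and the step skipped otherwise.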
https://github.com/nextgenusfs/funannotate/blob/00c207ad4041270a5e0e1f6cff711f697c5d3abc/funannotate/downloads.json#L6
Do we want to try pointing to v9? I can update it, but I'm not sure what else would break, or whether you want this change to go on a new point release/branch, @nextgenusfs? https://bcb.unl.edu/dbCAN2/download/dbCAN-HMMdb-V9.txt
I have also been running hotpep and full run_dbcan runs, which give more specificity in predictions. Not sure if you want to consider it.