Open cmorganl opened 3 years ago
ClipKIT parameters and settings have been benchmarked using treesapp evaluate. The following code calculates a single error value for the classifications across all taxonomic ranks, weighted by the number of ranks to the correct taxon (i.e. taxonomic distance):
# For each evaluation table, print its path and the sum of (column 5 x column 7) over all rows
for f in *_evaluate*/final_outputs/clade_exclusion_performance.tsv
do
  echo "$f"
  awk '{sum += $5 * $7} END {print sum}' "$f"
done
The parameter set with the lowest score will be used as the default.
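For comparing many parameter sets at once, the same calculation can be sketched in Python. This is only an illustration, not part of TreeSAPP: the function names `weighted_error` and `rank_parameter_sets` are hypothetical, the column positions (5 and 7, 1-indexed) are copied from the awk command above, and fields are assumed to be whitespace-separated as awk splits them by default. Rows with too few fields or non-numeric values (e.g. a header) are skipped, which matches awk treating them as zero in most cases.

```python
from pathlib import Path


def weighted_error(lines, weight_col=5, value_col=7):
    """Sum the product of two 1-indexed, whitespace-separated columns.

    Mirrors: awk '{sum += $5 * $7} END {print sum}'
    """
    total = 0.0
    for line in lines:
        fields = line.split()
        if len(fields) < max(weight_col, value_col):
            continue  # short row contributes nothing
        try:
            total += float(fields[weight_col - 1]) * float(fields[value_col - 1])
        except ValueError:
            continue  # non-numeric row, e.g. a header line
    return total


def rank_parameter_sets(
    pattern="*_evaluate*/final_outputs/clade_exclusion_performance.tsv",
    root=".",
):
    """Return (path, score) pairs sorted from lowest to highest error."""
    scores = {}
    for path in sorted(Path(root).glob(pattern)):
        with open(path) as handle:
            scores[str(path)] = weighted_error(handle)
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for path, score in rank_parameter_sets():
        print(f"{score}\t{path}")
```

The first entry returned by `rank_parameter_sets()` is the run with the lowest error, i.e. the candidate for the default parameter set.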
ClipKIT is a new MSA-trimming Python package. The authors indicate that the trimmed MSAs generated by ClipKIT are more "desirable" (combined RF distance and bipartition supports) than those from competing tools, including BMGE.
Using ClipKIT instead of BMGE would also simplify the installation process: the BMGE.jar file would no longer need to be packaged with TreeSAPP, and ClipKIT could instead be installed using pip or conda.
treesapp/sub_binaries/ directory, and support for BMGE.jar