Closed Kurtj-hub closed 2 years ago
Tested with var_2021. Got this warning on the calculate_clade_counts() call:
Warning message: In CheckNameReservedWord(name, check) : Name 'root' is a reserved word as defined in NODE_RESERVED_NAMES_CONST. Using 'root2' instead.
I compared the results of calculate_clade_counts() using the old and the updated versions. I believe the numbers are identical across the tables, which is great. However, I see some odd naming results and a few other issues (all of the following concerns the raw_clade output; raw_taxon seems identical):
Great work on the optimization! The values seem to be correct and the speed is incredibly quick. All that remains seems to be some formatting and edge cases. I suggest testing with a few other datasets (available from the DB access app) and comparing the outputs of the old and new calculate_clade_counts. Additionally, it's probably good to run some of the subsequent analyses using the clade table (e.g. ANCOM) to check if they're still working.
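One way to make the old-versus-new comparison repeatable is to export both clade tables and diff them by row name and cell value. The sketch below is only an illustration in Python, not part of the package: it assumes each table has been written to CSV with a name column called `clade` (a hypothetical column name), and the toy data is invented to show how a renamed row would surface even when the counts match.

```python
import csv
import io


def load_table(text, key="clade"):
    """Parse CSV text into {row name: {column: value}}, keyed by the `key` column."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[key]: {c: v for c, v in row.items() if c != key} for row in reader}


def compare_tables(old, new):
    """Report row names and cell values that differ between two tables."""
    only_old = sorted(set(old) - set(new))
    only_new = sorted(set(new) - set(old))
    mismatches = []
    for name in sorted(set(old) & set(new)):
        # Check the union of columns so a dropped or added column is reported too.
        for col in sorted(set(old[name]) | set(new[name])):
            a, b = old[name].get(col), new[name].get(col)
            if a != b:
                mismatches.append((name, col, a, b))
    return {"only_old": only_old, "only_new": only_new, "mismatches": mismatches}


# Hypothetical toy data: identical counts, but one row is named differently.
OLD_CSV = "clade,sample_1,sample_2\nroot,10,20\nBacteria,7,15\n"
NEW_CSV = "clade,sample_1,sample_2\nroot2,10,20\nBacteria,7,15\n"

report = compare_tables(load_table(OLD_CSV), load_table(NEW_CSV))
print(report)
```

Values are compared as strings here, so `10` and `10.0` would count as a mismatch; convert to numbers first if the two versions format counts differently.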
I will be away for August, so feel free to merge an update once the issues are addressed.
Awesome, thanks @LLansing for getting this done before your break.
I appreciate the feedback. I was uncertain about the relevance of the other columns, as I thought they were mostly just artifacts. I will make the changes ASAP, and I hope you enjoy your break.
Attempted speed-up for the clade count procedure.
Rough comparison on my system (no specific benchmarking for the Oxy dataset): old, 1-2 minutes; new, 2-3 seconds.
The output file has changed slightly; it can be modified again if required.
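The rough timing comparison above could be made repeatable with a small harness that runs each version several times and keeps the fastest run. This is a minimal sketch in Python for illustration only: `old_version` and `new_version` are hypothetical stand-ins, not the actual clade count code, and the numbers it prints say nothing about the real procedure.

```python
import time


def best_time(func, *args, repeats=5):
    """Return the fastest wall-clock time in seconds over `repeats` calls of func(*args)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(timings)


# Hypothetical stand-ins for the two implementations being compared.
def old_version(n):
    return sum([i * i for i in range(n)])


def new_version(n):
    return sum(i * i for i in range(n))


old_t = best_time(old_version, 100_000)
new_t = best_time(new_version, 100_000)
print(f"old: {old_t:.4f} s, new: {new_t:.4f} s")
```

Running both versions on the same input and on the same machine keeps the comparison fair; checking that the two return the same result before timing them avoids benchmarking a faster but incorrect version.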