Arcadia-Science / 2023-nr-clustering

Clustering the NCBI nr database with MMseqs2 (90% length, 90% identity). Inspired by the NCBI's experimental ClusteredNR database.
MIT License

Basic performance metrics #5

Open d-kk opened 1 year ago

d-kk commented 1 year ago

Hi,

We would like to use this dataflow and it would be great if we could know what type of machine you used (especially nproc and memory) and how long it took to run the dataflow. We'll spec our VM accordingly.

Thanks,

David

taylorreiter commented 1 year ago

Hi David, sorry for taking so long to respond, I missed the notification for this issue.

I was... very sloppy with my benchmarking of this workflow. I tried a million and twelve things before I settled on this workflow, in part because of bugs in taxonkit that have now been resolved.

d-kk commented 1 year ago

Thanks, appreciate the info. 🙂