Closed by oojo12 1 year ago
Is this more or less all? If so, I'm thinking of writing a `benchmarking.md` for future contributors.
There's also: for `BenchmarkId`, include the value used to parametrize the benchmark. For example, if we're benchmarking with feature counts of 5 and 10, then the ID should include "5" and "10" for those benchmarks.

I will say, back when I was doing ML-related tasks, it wasn't uncommon to have more than 10 features for predictive analysis, especially for tree-based algorithms. However, if I'm honest, I don't recall whether this was before or after dimensionality reduction. I'll go with your guidance for the documentation; we can always revise later.
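To make the `BenchmarkId` naming suggestion above concrete, here is a minimal std-only sketch of the convention, using the feature counts 5 and 10 from the example. The names `benchmark_label`, `workload`, `run_benchmarks`, and the `"fit"` label are hypothetical and not taken from this project. The `name/parameter` format is meant to mirror how Criterion displays `BenchmarkId::new(name, parameter)`, as far as I know. In a real Criterion benchmark the label would come from `BenchmarkId` itself, as noted in the comments.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Feature counts to benchmark (the 5 and 10 from the example above).
const FEATURE_COUNTS: [usize; 2] = [5, 10];

/// Builds a "<name>/<parameter>" label so each result states the value
/// it was parametrized with. With Criterion this would be roughly:
///   group.bench_with_input(BenchmarkId::new("fit", n), &n, |b, &n| {
///       b.iter(|| workload(100, n))
///   });
fn benchmark_label(name: &str, n_features: usize) -> String {
    format!("{}/{}", name, n_features)
}

/// Hypothetical stand-in for the code under test: walks an
/// n_samples x n_features grid and sums the cell indices.
fn workload(n_samples: usize, n_features: usize) -> f64 {
    let mut total = 0.0;
    for row in 0..n_samples {
        for col in 0..n_features {
            total += (row * n_features + col) as f64;
        }
    }
    total
}

/// Times the workload once per feature count and returns
/// (label, elapsed) pairs in the order of FEATURE_COUNTS.
fn run_benchmarks(n_samples: usize) -> Vec<(String, Duration)> {
    FEATURE_COUNTS
        .iter()
        .map(|&n| {
            let start = Instant::now();
            // black_box keeps the optimizer from discarding the result.
            black_box(workload(n_samples, n));
            (benchmark_label("fit", n), start.elapsed())
        })
        .collect()
}
```

Calling `run_benchmarks(100)` yields one entry per feature count, labeled `fit/5` and `fit/10`, so a reader of the report can tell which parameter produced which timing. A single timed run like this is only for illustration; Criterion's repeated sampling is what you would want for real measurements.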
I've read that many classic ML algorithms don't do well with high feature counts (something called the "curse of dimensionality"), which is why dimensionality reduction is needed. I'm not sure about the exact numbers though.
I think it would be helpful to the community if we documented what we wanted for benchmarking assessments. From my experience we ideally want the following: