Open gedw99 opened 1 month ago
Nice that there is a golang package for this. Didn't know about this one. It should work if you are using covariance matrices that are positive definite (all eigenvalues positive). I would still suggest using the regularization to make it more stable.
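To illustrate why regularization helps, here is a minimal stdlib-only sketch (not the code from this repo or from any package). It assumes a simple ridge term, i.e. adding a small `lambda` to the diagonal of the covariance matrix, which may differ from the regularization meant above. The helper names (`covariance`, `addRidge`, `cholesky`, `mahalanobis`) and the value `lambda = 0.1` are made up for the example.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// covariance returns the sample covariance matrix and the column means
// of rows (n observations with d variables each).
func covariance(rows [][]float64) ([][]float64, []float64) {
	n, d := len(rows), len(rows[0])
	mean := make([]float64, d)
	for _, r := range rows {
		for j, v := range r {
			mean[j] += v
		}
	}
	for j := range mean {
		mean[j] /= float64(n)
	}
	cov := make([][]float64, d)
	for i := range cov {
		cov[i] = make([]float64, d)
	}
	for _, r := range rows {
		for i := 0; i < d; i++ {
			for j := 0; j < d; j++ {
				cov[i][j] += (r[i] - mean[i]) * (r[j] - mean[j])
			}
		}
	}
	for i := range cov {
		for j := range cov[i] {
			cov[i][j] /= float64(n - 1)
		}
	}
	return cov, mean
}

// addRidge adds lambda to the diagonal of a in place (assumed form of regularization).
func addRidge(a [][]float64, lambda float64) {
	for i := range a {
		a[i][i] += lambda
	}
}

// cholesky returns lower-triangular L with L*L^T = a,
// or an error if a is not positive definite.
func cholesky(a [][]float64) ([][]float64, error) {
	d := len(a)
	l := make([][]float64, d)
	for i := range l {
		l[i] = make([]float64, d)
	}
	for i := 0; i < d; i++ {
		for j := 0; j <= i; j++ {
			s := a[i][j]
			for k := 0; k < j; k++ {
				s -= l[i][k] * l[j][k]
			}
			if i == j {
				if s <= 0 {
					return nil, errors.New("matrix is not positive definite")
				}
				l[i][i] = math.Sqrt(s)
			} else {
				l[i][j] = s / l[j][j]
			}
		}
	}
	return l, nil
}

// mahalanobis computes sqrt((x-mean)^T S^-1 (x-mean)) from the Cholesky
// factor L of S by solving L z = (x-mean) with forward substitution.
func mahalanobis(x, mean []float64, l [][]float64) float64 {
	z := make([]float64, len(x))
	var sum float64
	for i := range x {
		s := x[i] - mean[i]
		for k := 0; k < i; k++ {
			s -= l[i][k] * z[k]
		}
		z[i] = s / l[i][i]
		sum += z[i] * z[i]
	}
	return math.Sqrt(sum)
}

func main() {
	// The second column is exactly 2x the first, so the covariance is singular.
	rows := [][]float64{{0, 0}, {1, 2}, {2, 4}}
	cov, mean := covariance(rows)
	if _, err := cholesky(cov); err != nil {
		fmt.Println("without regularization:", err)
	}
	addRidge(cov, 0.1) // lambda = 0.1 is an arbitrary illustrative value
	l, err := cholesky(cov)
	if err != nil {
		fmt.Println("with regularization:", err)
		return
	}
	fmt.Printf("on the trend:  %.3f\n", mahalanobis([]float64{2, 4}, mean, l))
	fmt.Printf("off the trend: %.3f\n", mahalanobis([]float64{2, 0}, mean, l))
}
```

With perfectly correlated columns the plain covariance cannot be factorized, while the ridge version can, and a point off the trend gets a larger distance than one on it. How large `lambda` should be depends on the scale of your data.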
Thanks for the advice @HaukeBartsch
I am still learning about the approaches for this.
I will try out the golang approach. Can you point me to any data sources I can try, so that I can then also benchmark it? Go includes benchmarking tooling.
To benchmark the solution I would start with a problem that is relevant to you. Let's say you have a spreadsheet with lots of rows you want to check. How many of those would you need to check without this algorithm? Compute the ranking and see if the results make sense. You could of course also benchmark things like memory, compute time, stability, etc. As an example of applying this and similar approaches, you can look at the ENIGMA project's (enigma.ini.usc.edu/) protocols for data QC. Don't expect too much: there are plenty of examples where you see 'variable' results, especially if you have a low number of records.
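A rough sketch of what such a check could look like in Go, using only the standard library. It plants a few known outliers in synthetic data, ranks the rows, and reports how many of the top-ranked rows are the planted ones, then times the scoring with `testing.Benchmark`. The scoring function here is a plain z-score used as a stand-in, not the algorithm from this repo, and all names (`zScores`, `precisionAtK`, `synthetic`) and sizes are made up for the example.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
	"testing"
)

// zScores is a stand-in scoring function: |x - mean| / stddev.
// It returns NaN scores if all values are identical (stddev of zero).
func zScores(xs []float64) []float64 {
	var mean, ss float64
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))
	for _, x := range xs {
		ss += (x - mean) * (x - mean)
	}
	sd := math.Sqrt(ss / float64(len(xs)-1))
	out := make([]float64, len(xs))
	for i, x := range xs {
		out[i] = math.Abs(x-mean) / sd
	}
	return out
}

// precisionAtK returns the fraction of the k highest-scoring rows
// that are marked as true outliers in truth.
func precisionAtK(scores []float64, truth map[int]bool, k int) float64 {
	idx := make([]int, len(scores))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool { return scores[idx[a]] > scores[idx[b]] })
	hits := 0
	for _, i := range idx[:k] {
		if truth[i] {
			hits++
		}
	}
	return float64(hits) / float64(k)
}

// synthetic draws n standard-normal values and overwrites k randomly
// chosen rows with the value 10, returning the data and the planted rows.
func synthetic(n, k int, seed int64) ([]float64, map[int]bool) {
	rng := rand.New(rand.NewSource(seed))
	xs := make([]float64, n)
	for i := range xs {
		xs[i] = rng.NormFloat64()
	}
	truth := make(map[int]bool, k)
	for _, i := range rng.Perm(n)[:k] {
		xs[i] = 10
		truth[i] = true
	}
	return xs, truth
}

func main() {
	xs, truth := synthetic(200, 5, 1)
	fmt.Printf("precision@5 = %.2f\n", precisionAtK(zScores(xs), truth, 5))

	// testing.Benchmark can be called from a normal program, outside "go test".
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			zScores(xs)
		}
	})
	fmt.Printf("%d ns/op, %d B/op\n", res.NsPerOp(), res.AllocedBytesPerOp())
}
```

The number to look at first is precision@k: without any ranking you would have to check all 200 rows to find the 5 planted ones. Timing and memory come second, and rerunning with different seeds and smaller `n` gives a feel for stability.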
Thanks @HaukeBartsch
Yes, a fake data benchmark makes sense to get going. One that you have some ground truth / answer to.
The ENIGMA protocols look useful:
https://enigma.ini.usc.edu/protocols/imaging-protocols/
https://github.com/orgs/ENIGMA-git/repositories
BTW did you see the Mahalanobis func? https://github.com/gonum/gonum/blob/master/stat/statmat.go#L133C6-L133C17
gonum (https://github.com/gonum/gonum/) has it in https://github.com/gonum/gonum/blob/master/stat/statmat.go
Just a bit easier perhaps.
I also work with DICOMs and AI.