Open fkiraly opened 1 year ago
I vaguely remember issues like this in skpro v1; I am not sure we had an elegant solution for it though. I guess this is where overloading or multiple dispatch would come in handy (porting to Mojo or Julia, anyone? :smile:). Short of that, implementing different methods is probably the easiest solution, although not quite as elegant.
hm, @frthjf, how or why would dispatch solve the issue? Not sure whether there's sth obvious to dispatch on.
Could you not dispatch based on the type/shape of `x`; if it's a special case, you call that instead of falling back onto approximate solutions.
no, because `x` would have the same shape and type for univariate and multivariate energy. In the univariate case, if you receive a multivariate `x`, you would do the same broadcasting across variables as current.
I see. It sounds to me like using different methods would be best in that case.
Currently, the `energy` method, if called on multivariate distributions (multiple columns), implements the 1-norm energy, which is not strictly proper: it is just the sum of marginal energies, and is therefore minimized by any multivariate distribution that has the correct marginals (non-uniquely!). This is due to the default handling of multivariate distributions, which is column averaging or summation.
k-norms with k>1 are strictly proper afaik, but they do not fit the current interface, which assumes column means/sums.
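To make the marginals point concrete, here is a rough Monte Carlo sketch using plain NumPy (not skpro code; the helper `self_energy` and all variable names are made up for illustration). It compares two bivariate normals with identical marginals but different correlation. Under these assumptions, the 1-norm self-energy should come out essentially the same for both, while the 2-norm self-energy should differ noticeably.

```python
import numpy as np


def self_energy(draws_a, draws_b, k):
    """Monte Carlo estimate of E||X - X'||_k from two independent sample sets."""
    return np.linalg.norm(draws_a - draws_b, ord=k, axis=1).mean()


rng = np.random.default_rng(0)
n = 200_000
mean = np.zeros(2)
covariances = {
    "independent": np.eye(2),
    "correlated": np.array([[1.0, 0.9], [0.9, 1.0]]),  # same unit marginals
}

energies = {}
for name, cov in covariances.items():
    a = rng.multivariate_normal(mean, cov, size=n)
    b = rng.multivariate_normal(mean, cov, size=n)
    energies[name] = {k: self_energy(a, b, k) for k in (1, 2)}

# Gap between the two distributions under each norm
gap_1norm = abs(energies["independent"][1] - energies["correlated"][1])
gap_2norm = abs(energies["independent"][2] - energies["correlated"][2])
print(f"1-norm gap: {gap_1norm:.3f}, 2-norm gap: {gap_2norm:.3f}")
```

The 1-norm gap is only sampling noise, since the norm splits into a sum over columns; the 2-norm sees the dependence structure.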
We may have to add a param to the `energy` function, or even a new method for multivariate energy - not sure what the best is design-wise.
The key issue is that the 1-norm and 2-norm energies often have closed-form solutions, or at least known ones that are efficient to compute, whereas other k may or may not have these.
For the extender contract and tag inspection, it means we must be able to cope with the situation where we may want to implement efficient special cases and leave the other cases to approximate routines.
Any good ideas? (@Alex-JG3, @frthjf)
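One possible shape for the "efficient special cases plus approximate fallback" idea, as a minimal sketch only. Everything here is hypothetical and not skpro's actual API: the class names, the `energy_multivariate` method, the `k` parameter, and the `_energy_norm{k}` naming convention are assumptions for discussion. The public method looks for an exact implementation for the requested norm and otherwise falls back to Monte Carlo.

```python
import math

import numpy as np


class SketchDistribution:
    """Hypothetical base class for illustration; not skpro's BaseDistribution."""

    def sample(self, n, rng):
        raise NotImplementedError

    def energy_multivariate(self, x, k=2, n_samples=10_000, seed=0):
        """Return E||X - x||_k, preferring an exact special case if one exists."""
        exact = getattr(self, f"_energy_norm{k}", None)
        if exact is not None:
            return exact(x)
        return self._energy_approx(x, k, n_samples, seed)

    def _energy_approx(self, x, k, n_samples, seed):
        # Generic fallback: plain Monte Carlo over draws from the distribution
        rng = np.random.default_rng(seed)
        draws = self.sample(n_samples, rng)
        return float(np.linalg.norm(draws - x, ord=k, axis=1).mean())


class IndependentNormal(SketchDistribution):
    """Normal with independent columns; only the 1-norm case is exact here."""

    def __init__(self, mu, sigma):
        self.mu = np.asarray(mu, dtype=float)
        self.sigma = np.asarray(sigma, dtype=float)

    def sample(self, n, rng):
        return rng.normal(self.mu, self.sigma, size=(n, self.mu.size))

    def _energy_norm1(self, x):
        # Sum over columns of E|N(mu_j, sigma_j^2) - x_j| (folded normal mean)
        d = self.mu - np.asarray(x, dtype=float)
        s = self.sigma
        erf = np.array([math.erf(v) for v in d / (s * math.sqrt(2))])
        per_column = s * math.sqrt(2 / math.pi) * np.exp(-d**2 / (2 * s**2)) + d * erf
        return float(per_column.sum())


dist = IndependentNormal(mu=[0.0, 1.0], sigma=[1.0, 2.0])
x = np.array([0.5, -1.0])
exact_1 = dist.energy_multivariate(x, k=1)   # uses the closed form
approx_2 = dist.energy_multivariate(x, k=2)  # no special case, falls back
```

The `getattr` lookup stands in for whatever tag inspection would report which norms a given distribution implements exactly; a tag listing the supported `k` values would serve the same purpose.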