vnmabus / dcor

Distance correlation and related E-statistics in Python
https://dcor.readthedocs.io
MIT License
144 stars 26 forks

Incorrect documentation about arbitrary dimensions #61

Open alexge233 opened 9 months ago

alexge233 commented 9 months ago

Hello,

The documentation seems to suggest that I can pass n-dimensional arguments to distance_correlation; however, as soon as I pass a tensor of shape (41318, 2, 5), I get errors. Reading #50 suggests that I need to reshape the input. Reshaping by flattening the inner dimensions fixes the assertion errors, but this leads me to believe that the function does not actually support n-dimensional arguments. Do the row-wise calculations suggested there imply that I need to unroll the dimensions manually? If so, IMHO the documentation doesn't accurately describe what the methods implement.

Thanks for the library by the way, much appreciated!

vnmabus commented 9 months ago

Sorry for the terminology. You can only pass 1D or 2D arguments, as you noticed. The first dimension is the number of observations of a random variable/random vector, while the second one is the dimension of the random vector (THIS dimension is what is arbitrary).

I would love to make this functionality more gufunc-like (and thus remove the need for rowwise), but I have not had time to do so. I am also a bit hesitant because the most similar functionality in NumPy, np.corrcoef, is NOT a gufunc and in fact uses the opposite convention for the meaning of the dimensions. If you have feedback regarding this, please feel free to share it.
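The difference in conventions can be seen with NumPy alone. The following is a small illustration on synthetic data: by default `np.corrcoef` treats each row as a variable and each column as an observation (`rowvar=True`), whereas the layout described in this thread has observations along the first axis.

```python
import numpy as np

rng = np.random.default_rng(0)

# 100 observations of a 3-dimensional random vector, laid out as
# (observations, dimension), i.e. the layout described in this thread.
data = rng.normal(size=(100, 3))

# Default np.corrcoef: rows are variables, so this treats the data as
# 100 variables with 3 observations each.
default_corr = np.corrcoef(data)

# rowvar=False makes columns the variables, matching the
# (observations, dimension) layout.
column_corr = np.corrcoef(data, rowvar=False)

print(default_corr.shape)  # (100, 100)
print(column_corr.shape)   # (3, 3)
```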

alexge233 commented 8 months ago

Hi Carlos,

I haven’t had time to delve into it since I opened the issue, but I’m gonna be working on dcor again soon, so I’ll see if I can get you some feedback.

Best Regards, Alex
