mariarizzo / energy

energy package for R
https://mariarizzo.github.io/energy/index.html

Normalisation of E-statistic #3

Closed NikTheGeek1 closed 4 years ago

NikTheGeek1 commented 4 years ago

Hi guys,

This is not really an issue with the package, more a problem I am having when using it. I am trying to normalise the E-statistic following your suggestion in Energy distance (2016; pg3), but I keep failing. The article says that one way to normalise the statistic is to divide by an estimate of $2E\|X-Y\|$. It should be straightforward, but I can't figure it out. The article also uses the iris data frame as an example: eqdist.etest(iris[1:100, 1:4], c(50, 50), R = 999), which gives E-statistic = 123.5538. It would be terribly helpful if you could show me how you would normalise this result in R.

Thanks Nikos

mariarizzo commented 4 years ago

Nikos,

An estimate of $2E\|X-Y\|$ is $2A$, with $A$ from page 28:

$$ A = \frac{1}{nm} \sum_{i=1}^n \sum_{j=1}^m | x_i - y_j |. $$

Divide $\mathcal{E}_{n,m}$ (not $T$) by $2A$. The statistic $\mathcal{E}_{n,m}$ estimates the numerator in $H$. The statistic $T$ is the V-statistic for the test.

In this example, you could write a loop, or use outer(), or subset the distance matrix to get A:

dst <- as.matrix(dist(iris[1:100, 1:4]))
xydst <- dst[51:100, 1:50]
mean(xydst)
[1] 3.301223

Regards,

Maria
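For readers following along, here is a minimal sketch of two of the options mentioned above (an explicit loop and subsetting the distance matrix), assuming Euclidean distance and the same two iris samples. The names `x`, `y`, `A_loop` and `A_block` are illustrative only and do not come from the package.

```r
# Two samples from the thread's example: rows 1-50 and rows 51-100 of iris.
x <- as.matrix(iris[1:50, 1:4])
y <- as.matrix(iris[51:100, 1:4])
n <- nrow(x)
m <- nrow(y)

# Option 1: explicit double loop over all (x_i, y_j) pairs.
total <- 0
for (i in seq_len(n)) {
  for (j in seq_len(m)) {
    total <- total + sqrt(sum((x[i, ] - y[j, ])^2))
  }
}
A_loop <- total / (n * m)

# Option 2: take the between-sample block of the pooled distance matrix.
dst <- as.matrix(dist(rbind(x, y)))
A_block <- mean(dst[1:n, (n + 1):(n + m)])

# Both routes should agree up to floating-point error.
print(all.equal(A_loop, A_block))
```

Both values should match the mean reported above, since they average the same n*m between-sample distances.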


NikTheGeek1 commented 4 years ago

Thanks for the prompt reply.

So if I am getting it right, calculating B and C in the same way (as on page 28):

$$ B = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n | x_i - x_j |, \qquad C = \frac{1}{m^2} \sum_{i=1}^m \sum_{j=1}^m | y_i - y_j |, $$

then $(2A-B-C) / (2A) = 0.7485335$, which is the normalised E-statistic for that example, right?

It makes sense, thanks Nikos
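The calculation described in this comment can be sketched in a few lines of base R. This is only an illustration under the thread's assumptions (Euclidean distance, first 50 rows as one sample, next 50 as the other); the variable names `ix`, `iy`, `e_dist` and `e_norm` are made up for the example.

```r
# Pooled Euclidean distance matrix for the two iris samples.
dst <- as.matrix(dist(iris[1:100, 1:4]))
n <- 50
m <- 50
ix <- 1:n               # rows of the first sample
iy <- (n + 1):(n + m)   # rows of the second sample

A <- mean(dst[ix, iy])  # average between-sample distance
B <- mean(dst[ix, ix])  # within first sample; zero diagonal included, i.e. 1/n^2
C <- mean(dst[iy, iy])  # within second sample; zero diagonal included, i.e. 1/m^2

e_dist <- 2 * A - B - C     # numerator: sample energy distance
e_norm <- e_dist / (2 * A)  # normalised by the estimate of 2E||X-Y||
print(e_norm)
```

Because `mean()` over the full within-sample blocks includes the zero diagonal, B and C here are the V-statistic forms with denominators n^2 and m^2, as in the formulas above.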

mariarizzo commented 4 years ago

Nikos,

Yes, that is correct. Alternatively, you could get the numerator by dividing T by the coefficient (nm)/(m+n) (= 25 in this case, since n = m = 50).

Maria Rizzo, Professor Department of Mathematics & Statistics Bowling Green State University
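A short sketch of the relation between T and the numerator, for anyone checking the arithmetic. With n = m = 50 the coefficient nm/(n+m) evaluates to 25. The block below rebuilds T from A, B and C in base R and, only if the energy package is installed, compares it with `energy::eqdist.e()`, which is assumed here to return the same E-statistic that `eqdist.etest()` reports, without running the permutation test. The names `k` and `T_from_blocks` are illustrative.

```r
n <- 50
m <- 50
k <- n * m / (n + m)   # scaling coefficient for the two-sample statistic

dst <- as.matrix(dist(iris[1:100, 1:4]))
ix <- 1:n
iy <- (n + 1):(n + m)
A <- mean(dst[ix, iy])
B <- mean(dst[ix, ix])
C <- mean(dst[iy, iy])

e_dist <- 2 * A - B - C       # numerator (sample energy distance)
T_from_blocks <- k * e_dist   # statistic T rebuilt from A, B, C
print(T_from_blocks)

# Optional cross-check against the package, skipped if energy is not installed.
if (requireNamespace("energy", quietly = TRUE)) {
  T_pkg <- energy::eqdist.e(as.matrix(iris[1:100, 1:4]), sizes = c(n, m))
  print(all.equal(unname(T_pkg), T_from_blocks))
  print(unname(T_pkg) / k)    # recovers the numerator 2A - B - C
}
```

Dividing the reported E-statistic 123.5538 by 25 gives roughly 4.94, which is the numerator that the normalisation divides by 2A.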
