Order of dataset ttest2

MauriGiorgio commented 1 year ago

Hi, I am analysing the differences between two groups of objects (group A and group B) with ttest2 (unpaired t-test) in MATLAB, but I have noticed that it gives different results depending on the order in which I enter the inputs (A and B, or B and A). Is this correct? If so, in what order should I enter the inputs?

0todd0000 commented 1 year ago

Swapping the order of the input arguments like this:

spm1 = spm1d.stats.ttest2(yA, yB);
spm2 = spm1d.stats.ttest2(yB, yA);

should yield opposite results in the sense that the t statistics should have opposite signs. Thus the following statement should be true at all time points:

spm1.z == -spm2.z

Other than this no difference is expected. If you are indeed seeing different results please paste code and/or a figure into this issue thread.
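
Exact `==` comparisons on floating-point arrays can be brittle, so a tolerance-based check may be a safer way to verify the sign flip. The sketch below is an illustration only: it uses a hand-written pooled-variance t statistic (the hypothetical helper `tstat`) and random stand-in data, not spm1d itself.

```matlab
% Illustration only: tstat is a hypothetical helper, not part of spm1d.
rng(0);
yA = randn(9, 101);     % stand-in data: 9 observations x 101 time points
yB = randn(11, 101);    % stand-in data: 11 observations x 101 time points

% pooled-variance two-sample t statistic, computed at every time point
tstat = @(a, b) (mean(a, 1) - mean(b, 1)) ./ sqrt( ...
    ((size(a,1)-1)*var(a, 0, 1) + (size(b,1)-1)*var(b, 0, 1)) ...
    / (size(a,1) + size(b,1) - 2) * (1/size(a,1) + 1/size(b,1)) );

zAB = tstat(yA, yB);
zBA = tstat(yB, yA);

% swapping the inputs should only flip the sign, so the sum should vanish
maxdiff = max(abs(zAB + zBA));
disp(maxdiff < 1e-12)
```

With the objects from the snippet above, the analogous check would be `max(abs(spm1.z + spm2.z)) < 1e-12`.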

MauriGiorgio commented 1 year ago

Thanks for the reply. For the parametric test, the results don't change (the results have opposite signs and the same absolute t value). However, for the non-parametric test, if I change the order of A and B, the results are different; in particular, the t values calculated by the code change. The code that I am using is here:

A=[]; A=table2array(Y(1:sss , 3:103 ));
B=[]; B=table2array(Y(sss+1:end , 3:103 ));

%(1) Conduct non-parametric test:
rng(0)
alpha      = 0.05;
two_tailed = true;
iterations = 1000;
snpm       = spm1d.stats.nonparam.ttest2(A, B);
snpmi      = snpm.inference(alpha, 'two_tailed', two_tailed, 'iterations', iterations);
disp('Non-Parametric results')
disp( snpmi )

%(2) Compare to parametric inference:
spm        = spm1d.stats.ttest2(A, B);
spmi       = spm.inference(alpha, 'two_tailed', two_tailed);
disp('Parametric results')
disp( spmi )

% plot:
figure('position', [0 0 1000 300])
subplot(121); spmi.plot(); spmi.plot_threshold_label(); spmi.plot_p_values();
subplot(122); snpmi.plot(); snpmi.plot_threshold_label(); snpmi.plot_p_values();
title(nomepar, 'Interpreter', 'none');

Here are two examples. In each image, the plot on the left shows the parametric test and the plot on the right shows the non-parametric test.

Example 1: [figures: mom_hip_flex_SANI-TRAINT0, mom_hip_flex_TRAINT0-SANI]

Example 2: [figures: trunk_flex_SANI-TRAINT0, trunk_flex_TRAINT0-SANI]

0todd0000 commented 1 year ago

The results will change for nonparametric permutation tests even if you don't change the input order. Permutation test results are based on randomly permuted observations so the results are not expected to be constant unless you both (a) keep all inputs constant and (b) control the random number generator (RNG) seed / state.
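
Point (b) can be illustrated with MATLAB's `rng` and `rand` alone, independently of spm1d. This is a minimal sketch; the variable names `p1`, `p2` and `p3` are arbitrary. Seeding once and then drawing twice gives two different values, because the second draw continues the same random stream, whereas re-seeding restarts the stream and reproduces the first draw.

```matlab
rng(0);          % seed once
p1 = rand();     % first draw
p2 = rand();     % second draw continues the stream, so it differs from p1

rng(0);          % re-seed: the stream restarts from the same state
p3 = rand();     % identical to p1

disp([p1 p2 p3])
```

The same logic would apply to any procedure that consumes random numbers: calling `rng(0)` immediately before each run should reproduce that run, while calling it once before several runs should not make those runs identical to one another.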

MauriGiorgio commented 1 year ago

The two datasets (A and B) that I use as inputs are kept constant, but the results differ when I change their order (A and B, or B and A) in the code. In particular, the two results show different values of t*.

0todd0000 commented 1 year ago

Try not changing anything, even the variable order, and just run the nonparametric inference code several times. You will find that the t value is not constant. This means that non-constant t is caused by the algorithm itself and not by the variable order.

MauriGiorgio commented 1 year ago

I've tried running the code several times without changing anything, but the value of t doesn't change in the non-parametric test. The value of t changes only if I change the order of the datasets from A-B to B-A. (The data values in A and B are always the same.)

0todd0000 commented 1 year ago

Please try running the following code.

% create dataset:
yA = randn(10,1);
yB = randn(10,1);

% conduct non-parametric test several times:
rng(0)
alpha      = 0.05;
two_tailed = true;
iterations = 1000;
for i = 1:5
    ti = spm1d.stats.nonparam.ttest2(yA, yB).inference(alpha, 'two_tailed', two_tailed, 'iterations', iterations);
    disp( ti.p );
end

You should see that the result ti.p changes each time the test is run. The results should be:

0.4940
0.5160
0.4840
0.5140
0.4580

If this does not help to solve your problem, please post code replicating the problem into this thread.

MauriGiorgio commented 1 year ago

When I insert the datasets that I have to compare in my research into the code you sent me, it stops at the second result:

% create dataset:
yA = A;
yB = B;

% conduct non-parametric test several times:
rng(0)
alpha      = 0.05;
two_tailed = true;
iterations = 1000;
for i = 1:5
    ti = spm1d.stats.nonparam.ttest2(yA, yB).inference(alpha, 'two_tailed', two_tailed, 'iterations', iterations);
    disp( ti.p );
end

0.0190    0.0170

0.0240    0.0240

In my research I have to compare these two datasets, which contain the kinetic data of a gait analysis parameter: A = cases to study (9 people), B = control cases (11 people). (I leave datasets A and B here.) As I explained earlier, if I run the non-parametric 1D ttest2 I always get the same t* value. If I change the order of A and B, the t* value changes.

0todd0000 commented 1 year ago

As I explained earlier, if I do the non-parametric 1d t-test2 I always get the same t* value.

This is only possible if you are either (a) controlling the random number generator state, or (b) running all iterations. In either case it is not a problem, it is a feature of this non-parametric analysis approach.
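
To give a sense of scale for option (b), the number of distinct two-group relabelings can be counted with `nchoosek`. This is a rough sketch using the group sizes reported in this thread (9 and 11) and the 1000 iterations from the posted code; the variable names are arbitrary.

```matlab
nA = 9;      % group sizes reported in this thread
nB = 11;
nTotal = nchoosek(nA + nB, nA);   % distinct ways to assign 9 of the 20 observations to group A
iterations = 1000;                % value used in the posted code
fprintf('%d of %d relabelings sampled (%.2f%%)\n', iterations, nTotal, 100*iterations/nTotal);
```

With 1000 iterations only a small random subset of these relabelings is evaluated, which is why the outcome depends on the RNG state unless every relabeling is enumerated.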

If I change the order of A and B the t* value changes.

Again, this may not necessarily be a problem, it may instead simply be a reflection of this non-parametric analysis approach.

I can only answer your questions more specifically if you attach specific code and specific results. Please copy-and-paste your code, and please add numbers to the following statements:

I always get the same t* value.

What is the t* value?

If I change the order of A and B the t* value changes.

What is the t* value for the AB case?
What is the t* value for the BA case?
Does the t* value change if you run the AB and/or BA cases multiple times?

MauriGiorgio commented 1 year ago

Here is the code that I use (A and B are always the same, and they are the ones that I sent before):

A=[]; A=table2array(Y(1:sss , 3:103 ));
B=[]; B=table2array(Y(sss+1:end , 3:103 ));

%(1) Conduct non-parametric test:
rng(0)
alpha      = 0.05;
two_tailed = true;
iterations = 1000;
snpm       = spm1d.stats.nonparam.ttest2(A, B);
snpmi      = snpm.inference(alpha, 'two_tailed', two_tailed, 'iterations', iterations);
disp('Non-Parametric results')
disp( snpmi )

%(2) Compare to parametric inference:
spm        = spm1d.stats.ttest2(A, B);
spmi       = spm.inference(alpha, 'two_tailed', two_tailed);
disp('Parametric results')
disp( spmi )

% plot:
figure('position', [0 0 1000 300])
subplot(121); spmi.plot(); spmi.plot_threshold_label(); spmi.plot_p_values();
subplot(122); snpmi.plot(); snpmi.plot_threshold_label(); snpmi.plot_p_values();
title(nomepar, 'Interpreter', 'none');

1) If I use the order A-B (as in the code above), the t* value is always 2.253.

If I change the order to B-A in the code, the t* value is always 2.212.

Here are the images of the plots for each case.

[Figures: Order A-B, Order B-A]

2) In both cases, the value of t* doesn't change if I run the code several times.

Here are A and B: A.xlsx B.xlsx

0todd0000 commented 1 year ago

If I use the order A-B (like in the code above) the t* value is always 2.253

This is because you are using rng to control the random number generator state. Try changing rng(0) to rng(1) or rng(2) or rng(n) where n is any positive integer. The t* value should change.

Similarly, if you run the following, you should see that t* differs between the two results.

rng(0)
alpha      = 0.05;
two_tailed = true;
iterations = 1000;
snpm1      = spm1d.stats.nonparam.ttest2(A, B);
snpm2      = spm1d.stats.nonparam.ttest2(A, B);
snpm1i     = snpm1.inference(alpha, 'two_tailed', two_tailed, 'iterations', iterations);
snpm2i     = snpm2.inference(alpha, 'two_tailed', two_tailed, 'iterations', iterations);
disp( snpm1i )
disp( snpm2i )
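
One plausible reading of the order dependence reported in this thread (t* = 2.253 for A-B versus 2.212 for B-A under the same seed) is that a fixed seed yields the same sequence of random index permutations, but those indices point at different observations once the two datasets are stacked in a different order. The sketch below is a hand-rolled Monte Carlo permutation test in base MATLAB, not spm1d's implementation; `tstat`, `runs` and `tcrit` are hypothetical names, and A and B are random stand-ins for the real data.

```matlab
% Sketch only: a hand-rolled Monte Carlo permutation test, NOT spm1d's code.
% tstat, runs and tcrit are hypothetical names; A and B are random stand-ins.
rng(0);
A = randn(9, 1);
B = randn(11, 1);
alpha      = 0.05;
iterations = 1000;

% pooled-variance two-sample t statistic
tstat = @(a, b) (mean(a) - mean(b)) / sqrt( ...
    ((numel(a)-1)*var(a) + (numel(b)-1)*var(b)) ...
    / (numel(a) + numel(b) - 2) * (1/numel(a) + 1/numel(b)) );

runs  = {{A, B}, {A, B}, {B, A}};   % A-B, A-B repeated, B-A
tcrit = zeros(1, 3);
for k = 1:3
    a  = runs{k}{1};
    b  = runs{k}{2};
    y  = [a; b];                    % stacked data: row order depends on input order
    n1 = numel(a);
    rng(0);                         % same seed before every run
    tperm = zeros(iterations, 1);
    for i = 1:iterations
        idx      = randperm(numel(y));   % same index sequence in every run
        tperm(i) = abs(tstat(y(idx(1:n1)), y(idx(n1+1:end))));
    end
    tsorted  = sort(tperm);
    tcrit(k) = tsorted(round((1 - alpha) * iterations));   % two-tailed critical value
end
disp(tcrit)
```

In this sketch the first two entries are identical by construction (same seed, same stacking), while the third is computed from a different random subset of relabelings and so will typically differ slightly. Whether spm1d stacks and permutes the data in this way is an assumption; increasing the number of iterations should shrink the gap between the two orderings.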

MauriGiorgio commented 1 year ago

Ok, thank you very much

0todd0000 / spm1d

Order of dataset ttest2 #256