pnlbwh / dMRIharmonization

Multi-site dMRI harmonization

Conflicting starting meanFA values for each site depending on the direction of the harmonization. #78

Closed pall75 closed 5 months ago

pall75 commented 2 years ago

I have DTI data from two different studies A and B and I want to know the best direction in which to apply the harmonization: either A -> B or B -> A.

I set up two symmetric harmonization pipelines: one to harmonise A -> B and another to harmonise B -> A. I used the same training datasets in both cases.

Surprisingly, when I compared the meanFA bar plots after training each pipeline, I got conflicting results: the original meanFA for each group depends on which pipeline I choose to harmonise the data.

In one direction the starting meanFA is higher in study A; in the other, it is higher in study B.

The only thing I change between the two pipelines is swapping the "reference" and "target" labels in the harmonization.py commands.

[Attached screenshots: meanFA bar plots for the A -> B and B -> A pipelines]

Any ideas?

yrathi commented 2 years ago

Could you share more details about the datasets? E.g. do both sites have the same spatial resolution? How many gradient directions at each site?

tashrifbillah commented 2 years ago

Hi Pedro -- I haven't set up such symmetric pipelines myself. Here's an idea before further investigation: set up the two pipelines on separate input data so nothing from pipeline 1 can leak into pipeline 2. That is, make two sets of directories -- AtoB with its reference and target, and BtoA with its target and reference. I understand that you will use the same matching training data. Also use separate --template folders. Run again and see if your observation changes.
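A minimal sketch of the isolated layout described above. The harmonization.py flags follow the dMRIharmonization README, but the list-file names and paths here are illustrative, so the actual invocations are left commented:

```shell
# Two fully separate trees so no intermediate files (templates,
# transforms) can leak from one pipeline into the other.
mkdir -p AtoB/template BtoA/template

# A -> B: B is the reference, A is the target
# (list-file names are hypothetical; flags per the dMRIharmonization README)
# lib/harmonization.py --ref_list AtoB/B_imgs.csv --tar_list AtoB/A_imgs.csv \
#     --template AtoB/template/ --create --process

# B -> A: labels swapped, everything else identical
# lib/harmonization.py --ref_list BtoA/A_imgs.csv --tar_list BtoA/B_imgs.csv \
#     --template BtoA/template/ --create --process
```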

pall75 commented 2 years ago

Could you share more details about the datasets? E.g. do both sites have the same spatial resolution? How many gradient directions at each site?

Both datasets were acquired at the same resolution on the same scanner (GE Signa). The number of gradient directions and the b-values were different (30 dirs at b=1200 and 60 dirs at b=1300).

Hi Pedro -- I haven't set up such symmetric pipelines myself. Here's an idea before further investigation: set up the two pipelines on separate input data so nothing from pipeline 1 can leak into pipeline 2. That is, make two sets of directories -- AtoB with its reference and target, and BtoA with its target and reference. I understand that you will use the same matching training data. Also use separate --template folders. Run again and see if your observation changes.

I used completely separate folders to set up and run each pipeline, as you suggested. The only thing the pipelines shared was the git folder containing the dMRIharmonization source code. Should I also clone a separate copy of the source code?

tashrifbillah commented 2 years ago

Should I also clone a separate copy of the source code?

That won't make any difference.

yrathi commented 2 years ago

Can you also let us know how many subjects are at each site? (A more naive question: why are you harmonizing data from the same scanner? Or is it the same scanner but a different site that you are trying to harmonize? The latter requires harmonization, but the former doesn't.)

pall75 commented 2 years ago

Can you also let us know how many subjects are at each site?

I am training the pipelines on 20 controls from each site. There are another 200 subjects in each study.

(A more naive question: why are you harmonizing data from the same scanner? Or is it the same scanner but a different site that you are trying to harmonize? The latter requires harmonization, but the former doesn't.)

Both studies used the same physical scanner. The reason for the harmonization is that we found a significant difference between the controls from the two studies (after covarying for age and gender). The effect size was small to medium (partial eta squared = 0.03).
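For reference, partial eta squared for a site effect can be obtained from a covariate-adjusted model comparison: SS_site / (SS_site + SS_error). A minimal numpy sketch on synthetic data (group sizes, ages, and effect magnitudes here are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: mean skeleton FA for controls at two sites,
# with age as a covariate (all values are made up for illustration).
n = 20
age = np.concatenate([rng.uniform(20, 60, n), rng.uniform(20, 60, n)])
site = np.concatenate([np.zeros(n), np.ones(n)])  # 0 = study A, 1 = study B
fa = 0.45 - 0.0005 * age + 0.01 * site + rng.normal(0, 0.01, 2 * n)

def rss(X, y):
    """Residual sum of squares of an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

ones = np.ones_like(fa)
full = np.column_stack([ones, age, site])   # intercept + age + site
reduced = np.column_stack([ones, age])      # intercept + age only

ss_error = rss(full, fa)
ss_site = rss(reduced, fa) - ss_error       # SS explained by site, given age
partial_eta_sq = ss_site / (ss_site + ss_error)
print(f"partial eta squared: {partial_eta_sq:.3f}")
```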

yrathi commented 2 years ago

We will be checking the code in the next couple of weeks and will get back to you. Thank you for pointing this out to us.

tashrifbillah commented 2 years ago

I am training the pipelines on 20 controls from each site.

Did you observe the same asymmetry on just the controls?

pall75 commented 2 years ago

I am training the pipelines on 20 controls from each site.

Did you observe the same asymmetry on just the controls?

Yes, I have only looked at the training data so far (20 Controls)

tashrifbillah commented 2 years ago

Okay, how about you Dropbox us the 40 NIfTI images so we have the best input to investigate this problem?

pall75 commented 2 years ago

Okay, how about you Dropbox us the 40 NIfTI images so we have the best input to investigate this problem?

I will check the data sharing agreements set up for our project.

yrathi commented 2 years ago

@pall75 -- We have been unable to replicate this effect in any of our datasets. So, we are wondering if there is something very specific to your dataset.

pall75 commented 2 years ago

@pall75 -- We have been unable to replicate this effect in any of our datasets. So, we are wondering if there is something very specific to your dataset.

I wonder the same thing. It could be that the difference in meanFA over the skeleton is quite small for my datasets (Δ ~0.01), at least compared to the Connectom/Prisma test data (Δ ~0.12).

pall75 commented 2 years ago

There is also a discrepancy between the mean FA stats obtained with debug_fa.py and those obtained with fa_skeleton_test.py. In debug_fa.py the ANTs transformations that register the maps to MNI space are applied in two steps, within two separate calls to antsApplyTransforms. In fa_skeleton_test.py all transformations are applied in a single call to antsApplyTransforms. When I used fa_skeleton_test.py to produce the stats, the inconsistency between the starting meanFA values depending on the direction disappeared. Would it not make sense to use a single call to antsApplyTransforms when computing the stats in debug_fa.py as well?
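The discrepancy is plausible on numerical grounds: each call to antsApplyTransforms resamples (and therefore interpolates) the image, so two sequential calls smooth the data twice, whereas a single call with both transforms chained via multiple -t flags interpolates only once. A toy 1D analogue with plain numpy (the shifts and data are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
img = rng.random(50)          # toy 1D "FA map"
x = np.arange(50.0)

shift1, shift2 = 0.3, 0.4     # two sub-voxel "warps"

# Two-step: resample after the first shift, then resample again.
step1 = np.interp(x - shift1, x, img)
two_step = np.interp(x - shift2, x, step1)

# One-step: compose the shifts, resample once.
one_step = np.interp(x - (shift1 + shift2), x, img)

# The two results differ because linear interpolation applied twice
# smooths more than a single interpolation of the composed transform.
diff = np.abs(two_step - one_step)[5:-5].max()   # ignore boundary voxels
print(f"max interior difference: {diff:.4f}")
```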

tashrifbillah commented 2 years ago

When I used fa_skeleton_test.py to produce the stats, the inconsistency between the starting meanFA values depending on the direction disappeared.

Thank you for the observation.


Would it not make sense to use a single call to antsApplyTransforms when computing the stats in debug_fa.py as well?

Hi @suheyla2, Pedro is talking about your change. I did them in a single step before.

suheyla2 commented 2 years ago

@pall75 Yes, thank you for the observation. I want to note that these two scripts, debug_fa.py and fa_skeleton_test.py, are not part of the harmonization itself; they are just provided for a quick sanity check. Thanks again!

@tashrifbillah @yrathi I think two-step registration works better for noisy data (based on experiments I ran in the past). As long as we are consistent, one registration might be fine as well. More importantly, we could consider computing the average FA over the whole brain rather than only on the skeleton, if the skeleton is affected by slight differences in registration.
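A minimal sketch of that whole-brain vs skeleton comparison. The arrays stand in for FA volumes that would normally be loaded with nibabel, and the mask definitions and threshold are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy FA volume in MNI space; real data would come from something like
# nibabel's nib.load(path).get_fdata().
fa = rng.uniform(0, 1, (10, 10, 10))

brain_mask = fa > 0.0          # whole-brain mask (illustrative)
skeleton_mask = fa > 0.2       # toy stand-in for the TBSS skeleton mask

mean_fa_brain = fa[brain_mask].mean()
mean_fa_skeleton = fa[skeleton_mask].mean()

# The skeleton mean is computed on far fewer voxels, so it is more
# sensitive to small registration differences than the whole-brain mean.
print(f"whole brain: {mean_fa_brain:.3f}, skeleton: {mean_fa_skeleton:.3f}")
```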