james-cole / brainageR

Software for generating a brain-predicted age value, using Gaussian Processes regression, implemented in R
GNU Lesser General Public License v3.0

How to deal with age bias in my test dataset #16

Open Adamsuno opened 2 weeks ago

Adamsuno commented 2 weeks ago

I am using brainageR on a dataset of ~600 participants with depression. To correct for age bias, do you recommend including chronological age as a covariate in my model, or using the solution proposed by Beheshti in 2019 (https://www.sciencedirect.com/science/article/pii/S2213158219304103), which uses the slope and intercept of the regression line to correct for age bias?

Because the paper estimates the age bias in a training set, I would have to set aside 20% of my sample to estimate the bias and then apply the correction to the remaining 80%. If I use age as a covariate, however, I do not have to exclude any data from further analyses.

What are your thoughts on the pros and cons of the method used to correct for age bias in my dataset?

Thank you very much, Adam Sunavsky. MD/PhD Student UBC

james-cole commented 2 weeks ago

Hi Adam,

I recommend using age as a covariate in any subsequent analyses with brain-age gap as a measure. This has the same effect as explicit bias correction, but is less complicated.
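To make the comparison concrete, here is a minimal sketch in plain Python (not brainageR code) that runs both approaches on simulated data. Everything in it is an assumption made for illustration: the variable names, the simulated slope of -0.2, the 3-year group effect, and the choice to fit the correction line on the full sample rather than on a held-out training set as the paper describes.

```python
import random
from statistics import mean


def solve(A, b):
    """Solve a small linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def ols(columns, y):
    """Ordinary least squares with an intercept; returns [intercept, coef_1, ...]."""
    X = [[1.0] + list(row) for row in zip(*columns)]
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


# Simulated data (hypothetical): both groups share the same ages, so group and age are unrelated.
rng = random.Random(0)
ages = [float(a) for a in range(20, 80, 2)] * 2
group = [0.0] * 30 + [1.0] * 30  # 0 = control, 1 = patient
# Brain-age gap = brain-predicted age minus chronological age, simulated with
# an age bias (slope -0.2) and a 3-year group effect plus noise.
gap = [6.0 - 0.2 * a + 3.0 * g + rng.gauss(0, 2) for a, g in zip(ages, group)]

# Approach 1: age as a covariate in the group comparison.
_, group_effect_cov, age_slope = ols([group, ages], gap)

# Approach 2: explicit correction. Fit gap ~ age, subtract the fitted line,
# then compare group means of the corrected gap.
intercept, slope = ols([ages], gap)
corrected = [g - (intercept + slope * a) for g, a in zip(gap, ages)]
group_effect_corr = mean(corrected[30:]) - mean(corrected[:30])

print(f"covariate model:     {group_effect_cov:.3f}")
print(f"explicit correction: {group_effect_corr:.3f}")
```

In this balanced toy example the two group-effect estimates coincide up to floating-point error. When the groups differ in age, the two estimates need not match exactly.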

One caveat to be mindful of is that if your variable of interest (e.g., group, behavioural score) is also related to age, then your results are likely to be confounded. However, this is the case whether you covary for age or use explicit age correction. Hopefully that’s not the case for your analysis.

Best wishes, James
