jtleek / sva-devel


Runtime of Combat #54

Open WarrenSu2022 opened 2 years ago

WarrenSu2022 commented 2 years ago

Hello! I have two data matrices: one has 586 variables and 1039 samples from 7 batches, and the other has 5050 variables and 1039 samples from 7 batches. When I ran ComBat on the first matrix, it ran for 10 hours without finishing. Is this caused by the large number of batches, and how long should it take to process these two matrices? Thank you.

wevanjohnson commented 2 years ago

Do you have any continuous variables in the mix? Remove them and see how it goes. Right now ComBat only manages factor covariates, so if you give it a continuous covariate it will make classes for each value of the continuous variable.
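The check suggested above can be sketched in base R. This is only an illustration: the helper `find_continuous_covariates` and the `pheno` data frame are hypothetical and not part of sva, and the sketch assumes the covariates live in a data frame that is later turned into the `mod` design matrix.

```r
# Hypothetical helper: list covariate columns that are numeric rather than
# factors, so they can be dropped before building the `mod` matrix.
find_continuous_covariates <- function(pheno) {
  # is.numeric() returns FALSE for factors, so only true numeric columns match.
  is_continuous <- vapply(pheno, is.numeric, logical(1))
  names(pheno)[is_continuous]
}

# Made-up sample annotation, for illustration only.
pheno <- data.frame(
  sex = factor(c("F", "M", "F", "M")),
  age = c(34.5, 51.2, 47.0, 62.8)
)

continuous <- find_continuous_covariates(pheno)
print(continuous)

# Keep only the factor covariates when building the design matrix.
pheno_factors <- pheno[, setdiff(names(pheno), continuous), drop = FALSE]
mod <- model.matrix(~ ., data = pheno_factors)
print(dim(mod))
```

If `continuous` comes back empty, the covariates are already all factors and this particular explanation for the long runtime does not apply.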


WarrenSu2022 commented 2 years ago


Thanks. I don’t have any covariates, so my mod is NULL. The data matrix to be processed by ComBat contains only continuous variables. The code is as follows:

    Whole_Cohort_wave_Combat = ComBat(dat = Whole_Cohort_wave, batch = batch_info$Batch, par.prior=TRUE, ref.batch = "FUSCC_Aurora")

And the progress messages it prints are:

    Using batch =FUSCC_Auroraas a reference batch (this batch won't change)
    Found7batches
    Adjusting for0covariate(s) or covariate level(s)
    Standardizing Data across genes
    Fitting L/S model and finding priors
    Finding parametric adjustments
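Since there are no covariates here, a few basic checks on the inputs may help narrow things down before a long run. The sketch below is base R only and is not a diagnosis of this issue. The helper name `check_combat_inputs` and the toy data are hypothetical. It assumes `dat` is a numeric matrix with features in rows and samples in columns, which is how I understand ComBat's documented input layout. Whether missing values or features that are constant within a batch affect the runtime in this case is also an assumption.

```r
# Hypothetical helper: summarise a data matrix and batch vector before
# passing them to ComBat. Assumes features in rows, samples in columns.
check_combat_inputs <- function(dat, batch) {
  stopifnot(is.matrix(dat), is.numeric(dat))
  batch <- as.factor(batch)
  lengths_match <- length(batch) == ncol(dat)

  # Count features whose values are constant inside at least one batch.
  n_flat <- NA_integer_
  if (lengths_match) {
    flat <- apply(dat, 1, function(x) {
      any(tapply(x, batch, function(v) isTRUE(var(v, na.rm = TRUE) == 0)))
    })
    n_flat <- sum(flat)
  }

  list(
    n_features = nrow(dat),
    n_samples = ncol(dat),
    batch_length_matches = lengths_match,
    batch_sizes = table(batch),
    n_missing = sum(is.na(dat)),
    n_flat_features = n_flat
  )
}

# Toy data for illustration: 6 features, 8 samples, 2 batches.
set.seed(1)
dat <- matrix(rnorm(6 * 8), nrow = 6, ncol = 8)
batch <- rep(c("A", "B"), each = 4)
dat[2, 1:4] <- 0  # make feature 2 constant within batch A

report <- check_combat_inputs(dat, batch)
print(report)
```

With the matrices from this thread, `n_features` should be 586 or 5050 and `n_samples` should be 1039. If those two are swapped, the matrix is probably transposed.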