HelenaLC / muscat

Multi-sample multi-group scRNA-seq analysis tools

paired samples #12

Closed BingbingYuan closed 5 years ago

BingbingYuan commented 5 years ago

This is a very nice package.
Is there a way to take paired samples into consideration? I couldn't find an option for this in pbDS. For example, in the design below, the C1 and T1 samples are from the same animal, and likewise C2 and T2.

|   | sample_id | group_id | n_cells |
|---|-----------|----------|---------|
| 1 | C1        | control  | 3521    |
| 2 | C2        | control  | 2489    |
| 3 | T1        | treated  | 3015    |
| 4 | T2        | treated  | 2890    |

Thanks,

markrobinsonuzh commented 5 years ago

@BingbingYuan you can create a new column for the sample_id and include that in the design matrix .. perhaps something like this:

| label | sample_id | group_id | n_cells |
|-------|-----------|----------|---------|
| C1    | S1        | control  | 3521    |
| C2    | S2        | control  | 2489    |
| T1    | S1        | treated  | 3015    |
| T2    | S2        | treated  | 2890    |

design = ~sample_id + group_id .. and then you should be able to specify coef = "group_id" (of course, as the docs say, you need to make sure that column names match up, etc.).
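To make the suggestion above concrete, here is a minimal base-R sketch of the design matrix that formula would produce for the four samples in the table. It uses only `model.matrix()` and does not call muscat; the data frame name `ei` and the use of `label` as row names are assumptions for illustration.

```r
# Experiment-level table from the comment above. In this layout `sample_id`
# holds the animal (the pairing factor), not a unique sample label.
ei <- data.frame(
    label = c("C1", "C2", "T1", "T2"),
    sample_id = factor(c("S1", "S2", "S1", "S2")),
    group_id = factor(c("control", "control", "treated", "treated")),
    n_cells = c(3521, 2489, 3015, 2890))

# Additive paired design: an offset for the second animal plus the
# treatment effect, estimated within animals.
design <- model.matrix(~ sample_id + group_id, data = ei)
rownames(design) <- as.character(ei$label)

colnames(design)
# "(Intercept)" "sample_idS2" "group_idtreated"
```

Note that R names the treatment column `group_idtreated` (variable name plus factor level), so that is probably the name a `coef` argument would have to match when the design is built this way.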

BingbingYuan commented 5 years ago

Sorry, I still have a problem creating correct metadata with paired samples. After I changed the sample IDs to S1 and S2, the 4 samples were merged into two: two biological replicates were merged together. "sample_id" is reserved by prepSCE for unique sample identifiers. Based on the prepSCE description, it takes three IDs: "cluster_id", "sample_id" and "group_id". Is it possible to add another argument to prepSCE, such as "pair_id"? Thanks,

On Tue, Oct 29, 2019 at 12:09 PM markrobinsonuzh notifications@github.com wrote:

Closed #12 https://github.com/HelenaLC/muscat/issues/12.

-- Bingbing

HelenaLC commented 5 years ago

For prepSCE(), cluster_id, sample_id and group_id are required and will be given fixed names. However, if you specify drop = FALSE, any additional cell metadata columns will be kept as well. For example, you could use unique sample identifiers such as sample_id = "sample1_pat1", "sample2_pat1"..., and an additional column patient_id = c("patient1", "patient1"). Both will be kept by prepSCE(). The important thing is that the sample_ids are unique.
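As a rough sketch of how this could look for the four samples from the original question, the base-R snippet below keeps `sample_id` unique and carries the pairing in a separate `patient_id` column. The column values (`animal1`, `animal2`) and the data frame name `ei` are assumptions for illustration, and no muscat function is called; passing the resulting matrix on to pbDS would still require the row names to match the sample IDs, as mentioned earlier in the thread.

```r
# One row per sample: unique sample_id, pairing stored in patient_id.
ei <- data.frame(
    sample_id = c("C1", "C2", "T1", "T2"),
    patient_id = factor(c("animal1", "animal2", "animal1", "animal2")),
    group_id = factor(c("control", "control", "treated", "treated")))

# Sample identifiers must be unique, otherwise samples would be merged.
stopifnot(!anyDuplicated(ei$sample_id))

# Paired design: animal offset plus treatment effect.
design <- model.matrix(~ patient_id + group_id, data = ei)
rownames(design) <- as.character(ei$sample_id)

design
```

With this layout each sample stays separate, and the animal effect is absorbed by the `patient_id` term rather than by reusing `sample_id`.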