HelenaLC / muscat

Multi-sample multi-group scRNA-seq analysis tools

A constraint that might not be required in PrepSim #113

Open onahman opened 1 year ago

onahman commented 1 year ago

I am looking at those lines of code from the file PrepSim:

    if (!is.null(min_size)) {
        n_cells <- table(x$cluster_id, x$sample_id)
        n_cells <- .filter_matrix(n_cells, n = min_size)
        if (ncol(n_cells) == 1)
            stop("Current 'min_size' retains only 1 sample,\nbut",
                " mean-dispersion estimation requires at least 2.")
        if (verbose) message(sprintf(
            "- %s/%s subpopulations and %s/%s samples retained.",
            nrow(n_cells), nlevels(x$cluster_id),
            ncol(n_cells), nlevels(x$sample_id)))
        x <- .filter_sce(x, rownames(n_cells), colnames(n_cells))
    }

I think the `if (ncol(n_cells) == 1)` check might not be required. Instead, there could be a check on each row that more than one cell remains after filtering. The estimation did work even with only one dataset. Am I missing something? I am trying to avoid batch effects, and would prefer to use only one dataset rather than multiple, if possible.
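To make the two conditions concrete, here is a minimal base R sketch on toy metadata. It is not muscat's code: the cluster and sample labels, the cell counts, and the variable names `only_one_sample` and `enough_cells_per_cluster` are all hypothetical, and the per-row condition is only one possible reading of the suggestion above.

```r
# Toy metadata (hypothetical): one sample, two clusters.
cluster_id <- factor(c(rep("A", 120), rep("B", 80)))
sample_id  <- factor(rep("s1", 200))

# Same kind of cluster-by-sample count table as in the quoted snippet.
n_cells <- table(cluster_id, sample_id)

# Condition from the quoted snippet: TRUE whenever a single sample remains,
# which is what triggers the stop() call.
only_one_sample <- ncol(n_cells) == 1

# One possible reading of the proposed alternative (an assumption):
# every retained cluster must still contain more than one cell.
enough_cells_per_cluster <- all(rowSums(n_cells) > 1)
```

With this toy input, both conditions hold at once: the existing check would stop, while the per-row check would pass.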

HelenaLC commented 1 year ago

Hm. The motivation here is that muscat is designed for multi-sample multi-group data, so multiple samples (replicates) are expected. Having only one sample will not allow any of the differential testing functions to work (as there'll be only one sample per group). May I ask what the long-game is here? I.e., are you only interested in the simulation perhaps?
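As a general statistical illustration of the "one sample per group" point (not muscat's implementation), the sketch below uses made-up sample-level values for a single gene. The data frames `pb_single` and `pb_repl` and their numbers are hypothetical. With one sample per group, the within-group variance that sample-level tests rely on cannot be estimated.

```r
# Hypothetical sample-level (pseudobulk-like) values for one gene,
# with a single sample per group.
pb_single <- data.frame(
    group = c("ctrl", "stim"),
    expr  = c(10.2, 14.7))

# var() of a single observation is NA, so no within-group variance exists.
var_single <- tapply(pb_single$expr, pb_single$group, var)

# Same gene with two hypothetical replicates per group.
pb_repl <- data.frame(
    group = rep(c("ctrl", "stim"), each = 2),
    expr  = c(10.2, 9.8, 14.7, 15.1))

# Now each group has a finite variance estimate.
var_repl <- tapply(pb_repl$expr, pb_repl$group, var)
```

Methods that treat individual cells as the observations are not affected in this way, though they answer a different question than sample-level comparisons do.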

onahman commented 1 year ago

Hi,

Thank you for your reply! I'll explain.

I want my generated data to have several cell types and two different samples/groups, but my reference data doesn't have replicates. Rather, the different samples can't be treated as identical: they come from different cancer patients, and cancer is known to be very heterogeneous. I did some batch correction prior to the simulation, but the genes that should not be DE have pretty different means. I want to avoid this and reduce the noise, hence my decision to use one sample.

I am generating the data, and later I am interested in testing for DE genes, but not via your provided framework. So you could say I am mainly interested in the simulation, but I do expect the genes to be differentially expressed. Do you still think I will encounter a problem? I am not sure I completely understand which differential testing functions will no longer work. Only the ones in your package, or in general?

Thanks again for your help,
Ornit
