Seurat::PrepSCTFindMarkers

ktrns commented 6 months ago

This additional function is necessary if SCTransform was used, otherwise we won't find any markers. We need to read a bit more on this. For now, I added the following code, just above RunPrestoAll:

# Only needed when the default assay is an SCT assay (name ends in "_SCT")
if (grepl(pattern="_SCT$", x=default_assay, perl=TRUE)) {
  sc = Seurat::PrepSCTFindMarkers(object = sc, assay=default_assay)
}

@andpet0101 will implement this into the module dimensionality_reduction.

ktrns commented 1 week ago

After module 4 (dimensionality reduction), AP wants:

a single Seurat object
with one slot each for counts, data and scale.data (instead of one set of slots per dataset as before)

This is already done. BUT for FindMarkers we still need to correct the counts, which is what Seurat::PrepSCTFindMarkers does. From the function's documentation:

"The counts slot of the SCT assay is replaced with recorrected counts and the data slot is replaced with log1p of recorrected counts."

We decided to keep this call just before RunPrestoAll, but to document it better in the code.

andpet0101 commented 1 week ago

I added the following line to the code in commit 3d4f34f85c3273a72f0a2743ac1013e415819271:

# Prepare SCT for marker detection
if (grepl(pattern="_SCT$", x=default_assay, perl=TRUE)) sc = Seurat::PrepSCTFindMarkers(object=sc, assay=default_assay)

to run the function in case SCT was used.

I will add some more documentation in a follow-up commit and then close the issue.

andpet0101 commented 1 week ago

I found this discussion, https://github.com/satijalab/seurat/issues/6675, which suggests that we move PrepSCTFindMarkers to the integration or the normalization module. Basically, they say that the results of sctransform are only comparable across datasets if all datasets have the same sequencing depth. If they do not, the counts need to be corrected, and this is what PrepSCTFindMarkers does.

Now why only at this step? I checked the source code of Seurat::RunPCA and indeed you can find something like this:

    if (verbose) {
        message(paste0("Found ", length(x = levels(x = object[[assay]])), 
            " SCT models.", " Recorrecting SCT counts using minimum median counts: ", 
            min_median_umi))
    }

So sometimes they do the correction in the functions that use SCTransform results.
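
To make the "minimum median counts" in that message concrete, here is a minimal base-R sketch. The toy count matrix and the per-cell dataset labels are made up for illustration, and the sketch only reproduces the depth statistic (median UMI per dataset, then the minimum across datasets). The actual recorrection through the SCT models happens inside Seurat and is not shown here.

```r
# Toy raw UMI counts (assumption): 4 genes x 6 cells from two datasets
# with clearly different sequencing depth. matrix() fills column by column,
# so each row of numbers below is one cell.
counts <- matrix(
  c( 1, 0,  2, 1,
     2, 1,  0, 1,
     0, 3,  1, 2,
    10, 5,  8, 7,
    12, 6,  9, 3,
     9, 4, 11, 6),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6))
)

# Hypothetical dataset label for each cell
dataset <- c("A", "A", "A", "B", "B", "B")

# Total UMI per cell, then the median per dataset
umi_per_cell <- colSums(counts)
median_umi <- tapply(umi_per_cell, dataset, median)

# The shallowest dataset defines the common depth for the recorrection
min_median_umi <- min(median_umi)

print(median_umi)
print(min_median_umi)
```

If the medians differ a lot between datasets, as in this toy example, the corrected counts of the individual SCT models are not on the same scale, which is the situation the recorrection is meant to address.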

My suggestion would be to add it at the end of the integration module, because I assume that some of the integration functions still work on the individual datasets.

Agree?

ktrns commented 1 week ago

Suggestion KS: try once directly after SCTransform and once at the end of the module - do the results differ?
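
One way to answer "do the results differ?" is to run the marker detection for both placements and compare the two marker tables. The helper below is a hypothetical sketch in base R, not part of the pipeline. It assumes both tables have the columns `cluster`, `gene` and `avg_log2FC`, as FindAllMarkers-style output usually does, and the toy tables are invented for the example.

```r
# Hypothetical helper: compare two marker tables by (cluster, gene) pairs.
# Assumes columns `cluster`, `gene` and `avg_log2FC` in both tables.
compare_markers <- function(markers_a, markers_b) {
  key_a <- paste(markers_a$cluster, markers_a$gene, sep = "::")
  key_b <- paste(markers_b$cluster, markers_b$gene, sep = "::")
  shared <- intersect(key_a, key_b)

  # Fold changes of the shared markers, aligned by key
  fc_a <- markers_a$avg_log2FC[match(shared, key_a)]
  fc_b <- markers_b$avg_log2FC[match(shared, key_b)]

  list(
    n_only_a = length(setdiff(key_a, key_b)),
    n_only_b = length(setdiff(key_b, key_a)),
    n_shared = length(shared),
    max_abs_fc_diff = if (length(shared) > 0) max(abs(fc_a - fc_b)) else NA_real_
  )
}

# Toy tables standing in for the two runs (values are made up)
markers_early <- data.frame(
  cluster = c("0", "0", "1"),
  gene = c("CD3E", "CD8A", "MS4A1"),
  avg_log2FC = c(1.5, 1.0, 2.0)
)
markers_late <- data.frame(
  cluster = c("0", "1", "1"),
  gene = c("CD3E", "MS4A1", "CD79A"),
  avg_log2FC = c(1.25, 2.0, 1.0)
)

res <- compare_markers(markers_early, markers_late)
str(res)
```

Markers found in only one of the runs, or a large maximum fold-change difference among the shared ones, would indicate that the placement of PrepSCTFindMarkers matters for our data.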

dcgc-bfx / scrnaseq2

Seurat::PrepSCTFindMarkers #37