satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Unclear instructions for integration with SCTransform-ed data (SeuratV5) #8458

Open jasonleongbio opened 7 months ago

jasonleongbio commented 7 months ago

I've been struggling with the recommended procedure to perform multi-sample integration with a SCTransform-ed dataset. I found that the recommended instructions (i.e. the code) are scattered across multiple pages, and often the code on different pages is confusing. I feel that it would be better to put them together clearly in the manual.

<1> I first followed the steps in the upper part of the ["Integrative analysis in Seurat v5" page](https://satijalab.org/seurat/articles/seurat5_integration) (as I was moving from SeuratV4 to V5), and everything seemed fine at first. I didn't pay much attention to the rest of the page, because I could proceed to other downstream analyses (which I am interested in but which do not rely on the SeuratObject).

```{r}
exp_data[["RNA"]] <- split(exp_data[["RNA"]], f = exp_data$orig.ident)
exp_data <- SCTransform(exp_data, vst.flavor = "v2")
exp_data <- RunPCA(object = exp_data)
exp_data <- IntegrateLayers(
  object = exp_data,
  method = HarmonyIntegration,
  orig.reduction = "pca",
  new.reduction = "harmony",
  verbose = TRUE
)
```

However, while I was wondering what the `JoinLayers()` function does (related to problem <2> in this issue), I saw that in the last part of the instruction page, integration with SCTransform-ed data appears to require an additional option, `normalization.method`, inside the `IntegrateLayers()` function:

```
# the code on the instruction page
obj <- IntegrateLayers(
  object = obj,
  method = RPCAIntegration,
  normalization.method = "SCT",
  verbose = F
)
```

This is confusing for users because we have to scroll down to the very last part of the page and be careful enough to spot this small difference in the code. I suggest that the code for SCTransform-ed data be made explicit in the manual (or perhaps put in a separate section), especially on this page, because the code for SCTransform-ed data appears to be slightly different: for example, it also needs a `PrepSCTFindMarkers()` step, which is not mentioned on this page at all.

A few days later, while checking other pages on the website, I realized that there is a different page, ["Intro to scRNA-seq integration"](https://satijalab.org/seurat/articles/integration_introduction), where the code for properly handling SCTransform-ed data is documented much better than on the ["Integrative analysis in Seurat v5" page](https://satijalab.org/seurat/articles/seurat5_integration). Still, the small differences in the code (such as having to set the option inside `IntegrateLayers()`) are not highlighted, which makes them difficult for users to follow. In addition, perhaps the "Integrative analysis in Seurat v5" page should redirect users to the "Introduction" page for the section dedicated to handling SCTransform-ed data. The current manual is just confusing.

Anyway, I tried both _with_ and _without_ `normalization.method = "SCT"` inside the `IntegrateLayers()` function. (I am sorry that I cannot provide a reproducible example here, because I only tried this with my unpublished data.) The results do not differ a lot, but differences do exist. For example, how many cells from each sample, and which cells, are allocated to each cluster appear to be slightly different.

<2> The `JoinLayers()` function on all the manual pages links to a 404 page. It is difficult for users to understand what is actually performed, why it is needed, and what should be set in the code. (I do understand what it is trying to do after digging through many related pages, but the manual could simply include one page that explains what it does, with some examples.) In addition, on some pages the function's result replaces `seurat_obj[["RNA"]]`, while on other pages it replaces the entire `seurat_obj`. Without a clear manual page, it is difficult to judge which should be the recommended code for the standard workflow.

Thank you so much in advance! I really love the new v5, but the integration methods appear to be quite different from the previous version and I'm struggling with the current version of the manual.
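
For readers following along, the two snippets quoted in this comment can be combined into one hedged sketch. It only rearranges calls that already appear in this thread (`split()`, `SCTransform()`, `RunPCA()`, `IntegrateLayers()` with `normalization.method = "SCT"`, and the assay-level `JoinLayers()` form). The wrapper names `integrate_sct_sketch` and `join_rna_layers_sketch` are hypothetical, the metadata column `orig.ident` is taken from the first snippet, and nothing here is confirmed by the Seurat team as the recommended workflow.

```r
# Sketch only: wraps the calls quoted in this thread into functions.
# Both function names are hypothetical helpers, not part of Seurat.
# Calls are namespace-qualified, so defining the functions does not
# require Seurat to be attached; running them does require Seurat v5.

integrate_sct_sketch <- function(obj) {
  if (!requireNamespace("Seurat", quietly = TRUE)) {
    stop("This sketch needs the Seurat package (v5) to be installed.")
  }
  # Split the RNA assay into one layer per sample (first snippet above).
  obj[["RNA"]] <- split(obj[["RNA"]], f = obj$orig.ident)
  # Normalize and reduce, exactly as in the first snippet.
  obj <- Seurat::SCTransform(obj, vst.flavor = "v2")
  obj <- Seurat::RunPCA(object = obj)
  # Same Harmony call as above, plus the option shown on the
  # instruction page for SCTransform-ed data.
  obj <- Seurat::IntegrateLayers(
    object = obj,
    method = Seurat::HarmonyIntegration,
    normalization.method = "SCT",
    orig.reduction = "pca",
    new.reduction = "harmony",
    verbose = TRUE
  )
  obj
}

join_rna_layers_sketch <- function(obj) {
  if (!requireNamespace("SeuratObject", quietly = TRUE)) {
    stop("This sketch needs the SeuratObject package to be installed.")
  }
  # Assay-level form mentioned in problem <2>: only the RNA assay is replaced.
  obj[["RNA"]] <- SeuratObject::JoinLayers(obj[["RNA"]])
  obj
}

# Usage (not run here; `exp_data` is the object from the comment above):
# exp_data <- integrate_sct_sketch(exp_data)
# exp_data <- join_rna_layers_sketch(exp_data)
```

Whether the extra `normalization.method = "SCT"` argument changes the result for every integration method is exactly the open question of this issue, so the sketch should be read as a way to compare the two variants, not as an answer.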

GischD commented 7 months ago

Hi @jasonleongbio and @seurat team,

I also faced issues in my code because I felt the lack of information in the manual. I fully understand the challenge involved in developing a package as complex and advanced as Seurat. The arrival of version 5, although a step forward, has introduced some uncertainties that naturally require time to clarify. I thought it might be beneficial to explore the idea of creating a collaborative manual. This approach would allow users to contribute solutions that they have documented themselves, which could significantly speed up the process of enriching the package documentation.

If the lab could curate these contributions before publishing them, it would guarantee the quality and relevance of the information shared. Furthermore, in the long term, we could even consider compiling these contributions into a collaborative book. Each contribution could become a chapter with a DOI, acknowledging and giving due credit to the users who contributed. This would strengthen the community around Seurat and encourage more active and engaged participation from everyone.

Thank you, Debora

jasonleongbio commented 7 months ago

Hi @GischD ,

I strongly agree that community contributions may help a lot in exploring the possibilities and limitations of the current version of Seurat, especially as there are so many tools for single-cell analysis nowadays and some (or perhaps many) rely on the SeuratObject structure. However, I'm not sure what the Seurat team thinks about that, because maintaining the quality of community-contributed tutorials may require additional effort (e.g. by decentralized peer review?).

In my case, if I simply look at the current manual, I cannot judge why running SCTransform without specifying `normalization.method` inside `IntegrateLayers()` may be problematic.

haukeh90 commented 7 months ago

Dear Seurat Team,

Thank you very much for the release of Seurat V5. It has been immensely helpful in our integration efforts, bringing many quality-of-life changes. However, I must echo the sentiments of the two previous contributors: the documentation and instructions for one-line integration of SCTransformed data are not entirely clear.

While the workflow outlined in the vignette is comprehensible, we are encountering difficulties with downstream scVI integration of SCTransformed data (all other one-line integration methods work).

The code produces the following error message:

```
object <- IntegrateLayers(
  assay = "SCT",
  object = object,
  method = scVIIntegration,
  orig.reduction = "pca",
  new.reduction = "integrated.scvi",
  normalization.method = "SCT",
  verbose = T, conda_env = "~/miniforge3/envs/scvi_final"
)
```

```
Error in UseMethod(generic = "JoinLayers", object = object) :
  no applicable method for 'JoinLayers' applied to an object of class "c('SCTAssay', 'Assay', 'KeyMixin')"
```

It appears that one of the initial steps in the SCVI-based integration involves the JoinLayers function, which cannot be applied to the SCT assay. This may be because the SCT assay is not structured like the RNA5 assay (does not contain count and data layers split by Batch Key).

In addition, we are facing related issues regarding the incompatibility of the SCT assay with the new sketch-based integrative analysis (already mentioned in #8428).

It would be fantastic if these issues, along with the ones mentioned earlier (clarification of normalization.method), could be addressed in a more comprehensive documentation or a dedicated vignette for SCT-based (sketch) integration with the different one-line integration methods.

Thank you for your assistance and for providing such a great package.

Best,

Hauke

Plasmiddddd commented 7 months ago

Dear Seurat Team,

I ran into the same problem when trying to use the `JoinLayers` function on the SCT assay. Here is the code I'm using:

```
object[["SCT"]] <- JoinLayers(object[["SCT"]])
```

Here is the error message:

```
Error in UseMethod(generic = "JoinLayers", object = object) :
  no applicable method for 'JoinLayers' applied to an object of class "c('SCTAssay', 'Assay', 'KeyMixin')"
```

Thank you!
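
The error quoted in the last two comments has the form that R's S3 dispatch (`UseMethod`) produces when a generic has no method for any class of its argument. The base-R sketch below reproduces that situation with toy objects. The generic `join_layers_toy`, the helper `safe_join`, and both stand-in objects are hypothetical and are not Seurat code. The class name `Assay5` is assumed here to stand for the v5 RNA assay, and real Seurat assays are S4 objects, not the plain lists used in this sketch.

```r
# Toy generic standing in for JoinLayers (NOT the real implementation).
join_layers_toy <- function(object, ...) UseMethod("join_layers_toy")

# A method exists only for the v5-style assay class (assumed name: Assay5).
join_layers_toy.Assay5 <- function(object, ...) {
  object$joined <- TRUE
  object
}

# Stand-in objects; the second carries the class vector from the error message.
rna_like <- structure(list(joined = FALSE), class = "Assay5")
sct_like <- structure(
  list(joined = FALSE),
  class = c("SCTAssay", "Assay", "KeyMixin")
)

# Dispatch succeeds for the Assay5-like object.
rna_joined <- join_layers_toy(rna_like)

# Dispatch fails for the SCTAssay-like object; keep the message instead of stopping.
sct_error <- tryCatch(
  join_layers_toy(sct_like),
  error = function(e) conditionMessage(e)
)

# A class check before the call leaves unsupported objects untouched.
safe_join <- function(object) {
  if (inherits(object, "Assay5")) join_layers_toy(object) else object
}
sct_unchanged <- safe_join(sct_like)
```

In this toy setting the failure comes only from the missing method for the `SCTAssay` class vector. Whether the same holds inside `scVIIntegration`, as suggested earlier in the thread, is not confirmed here.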