Do you have single cell reference for adipocytes? Can you identify adipocytes within mice muscle fibers?

mckellardw / scMuscle

The Cornell Single-Cell Muscle Project (scMuscle) aims to collect, analyze and provide to the research community skeletal muscle transcriptomic data


Do you have single cell reference for adipocytes? Can you identify adipocytes within mice muscle fibers? #11

Closed: guodudou2 closed this issue 7 months ago

guodudou2 commented 8 months ago

Hello,

I am very interested in your publication, but I didn't see an adipocyte population in your Fig. 1, only FAPs. I would like to estimate the proportions of adipocytes and myocytes from skeletal muscle bulk RNA-seq data, so I am wondering whether there is a reason that no adipocytes appear in your paper.

Thanks, Wendy

mckellardw commented 8 months ago

Wendy,

Thanks for your question! We did not find adipocytes in these data, only adipogenic progenitors. This could, however, be an artifact of the integration and clustering strategies we used. Have you checked to see if any adipocyte-related transcripts are detected in these data?

I should also note that adipocytes are likely depleted during the single-cell/single-nucleus dissociation steps, as they tend to float.
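One way to run the check suggested above is to count how many cells have non-zero raw counts for a few adipocyte genes. The following is only a minimal sketch: the marker list (Adipoq, Plin1, Lep, Cidec) is an assumption rather than anything taken from the scMuscle paper, and the toy counts are invented, standing in for rows pulled from the real count matrix.

```python
# Toy raw counts: gene -> counts per cell (values invented for illustration).
counts = {
    "Adipoq": [0, 0, 3, 0, 1, 0],
    "Plin1": [0, 0, 2, 0, 0, 0],
    "Lep": [0, 0, 0, 0, 0, 0],
    "Pdgfra": [5, 4, 0, 6, 1, 0],
}

# Assumed marker list; swap in whichever adipocyte genes you trust.
ADIPOCYTE_MARKERS = ["Adipoq", "Plin1", "Lep", "Cidec"]


def detection_summary(counts, markers):
    """Return {gene: (n_cells_detected, fraction_detected)}.

    A gene that is missing from the matrix maps to None.
    """
    summary = {}
    for gene in markers:
        values = counts.get(gene)
        if values is None:
            summary[gene] = None
            continue
        n_detected = sum(1 for v in values if v > 0)
        summary[gene] = (n_detected, n_detected / len(values))
    return summary


summary = detection_summary(counts, ADIPOCYTE_MARKERS)
for gene, result in summary.items():
    print(gene, result)
```

If the markers are detected mostly within one cluster, that cluster may be worth relabeling. If they are scattered or absent, the depletion during dissociation mentioned above is a plausible explanation.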

guodudou2 commented 8 months ago

Hi David,

Thanks for your response. That makes sense. One of the included muscle snRNA-seq datasets, "Dos Santos et al, Nat Comm, 2020", has an adipocyte population in the original publication, but that label seems to have been lost in your combined reference dataset. In the downloaded Seurat object from your paper, there are multiple cell type annotations such as "prefilter_IDs", "prefilter_factorIDs", "harmony_res.1.2_IDs", "harmony_factorIDs", "bbknn_res.1.2_IDs", "bbknn_factorIDs", "scanorama_res.1.2_IDs", and "scanorama_factorIDs". Which annotation should I use? Also, has the count matrix in the Seurat object been batch corrected? Is it ready to be used for bulk RNA-seq deconvolution?

Thank you very much and I look forward to hearing back from you.

Best, Wendy

mckellardw commented 8 months ago

Please note that the names of the clusters are determined by the person doing the analysis, so these could just be semantic differences in how we define "adipocytes". Smaller-scale analyses of some of these datasets have been better suited to looking at subtler differences between cell types (I would point you to the Millay Lab's work, as they have generated the highest quality mouse snRNAseq data for muscle: https://doi.org/10.1038/s41467-020-20063-w ).

I typically use the "harmony_factorIDs". If you would like to retain the differences between nuclei and cells, the scanorama or bbknn cell type IDs may be better suited to your analysis. The "prefilter" labels are just labels from clustering performed before quality filtering steps.

I would also encourage you to try clustering or relabeling the dataset on your own if you disagree with my cell type IDs! Feel free to add your analysis as a pull request.
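Before settling on one annotation column, it can help to tabulate the labels in two columns and see where they disagree. This is a rough sketch on per-cell metadata exported to plain Python dictionaries. The column names are the ones listed earlier in the thread, but the labels and rows are invented for illustration, and `label_counts` and `cross_tab` are hypothetical helper names.

```python
from collections import Counter

# Toy per-cell metadata; labels and rows are invented for illustration.
metadata = [
    {"harmony_factorIDs": "FAPs", "bbknn_factorIDs": "FAPs"},
    {"harmony_factorIDs": "FAPs", "bbknn_factorIDs": "FAPs"},
    {"harmony_factorIDs": "FAPs", "bbknn_factorIDs": "Tenocytes"},
    {"harmony_factorIDs": "Myonuclei", "bbknn_factorIDs": "Myonuclei"},
    {"harmony_factorIDs": "Myonuclei", "bbknn_factorIDs": "Myonuclei"},
    {"harmony_factorIDs": "Endothelial", "bbknn_factorIDs": "Endothelial"},
]


def label_counts(metadata, column):
    """Number of cells carrying each label in one annotation column."""
    return Counter(row[column] for row in metadata)


def cross_tab(metadata, col_a, col_b):
    """Number of cells for each (label in col_a, label in col_b) pair."""
    return Counter((row[col_a], row[col_b]) for row in metadata)


harmony = label_counts(metadata, "harmony_factorIDs")
pairs = cross_tab(metadata, "harmony_factorIDs", "bbknn_factorIDs")

# Cells whose label differs between the two annotation columns.
n_disagree = sum(n for (a, b), n in pairs.items() if a != b)

print(harmony)
print(pairs)
print("cells with differing labels:", n_disagree)
```

Cells that change label between columns are the ones worth inspecting by hand before using either column as a deconvolution reference.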

guodudou2 commented 8 months ago

Hi David,

Thanks for the detailed information. I am new to single-cell analysis and annotation, so I am not sure how to label cells at the moment, but I will definitely try it later. Currently, I am looking for high-quality skeletal muscle single-cell references with accurate labeling to deconvolute my bulk RNA-seq data, so I wanted to make sure I get the right dataset and cell types.

Thanks for pointing me to the Millay Lab's work, I will check it out. In your publication, batch correction seems important for pulling all the datasets together, so I am still wondering whether the count matrix in the downloaded Seurat object has been batch corrected. Is it ready to be used as a reference to deconvolute bulk RNA-seq data?

Thanks again and look forward to hearing back from you, Wendy

mckellardw commented 8 months ago

Sorry, I missed that question. No, the count matrix was not modified. All of the batch correction algorithms used in these analyses were applied in PCA space, not directly to the counts.

Yes, the scMuscle dataset should serve as an effective reference for deconvolution of bulk data. There are many effective algorithms/tools to accomplish this. I recommend you start with BayesPrism (https://github.com/Danko-Lab/BayesPrism), the tool we used in this study. The authors have put together excellent suggestions on how to accomplish this, so please read their README and vignette for more details.
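Because the counts are unmodified, they can be paired directly with a cell type label per cell to form a reference. The sketch below is not BayesPrism, which is an R package with its own input format described in its README. It only illustrates the general idea: raw counts are summed per cell type and scaled so each profile sums to 1. The genes, counts, and labels are invented toy values, and `reference_profiles` is a hypothetical helper.

```python
# Toy data: rows are cells, columns are genes (values invented for illustration).
genes = ["Acta1", "Pdgfra", "Pecam1"]
cell_counts = [
    [10, 0, 0],
    [8, 1, 0],
    [0, 6, 0],
    [0, 4, 1],
    [0, 0, 5],
]
cell_labels = ["Myonuclei", "Myonuclei", "FAPs", "FAPs", "Endothelial"]


def reference_profiles(cell_counts, cell_labels):
    """Sum raw counts per cell type, then scale each profile to sum to 1."""
    if len(cell_counts) != len(cell_labels):
        raise ValueError("need exactly one label per cell")
    totals = {}
    for row, label in zip(cell_counts, cell_labels):
        running = totals.setdefault(label, [0] * len(row))
        for j, value in enumerate(row):
            running[j] += value
    profiles = {}
    for label, summed in totals.items():
        depth = sum(summed)
        profiles[label] = [v / depth if depth else 0.0 for v in summed]
    return totals, profiles


totals, profiles = reference_profiles(cell_counts, cell_labels)
for label, profile in profiles.items():
    print(label, dict(zip(genes, profile)))
```

The same pairing of unmodified counts with one annotation column (for example "harmony_factorIDs") is what a deconvolution tool needs as its starting point. The exact function arguments should be taken from that tool's own documentation.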

guodudou2 commented 8 months ago

Thanks for your recommendation! I was interested in using BayesPrism too. When I searched for single-cell references, I found your paper and was very excited because you collected and reannotated mouse skeletal muscle sc- and sn- references and used the same deconvolution method that I plan to use. Most importantly, you have made your final Seurat object publicly available and are responding to questions in a timely manner on GitHub. I have contacted multiple other groups for their final Seurat objects with cell type annotations, but never heard back from them. So I really appreciate your effort and help!

I have another question about using the sc- / sn- references for deconvolution. As indicated in the metadata of your Seurat object, these cells / nuclei came from different mouse strains under different experimental conditions, such as age. Can we use all cells / nuclei of a given type for deconvolution? If not, how would you suggest selecting them? Thanks!

mckellardw commented 8 months ago

Great question! I did not test the differences in cell type deconvolution with or without certain mouse strains. I would be very interested in your results! Perhaps start with the cells/nuclei that most closely match your samples? Good luck!
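Selecting the cells/nuclei that most closely match the bulk samples amounts to filtering the per-cell metadata before building the reference. This is a minimal sketch under stated assumptions: the metadata column names `source`, `strain`, and `age_months` are hypothetical (check the actual column names in the Seurat object), the rows are invented, and `select_cells` is a hypothetical helper.

```python
# Toy per-cell metadata; every column except "harmony_factorIDs" is a
# hypothetical name, and all values are invented for illustration.
metadata = [
    {"harmony_factorIDs": "FAPs", "source": "nuclei", "strain": "strainA", "age_months": 3},
    {"harmony_factorIDs": "FAPs", "source": "cells", "strain": "strainA", "age_months": 3},
    {"harmony_factorIDs": "FAPs", "source": "nuclei", "strain": "strainA", "age_months": 24},
    {"harmony_factorIDs": "Myonuclei", "source": "nuclei", "strain": "strainA", "age_months": 3},
    {"harmony_factorIDs": "FAPs", "source": "nuclei", "strain": "strainB", "age_months": 3},
]


def select_cells(metadata, **criteria):
    """Return indices of cells whose metadata equals every given criterion."""
    return [
        i
        for i, row in enumerate(metadata)
        if all(row.get(key) == value for key, value in criteria.items())
    ]


# All cells of one type, regardless of strain, age, or preparation.
all_faps = select_cells(metadata, harmony_factorIDs="FAPs")

# Only the cells that match the (assumed) bulk sample conditions.
matched_faps = select_cells(
    metadata,
    harmony_factorIDs="FAPs",
    source="nuclei",
    strain="strainA",
    age_months=3,
)

print("all FAPs:", all_faps)
print("matched FAPs:", matched_faps)
```

Stricter matching leaves fewer cells per type, so it is worth checking how many remain before using the subset. Running the deconvolution with both the full and the matched reference is one way to see how much the choice matters.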

guodudou2 commented 7 months ago

It is good to know. Thanks, David!