digitalcytometry / ecotyper

EcoTyper is a machine learning framework for large-scale identification of cell states and cellular ecosystems from gene expression data.

Question: Discovery of ecotypes in bulk vs single-cell RNA #57

Closed ccruizm closed 1 year ago

ccruizm commented 1 year ago

Good day,

I really like the approach implemented in this tool. I have applied it to my own single-cell dataset and have some questions I would like your input on:

  1. Once I have discovered ecotypes, what is the best way to validate them in bulk RNA data? Is there a way to directly test the single-cell-derived ecotypes on bulk data? Can I run "Recovery of Cell States and Ecotypes in User-Provided Bulk Data" based on the single-cell data? Or do I need to run ecotype discovery independently on the bulk RNA data and then compare the results?
  2. I started ecotype discovery from single-cell data because it should be more robust than performing deconvolution on bulk RNA. Is this the correct way to do it, or should I start exploring ecotypes on bulk data? Which approach would you recommend for validating the ecotypes discovered by the tool?
  3. Can I perform or adapt ecotype recovery in Visium data using the ecotypes obtained from my single-cell data?

Thank you in advance for your help!

BALuca commented 1 year ago

Hi,

Apologies for the delay in replying. Please find the answers to your questions below:

  1. You can directly run the approach described in Tutorial 1 to recover, in a bulk dataset, the ecotypes you derived from single-cell data.
  2. Both approaches are valid. Cell states and ecotypes derived from single-cell data avoid the deconvolution step and provide single-cell-resolution insights into the tumor microenvironment (TME), but the low sample numbers in a typical scRNA-seq dataset and potential tissue dissociation artifacts could affect their robustness. Discovery based on bulk data, on the other hand, can leverage a larger number of samples, but it involves an additional, sensitive step: deconvolution. In either case, the best way to assess the validity of your results is to test whether they are recoverable in an external dataset.
  3. Yes, you can follow Tutorial 3 for that. Please let us know if you encounter any issues with this step.

Best, The EcoTyper team
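
Note on answer 2: the idea of testing whether an ecotype is "recoverable in an external dataset" can be illustrated with a small, generic sketch. This is not EcoTyper's own recovery statistic or API. It assumes you already have a samples x cell-states matrix of recovered state scores for the external cohort, and it simply asks whether the states assigned to one ecotype co-vary across samples more than random state sets of the same size. The function names, the synthetic data, and the choice of mean pairwise Pearson correlation are all assumptions made for illustration.

```python
import itertools
import numpy as np


def mean_pairwise_corr(state_scores, state_idx):
    """Mean Pearson correlation over all pairs of the selected state columns."""
    corr = np.corrcoef(state_scores[:, state_idx], rowvar=False)
    pairs = itertools.combinations(range(len(state_idx)), 2)
    return float(np.mean([corr[i, j] for i, j in pairs]))


def recovery_p_value(state_scores, state_idx, n_perm=1000, seed=0):
    """Empirical p-value: is the ecotype's co-occurrence higher than that of
    random state sets of the same size? (Hypothetical helper, not EcoTyper.)"""
    rng = np.random.default_rng(seed)
    observed = mean_pairwise_corr(state_scores, state_idx)
    n_states = state_scores.shape[1]
    null = np.array([
        mean_pairwise_corr(
            state_scores,
            rng.choice(n_states, size=len(state_idx), replace=False),
        )
        for _ in range(n_perm)
    ])
    # Add-one correction keeps the empirical p-value strictly above zero.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, float(p_value)


# Synthetic "external cohort": 60 samples, 12 cell states.
# States 0-2 share a latent factor, mimicking one co-occurring ecotype.
rng = np.random.default_rng(1)
latent = rng.normal(size=(60, 1))
state_scores = rng.normal(size=(60, 12))
state_scores[:, :3] += 2.0 * latent

observed, p_value = recovery_p_value(state_scores, [0, 1, 2])
print(f"mean pairwise correlation: {observed:.2f}, empirical p: {p_value:.3f}")
```

With real data, `state_scores` would be replaced by the state abundances recovered in the external dataset, and the index list by the states that make up the ecotype of interest. A low p-value only suggests that the co-occurrence pattern is preserved; it does not replace the recovery workflow described in the tutorials.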