poseidonchan / TAPE

Deep learning-based tissue compositions and cell-type-specific gene expression analysis with tissue-adaptive autoencoder (TAPE)
https://sctape.readthedocs.io/
GNU General Public License v3.0

Guidance on selecting parameters and choice of data scaling #9

Open dr-michael-haley opened 1 year ago

dr-michael-haley commented 1 year ago

Hi,

I've been using TAPE to deconvolve some cases from the TCGA dataset using reference single-cell sequencing data (100-200 cells per phenotype, from 6 different phenotypes). It works well and is very impressive. However, I find that the results sometimes vary considerably depending on some of the parameters.

Is it important that all the cell types expected to be in the bulk data are represented in the reference dataset?

In what situations should StandardScaler or MinMaxScaler be used?

Can you offer any advice on selecting a variance_threshold? In some examples you use 0.98, and the default is 0.8. Varying this parameter can strongly affect the proportions, sometimes even if it is only altered slightly (e.g. +/- 0.05).

I've generated d_priors for my analyses from references. Does including them always increase the accuracy?

poseidonchan commented 1 year ago

Thanks for using TAPE!

TAPE does not support deconvolution on datasets where some cell types are missing. Technically, it is impossible to guarantee that the deconvolution result is plausible if some cell types are missing from either the bulk or the reference dataset.

In my experience, the MinMax scaler works well on all the real data I have tested. The Standard scaler may work better on some simulated datasets.

For the variance threshold, the main difference between deep learning methods and traditional statistical methods is the feature selection step. Both Scaden and TAPE aim to remove only low-variance features (e.g. features that are always zero in a dataset) and to keep as much useful information for deconvolution as possible. So I think a threshold above 0.9 makes sense, which is why I set it to 0.98 in the example. I will reset the default parameter in the code.
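To make the two preprocessing choices above concrete, here is a minimal NumPy sketch. It is not TAPE's actual code: the function names are hypothetical, and it assumes that the variance threshold is read as the fraction of genes to keep after ranking by variance, and that MinMax scaling is applied per sample after a log2 transform. Both assumptions should be checked against the TAPE source before relying on them.

```python
import numpy as np


def filter_low_variance_genes(expr, keep_fraction=0.98):
    """Keep the `keep_fraction` most variable genes of a samples x genes matrix.

    Assumption: the threshold is a fraction of genes to keep, not an
    absolute variance cutoff.
    """
    variances = expr.var(axis=0)
    n_keep = max(1, int(round(expr.shape[1] * keep_fraction)))
    # Gene indices ordered from most to least variable (stable for ties).
    order = np.argsort(-variances, kind="stable")
    kept = np.sort(order[:n_keep])  # restore the original gene order
    return expr[:, kept], kept


def minmax_per_sample(expr):
    """Rescale each sample (row) to [0, 1]; constant rows become all zeros."""
    lo = expr.min(axis=1, keepdims=True)
    span = expr.max(axis=1, keepdims=True) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant rows
    return (expr - lo) / span


# Synthetic data for illustration only: 20 samples x 100 genes,
# where the first 10 genes are always zero.
rng = np.random.default_rng(0)
counts = rng.poisson(5.0, size=(20, 100)).astype(float)
counts[:, :10] = 0.0

log_expr = np.log2(counts + 1.0)
filtered, kept = filter_low_variance_genes(log_expr, keep_fraction=0.9)
scaled = minmax_per_sample(filtered)

print(filtered.shape)  # number of samples and number of genes kept
print(kept[:5])        # indices of the first few retained genes
```

Under this reading, a high threshold such as 0.98 discards only the least variable genes, whereas 0.8 drops a fifth of all genes, some of which may still carry signal for deconvolution. That would be one possible reason why small changes to the threshold shift the estimated proportions.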

Yanshuo