poseidonchan / TAPE

Deep learning-based tissue compositions and cell-type-specific gene expression analysis with tissue-adaptive autoencoder (TAPE)
https://sctape.readthedocs.io/
GNU General Public License v3.0

TranscriptLength is easier to get compared to transcript start and end #11

Open loganylchen opened 1 year ago

loganylchen commented 1 year ago

No need to calculate the gene length if the file already provides it.
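A minimal sketch of that idea, assuming the annotation is loaded as a pandas DataFrame. The `TranscriptLength` column name comes from the issue title; `TranscriptStart`, `TranscriptEnd`, and the `gene_lengths` helper are hypothetical names used only for illustration, not TAPE's actual API.

```python
import pandas as pd

def gene_lengths(annotation: pd.DataFrame) -> pd.Series:
    """Return per-gene lengths in base pairs, indexed like `annotation`."""
    if "TranscriptLength" in annotation.columns:
        # The file already provides the length, so use it directly.
        return annotation["TranscriptLength"].astype(float)
    # Fallback: approximate the length from the genomic span
    # (hypothetical column names).
    span = annotation["TranscriptEnd"] - annotation["TranscriptStart"]
    return span.abs().astype(float)

# Toy annotation tables (made-up values, for illustration only).
with_length = pd.DataFrame(
    {"TranscriptLength": [1500, 300]}, index=["geneA", "geneB"]
)
without_length = pd.DataFrame(
    {"TranscriptStart": [100, 2000], "TranscriptEnd": [1600, 2300]},
    index=["geneA", "geneB"],
)

print(gene_lengths(with_length))
print(gene_lengths(without_length))
```

Note that the start/end fallback measures the genomic span, which includes introns, so it can differ from the spliced transcript length when both are available.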

poseidonchan commented 1 year ago

Hi Logan,

Thanks for your commit, but I am considering removing these functions. Calculating the TPM or FPKM seems unnecessary?

Yanshuo

loganylchen commented 1 year ago

Hi Yanshuo,

I don't have any solid evidence that converting counts to TPM or FPKM is necessary. But I have seen that many other deconvolution tools accept raw counts as their preferred input format.

Considering how pseudo-bulk differentially expressed genes are identified in general single-cell analysis, I think it is reasonable to treat the count matrices from single-cell and bulk RNA-seq as following the same distribution. Maybe we could run some tests to see whether the predictions are better with or without the transformation.
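For such a comparison, the transformed inputs could be produced along these lines. This is only a sketch using the standard TPM and FPKM definitions, assuming a genes x samples count matrix and per-gene lengths in base pairs; the function names and toy values are hypothetical and not taken from TAPE.

```python
import pandas as pd

def counts_to_tpm(counts: pd.DataFrame, lengths_bp: pd.Series) -> pd.DataFrame:
    """counts: genes x samples; lengths_bp: per-gene length in bp."""
    # Reads per kilobase of gene length.
    rate = counts.div(lengths_bp / 1e3, axis=0)
    # Rescale each sample so its values sum to one million.
    return rate.div(rate.sum(axis=0), axis=1) * 1e6

def counts_to_fpkm(counts: pd.DataFrame, lengths_bp: pd.Series) -> pd.DataFrame:
    """Fragments per kilobase per million mapped fragments."""
    per_million = counts.sum(axis=0) / 1e6
    return counts.div(per_million, axis=1).div(lengths_bp / 1e3, axis=0)

# Toy data (made-up values, for illustration only).
counts = pd.DataFrame(
    {"s1": [100, 300], "s2": [200, 0]}, index=["geneA", "geneB"]
)
lengths_bp = pd.Series([1000, 2000], index=["geneA", "geneB"])

tpm = counts_to_tpm(counts, lengths_bp)
fpkm = counts_to_fpkm(counts, lengths_bp)
print(tpm)
print(fpkm)
```

The raw `counts`, `tpm`, and `fpkm` matrices could then each be fed to the same deconvolution run to see which input gives better predictions.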

Logan