doulijun777 commented 1 year ago

Thanks for this wonderful tool. I have one question: when I simulated pseudobulk data from scRNA-seq data, I checked the function "generate_simulated_data", which looks like this:

print('Normalizing raw single cell data with scanpy.pp.normalize_total')
sc_data = anndata.AnnData(sc_data)
sc.pp.normalize_total(sc_data, target_sum=1e4)

So, do we need to normalize here or not? I am a little confused.

Another question: for bulk data, do we need to convert to TPM or FPKM, or should we only use the count data?

Thank you.

doulijun777 commented 1 year ago

In the published code, this line was actually commented out, so I am a little confused.

poseidonchan commented 1 year ago

Hi doulijun777:

Thanks for trying TAPE. Actually, I am not very sure about the normalization problem right now. I probably should not have commented it out. For the bulk data, whatever the normalization is, please use the "count" argument in the function to ensure proper deconvolution performance.

Regards, Yanshuo
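
For reference, the call discussed in this thread, sc.pp.normalize_total(sc_data, target_sum=1e4), rescales each cell's counts so that they sum to 10,000. The following is a minimal sketch of that arithmetic in plain Python, without scanpy. The function name normalize_total_sketch and the toy matrix are hypothetical, and leaving all-zero cells as zeros is an assumption of this sketch. It does not show how TAPE itself uses the result.

```python
def normalize_total_sketch(counts, target_sum=1e4):
    """Scale each row (one cell) so that its counts sum to target_sum.

    Illustrative only: this is not scanpy's implementation, which also
    supports sparse matrices and further options.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            # Assumption of this sketch: an empty cell stays all zeros.
            normalized.append([0.0 for _ in cell])
            continue
        scale = target_sum / total
        normalized.append([value * scale for value in cell])
    return normalized


# Hypothetical toy data: 2 cells x 3 genes of raw counts.
toy_counts = [[10, 30, 60], [1, 1, 2]]
normalized = normalize_total_sketch(toy_counts)
for row in normalized:
    # After scaling, every non-empty cell has the same total.
    print(row, sum(row))
```

Because every cell ends up with the same total, the relative proportions of genes within a cell are preserved while differences in sequencing depth between cells are removed.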

poseidonchan / TAPE

data normalization #12
