Hi @eyzhao ,
Please accept my apologies, I never received a notification that this issue was opened and I'm not sure why.
Can you link exactly which GTEx file you downloaded? My starting dataframe for expression didn't come directly from GTEx, but from the UCSC Toil recompute, which likely involved different preprocessing and alignment steps than what GTEx used (at least I didn't see their process on that page).
In the data folder, the GTEx and TCGA data should have values that correspond to `np.log2(TPM + 1)`. I started with the data frames available on Xena, which use `log2(TPM + 0.001)`. I transformed those values back to TPM, confirmed they summed to ~1 million, then applied the `np.log2(TPM + 1)` transformation.
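For reference, the conversion described above can be sketched roughly as follows. This is only an illustration, not the package's actual preprocessing code: the function name `xena_to_log2_tpm_plus_1`, the toy matrix, and the genes-by-samples layout are all assumptions made for the example.

```python
import numpy as np


def xena_to_log2_tpm_plus_1(xena_values, pseudocount=0.001):
    """Convert log2(TPM + pseudocount) values to log2(TPM + 1).

    Hypothetical helper for illustration. Returns the re-transformed
    values together with the recovered TPM matrix.
    """
    xena_values = np.asarray(xena_values, dtype=float)
    # Undo the Xena-style transform: TPM = 2**x - pseudocount
    tpm = np.power(2.0, xena_values) - pseudocount
    # Floating-point round-off can leave tiny negative values; clamp to zero
    tpm = np.clip(tpm, 0.0, None)
    return np.log2(tpm + 1.0), tpm


# Toy matrix (rows = genes, columns = samples); each sample sums to 1e6 TPM.
tpm_true = np.array([
    [500000.0, 0.0],
    [300000.0, 250000.0],
    [200000.0, 750000.0],
])
xena_like = np.log2(tpm_true + 0.001)

log2_tpm1, tpm = xena_to_log2_tpm_plus_1(xena_like)

# Sanity check: recovered TPM should sum to roughly one million per sample.
column_sums = tpm.sum(axis=0)
print(column_sums)
```

If the per-sample sums are far from one million after the back-transform, the input values are probably not in `log2(TPM + 0.001)` units to begin with.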
Please let me know if that does not answer your questions and in the future feel free to email me directly if you do not receive a sufficiently prompt response.
Thank you for developing this useful package. I had two questions
But in your gtex matrix, it is
Perhaps I am misunderstanding something, or the units are not TPM?
Thanks for your time!