Open rashindrie opened 2 years ago
I have another question: does “TCGA.RNA.Rda” contain FPKM or FPKM-UQ values, or something else?
Thanks, Rashindrie
Hi @rashindrie, nice to hear about your progress!
For the input files, you will need a normalized expression matrix, where the columns are sample IDs and the rows are gene names, and a meta table in which each row gives a sample ID and its corresponding cancer type. The sample IDs in the two files must match.
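As a rough illustration of that layout, here is a minimal Python sketch that merges per-sample files into a genes-by-samples matrix and checks it against a meta table. It assumes each file is tab-separated with two columns (gene ID, value) and no header line, which may not match your files. The sample IDs, gene names, cancer type label, and function names (`read_sample`, `build_matrix`, `check_meta`) are all hypothetical placeholders, not part of this repository.

```python
import csv
import io


def read_sample(handle):
    """Read one two-column, tab-separated file: gene ID, expression value.

    Assumption: no header line; adjust if your files have one.
    """
    values = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or row[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        values[row[0]] = float(row[1])
    return values


def build_matrix(samples):
    """samples: dict of sample_id -> {gene: value}.

    Returns (genes, sample_ids, rows), where rows[i][j] is the value of
    genes[i] in sample_ids[j] (rows are genes, columns are samples).
    """
    sample_ids = sorted(samples)
    # Keep only genes present in every sample so the matrix has no gaps.
    genes = sorted(set.intersection(*(set(v) for v in samples.values())))
    rows = [[samples[s][g] for s in sample_ids] for g in genes]
    return genes, sample_ids, rows


def check_meta(sample_ids, meta):
    """meta: dict of sample_id -> cancer type. Raise if the IDs do not match."""
    missing = set(sample_ids) - set(meta)
    extra = set(meta) - set(sample_ids)
    if missing or extra:
        raise ValueError(
            f"sample ID mismatch: missing={sorted(missing)}, extra={sorted(extra)}"
        )


# Tiny in-memory stand-ins for two hypothetical per-sample files.
files = {
    "sample_A": "GENE1\t10.0\nGENE2\t0.0\n",
    "sample_B": "GENE1\t12.5\nGENE2\t3.0\n",
}
samples = {sid: read_sample(io.StringIO(text)) for sid, text in files.items()}
genes, sample_ids, matrix = build_matrix(samples)

# Hypothetical meta table: one entry per sample with its cancer type.
meta = {"sample_A": "BRCA", "sample_B": "BRCA"}
check_meta(sample_ids, meta)
```

With real data you would open each file on disk instead of using `io.StringIO`, and write the matrix and meta table out in whatever format the pipeline expects.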
Regarding the normalization, different procedures were applied to the TCGA data, but it was similar to ComBat adjustment for batch effects. I would recommend applying similar procedures to your data and then doing a sanity check to see whether the expression profile of your genes follows the same behavior as in TCGA. For example, does the gene expression for the cancer type that you are adding correlate with the TCGA samples of the same cancer type?
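One possible way to run that sanity check is sketched below in Python: compute a per-gene mean profile for your samples and for the TCGA samples of the same cancer type, then correlate the two over the shared genes. The log2(x + 1) transform, the use of Pearson correlation, the toy values, and the function names are assumptions made for illustration; this is not the procedure that was applied to the TCGA data.

```python
import math


def mean_log_profile(genes, matrix):
    """Per-gene mean of log2(x + 1) across samples; returns {gene: mean}."""
    return {
        g: sum(math.log2(v + 1.0) for v in row) / len(row)
        for g, row in zip(genes, matrix)
    }


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def profile_correlation(new_profile, tcga_profile):
    """Correlate two {gene: mean} profiles over the genes they share."""
    shared = sorted(set(new_profile) & set(tcga_profile))
    return pearson(
        [new_profile[g] for g in shared],
        [tcga_profile[g] for g in shared],
    )


# Toy matrices (rows are genes, columns are samples); values are made up.
genes = ["G1", "G2", "G3", "G4"]
new_matrix = [[1.0, 3.0], [15.0, 15.0], [63.0, 63.0], [0.0, 0.0]]
tcga_matrix = [[3.0, 3.0], [31.0, 31.0], [127.0, 127.0], [1.0, 1.0]]

r = profile_correlation(
    mean_log_profile(genes, new_matrix),
    mean_log_profile(genes, tcga_matrix),
)
print(f"correlation between new samples and TCGA profile: {r:.3f}")
```

A low correlation for the matching cancer type would suggest that the normalization of the new data differs too much from the TCGA data.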
Hi,
Many thanks for your work. I was able to successfully reproduce your original work, and now I am interested in applying it to my own dataset. I have extracted 109 count files (XXX.FPKM-UQ.txt), and each file has information in the following format.
expression matrix from the count files?
Thanks, Rashindrie