mourisl / T1K

T1K is a versatile method to genotype highly polymorphic genes (e.g. KIR, HLA) with bulk or single-cell RNA-seq, WGS or WES data.
MIT License
52 stars, 8 forks

Can FPK be converted to FPKM? #18

Closed: lucy-tian closed this issue 1 year ago

lucy-tian commented 1 year ago

Hi, I noticed in your paper that the abundance column is normalized to Fragments Per Kilobase (FPK). However, to compare RNA expression between samples, we also need to normalize by the total read count to get FPKM. Is T1K able to convert FPK to FPKM for expression quantification? Thank you!

mourisl commented 1 year ago

FPKM depends on the total read count of your data set: FPKM = FPK / (read_count / 10^6). The read_count is not stored in T1K. We leave the abundance estimate at the FPK level so that you can plug in the total read count yourself, either by counting the primary alignments in the BAM file or by counting the reads in the FASTQ files (for standard FASTQ, the number of lines divided by 4). Some other applications may want to calculate FPKM with respect to only the reads aligned to the KIR/HLA region. The FPK value is therefore more flexible.
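
A minimal sketch of the conversion described above, for illustration only. It is not part of T1K: the function names `fpk_to_fpkm` and `count_fastq_records` are hypothetical, and the FASTQ counter assumes the usual four lines per record. For paired-end data you would probably count fragments (read pairs) rather than individual reads.

```python
def fpk_to_fpkm(fpk: float, read_count: int) -> float:
    """Convert FPK to FPKM using FPKM = FPK / (read_count / 10^6).

    `read_count` is whatever denominator you choose: all reads in the
    sample, or only the reads aligned to the KIR/HLA region.
    """
    if read_count <= 0:
        raise ValueError("read_count must be positive")
    return fpk / (read_count / 1_000_000)


def count_fastq_records(lines) -> int:
    """Count records in an iterable of FASTQ lines.

    Assumes the standard 4-line-per-record layout (no wrapped sequences).
    """
    n_lines = sum(1 for _ in lines)
    return n_lines // 4


# Example: an allele with FPK = 250 in a sample of 20 million reads.
example_fpkm = fpk_to_fpkm(250.0, 20_000_000)  # 12.5
```

To get `read_count` from an uncompressed FASTQ file, something like `with open(path) as fh: n = count_fastq_records(fh)` should work. The `abundance` value from the T1K output would be passed in as `fpk`.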

lucy-tian commented 1 year ago

Thank you!!