mourisl / T1K

T1K is a versatile method to genotype highly polymorphic genes (e.g. KIR, HLA) with bulk or single-cell RNA-seq, WGS or WES data.
MIT License
42 stars 7 forks

Can FPK be converted to FPKM? #18

Closed. lucy-tian closed this issue 10 months ago

lucy-tian commented 10 months ago

Hi, I noticed in your paper that the abundance column is normalized to fragments per kilobase (FPK). However, to compare RNA expression between samples, we need to normalize by read count to get FPKM. Can T1K convert FPK to FPKM for expression quantification? Thank you!

mourisl commented 10 months ago

FPKM depends on the total read count of your data set: FPKM = FPK / (read_count / 10^6). T1K does not store read_count, so we leave the abundance estimate at the FPK level and you can plug in the total read count yourself, for example by counting the primary alignments in the BAM file or the records in the FASTQ files (four lines per read). Some other applications may want to calculate FPKM with respect to only the reads aligned to the KIR/HLA region, so the FPK value is more flexible.
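For anyone landing here later, the conversion can be sketched in a few lines of Python. This is only an illustration, not part of T1K: the function names `fpk_to_fpkm` and `count_fastq_records` are hypothetical, the numbers in the example are made up, and the FASTQ counter assumes standard four-line records (no wrapped sequence lines). Reading the FPK values out of the T1K output is left to you.

```python
import gzip


def fpk_to_fpkm(fpk: float, total_fragments: int) -> float:
    """Apply FPKM = FPK / (total_fragments / 10^6)."""
    if total_fragments <= 0:
        raise ValueError("total_fragments must be positive")
    return fpk / (total_fragments / 1_000_000)


def count_fastq_records(path: str) -> int:
    """Count records in a plain or gzipped FASTQ file.

    Assumes exactly four lines per record, so the record count is
    the line count divided by four.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


# Made-up example: an allele at 250 FPK in a library of 20 million fragments.
print(fpk_to_fpkm(250.0, 20_000_000))  # 12.5
```

For paired-end data, count one mate file only so that each fragment is counted once. To normalize against only the reads assigned to the KIR/HLA region, pass that smaller count as `total_fragments` instead.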

lucy-tian commented 10 months ago

Thank you!!