Closed excel9 closed 3 years ago
Dear Shayoni,
This happens because one of your inputs either contains NA values or is not in numeric format.
Please check the following before you run fpkm():
1- counts, featureLength and meanFragmentLength are free of NA.
2- counts should be a numeric matrix; featureLength and meanFragmentLength should be numeric vectors.
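The two checks above can be sketched in base R as follows. This is only a minimal illustration: check_fpkm_inputs is a hypothetical helper, not a function from countToFPKM, and the toy values are made up.

```r
# Hypothetical helper (not part of countToFPKM): collect problems with the
# three inputs before calling fpkm().
check_fpkm_inputs <- function(counts, featureLength, meanFragmentLength) {
  problems <- character(0)
  # Type checks: counts must be a numeric matrix, the other two numeric vectors.
  if (!is.matrix(counts) || !is.numeric(counts))
    problems <- c(problems, "counts is not a numeric matrix")
  if (!is.numeric(featureLength))
    problems <- c(problems, "featureLength is not numeric")
  if (!is.numeric(meanFragmentLength))
    problems <- c(problems, "meanFragmentLength is not numeric")
  # NA checks on each input.
  if (anyNA(counts))
    problems <- c(problems, "counts contains NA")
  if (anyNA(featureLength))
    problems <- c(problems, "featureLength contains NA")
  if (anyNA(meanFragmentLength))
    problems <- c(problems, "meanFragmentLength contains NA")
  problems
}

# Toy inputs (made-up values): one NA count and a character vector,
# as can happen after a faulty import.
counts <- matrix(c(10, 20, NA, 40), nrow = 2,
                 dimnames = list(c("geneA", "geneB"), c("s1", "s2")))
featureLength <- c(1500, 900)
meanFragmentLength <- c("200", "180")

problems <- check_fpkm_inputs(counts, featureLength, meanFragmentLength)
print(problems)
```

An empty result means the inputs pass both checks; otherwise each entry names the input to fix.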
Thank you Ahmed! It worked after I removed the "NA" columns. I have a quick question: why did I get FPKM values for only 17462 genes, 74 fewer than the 17536 genes in the annotation file?
Thank you so much for your help!
Hi there
For accurate quantification of FPKM from RNA-Seq data, the read counts need to be normalised by the feature's effective length (Lee et al. 2011). The effective length is computed by subtracting the meanFragmentLength from the feature length. Features shorter than the meanFragmentLength are therefore dropped automatically. In other words, you cannot calculate the FPKM for features shorter than the meanFragmentLength, and that is why your fpkm_matrix has fewer rows than counts.
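To see which features would be affected, here is a rough base R sketch of that rule. It assumes the effective length is simply featureLength minus meanFragmentLength, as described above; the exact formula and cutoff used inside fpkm() may differ, and the gene names and values are made up.

```r
# Toy inputs (hypothetical values, for illustration only).
featureLength <- c(geneA = 1500, geneB = 120, geneC = 900)
meanFragmentLength <- c(sample1 = 200, sample2 = 180)

# Effective length for every feature (rows) and sample (columns):
# feature length minus that sample's mean fragment length.
effLength <- outer(featureLength, meanFragmentLength, FUN = "-")

# Assumption: a feature is kept only if its effective length is
# positive in every sample.
keep <- apply(effLength, 1, function(x) all(x > 0))

# Names of the features that would be dropped.
dropped <- names(featureLength)[!keep]
print(dropped)
```

Comparing length(dropped) with the difference between the number of rows in counts and in fpkm_matrix is a quick way to confirm that short features explain the gap.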
To get stats about the genes that are dropped because featureLength < meanFragmentLength, please try the latest version from GitHub:
if (!require(devtools)) install.packages("devtools")
devtools::install_github("AAlhendi1707/countToFPKM", build_vignettes = TRUE)
Hope it helps! A
Hi AAlhendi1707,
I created the gene.annotations file (mouse Ensembl mm10), filtered and re-ordered to match the order in the counts matrix, and then ran the code below for FPKM. Unfortunately, the fpkm_matrix output was NA throughout.
library(countToFPKM)

# Import the read count matrix data into R.
counts <- read.csv(file = 'normalized_count.csv', header = TRUE)
rownames(counts) <- counts[, 1]
counts <- counts[, -1]

# Import feature annotations and assign feature length into a numeric vector.
gene.annotations <- read.csv("gene.annotations.csv", header = TRUE)
featureLength <- gene.annotations$length

# Import sample metrics and assign mean fragment length into a numeric vector.
samples.metrics <- read.delim("RNAseq.samples.metrics.txt", sep = "\t", header = TRUE)
meanFragmentLength <- samples.metrics$meanFragmentLength

# Return FPKM into a numeric matrix.
fpkm_matrix <- fpkm(as.matrix(counts), featureLength, meanFragmentLength)
I am also uploading the counts and gene.annotations files (originally .csv, but uploaded here as .txt, which GitHub supports) along with samples.metrics.txt: gene.annotations.txt, normalized_count.txt, RNAseq.samples.metrics.txt
Please help me out in this!
Thanks, excel9