Open gk-bioin4m8x opened 7 years ago
Hello, It would be very helpful if anyone has done pathway analysis with GAGE and Pathview using Sleuth results.
Thanks!
ENST-to-Reactome mappings (transcript level) are rather sparse (usually only the canonical isoform is annotated) so it may actually be hurting you to operate at the tx level. We found this out the direct way by running both. If your gene sets are very sparse, most all methods are going to suffer due to lack of structure.
--t
On Mon, Jan 23, 2017 at 3:44 AM, GK notifications@github.com wrote:
Hello, It would be very helpful if anyone has done pathway analysis with GAGE and Pathview using Sleuth results.
Thanks!
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/pachterlab/sleuth/issues/91#issuecomment-274467601, or mute the thread https://github.com/notifications/unsubscribe-auth/AAARIp5MsRVYfDYrV5KJFBO2GOdVgSNfks5rVJKQgaJpZM4LYI1s .
Hi, Any updates on it or do anyone have a workflow how to do Pathway analysis after sleuth or sleuth_live?
I did the below steps to get transcript mapping to gene ids which were based on this tutorial
diamond makedb --in nr.faa -d nr -p 8
diamond blastx -d nr -q Trinity.fasta -o nr.blastx.out -p 8 -k 1 --outfmt 6
echo -e "target_id\tens_gene" > t2g.txt
cut -f 1-2 nr.blastx.out >> t2g.txt
The below R code was based on:
#!/usr/bin/env Rscript
library("sleuth")
sample_id <- dir("kallisto")
sample_id
kal_dirs <- sapply(sample_id, function(id) file.path("kallisto", id))
kal_dirs
s2c <- read.table("!{exp_file}", header = TRUE, stringsAsFactors=FALSE)
s2c <- dplyr::select(s2c, sample = ids, Tissue_type, condition)
s2c <- s2c[order(s2c$sample), ]
s2c <- dplyr::mutate(s2c, path = kal_dirs)
print(s2c)
t2g <- read.table("!{t2g_file}", header = TRUE, stringsAsFactors=FALSE)
t2g
# Load the kallisto processed data into the object
so <- sleuth_prep(s2c, ~ condition, target_mapping = t2g, extra_bootstrap_summary = TRUE)
# Estimate parameters for the sleuth response error measurement (full) model
so <- sleuth_fit(so)
#Next, we fit the reduced model. In this case, the reduced model is the intercept-only model
so <- sleuth_fit(so, ~1, 'reduced')
print("Done sletu_fit")
#and finally perform the test:
so <- sleuth_lrt(so, 'reduced', 'full')
print("DONE sleut_lrt")
models(so)
save(so, file=paste("sleuth_object.so"))
Can I use sleuth_live or do I have to add in the above code the following line?
so <- sleuth_wt(so, 'conditionscramble')
gene_table <- sleuth_gene_table(so, test = "conditionscramble", test_type = "wt")
write.table(gene_table, paste("gene_table_results.txt"), sep="\t")
How do I know which test should I use?
Thank you in advance,
Michal
Hello,
Did anyone try pathway analysis with GAGE and Pathview using Sleuth results (
results_table.csv
where column names are: target_id, test_stat, pval, qval, rss, sigma_sq, tech_var, mean_obs, var_obs, sigma_sq_max, smooth_sigma_sq, final_sigma_sq, degree_free, ens_gene, ext_gene)? I am following this tutorial https://www.r-bloggers.com/tutorial-rna-seq-differential-expression-pathway-analysis-with-sailfish-deseq2-gage-and-pathview/ but I have three samples with replicates (and more in future). I am able to get the pathways but not clear to which sample they belong to and also unable to get the figures showing up and down-regulated genes in the pathway. I think I need to troubleshoot the input for many samples with replicates. My code:# Retrieving rows with qval < 0.05 from results_table.
>lrt_results.e <- dplyr::filter(results_table, qval < 0.05)
# Adding entrez column to Sleuth results for pathway analysis (where Rrr= species (using hypothetical name here)).
>lrt_results.e$entrez <- mapIds(org.Rrr.eg.db, keys=lrt_results.e$target_id, column="ENTREZID", keytype="ENSEMBLTRANS", multiVals="first")
# Retrieving rows where value of entrez is not NA
>lrt_results.e.noNA <- subset(lrt_results.e, !is.na(entrez))
>qval.e <- lrt_results.e.noNA$qval
>names(qval.e) <- lrt_results.e.noNA$entrez
# KEGG pathway gene sets for Rrr
>kegg_Rrr <- kegg.gsets("Rrr")
# sigmet.idx is an index of numbers of signaling and metabolic pathways in kegg.gsets.
>kegg_Rrr_sigmet <- kegg_Rrr$kg.sets[kegg_Rrr$sigmet.idx]
# Using gage package'
>keggres.e <- gage(qval.e, gsets=kegg_Rrr_sigmet, same.dir=TRUE)>keggrespathways <- data.frame(id=rownames(keggres.e$greater), keggres.e$greater) %>% tbl_df() %>% filter(row_number()<=3) %>% .$id %>% as.character()
>keggrespathways
>keggresids <- substr(keggrespathways, start=1, stop=8)
>keggresids
>plot_pathway <- function(pid) pathview(gene.data=qval.e, pathway.id=pid, species="Rrr", new.signature=FALSE)
>pmp <- sapply(keggresids, function(pid) pathview(gene.data=qval.e, pathway.id=pid, species="Rrr"))
>pmp. <- sapply(keggresids, function(pid) pathview(gene.data=lrt_results.e.noNA, pathway.id=pid, species="Rrr"))
Kindly guide regarding the input.
Thanks!