Open cagenet opened 3 years ago
Hi Carine! Quick note on the way you call the function - it should be enough to use the gtl object, as the res/res_enrich/annotation are all picked from that. Re: the missing values problem - does the error persist if you
Hi Federico, Thank you so much for your quick reply. First, I didn't catch what you mean by "the way I call the function ". I follow your user guide (step 5.1). For the missing values problem : I think that there is two issues : the first one is that for the pig specie there is no mapping between Ensembl and symbol in org.Ss.eg.db (see https://support.bioconductor.org/p/9137574/)... So I used bovine species (I have 2 datasets I applied a treatment in bovine and in pig and want to compare the results obtained in the two species). Now with the bovine species, the enhance_table work with symbol but not with ENSEMBL ids. I do have a conversion handled internally in the annotation object (a two table file with gene_id and gene_symbol (call HGNC in my code) and I have in the RowData in dds object an annotation with Ensemblid and gene symbol (call HGNC in my code see below). What I don't understand is that I have an conversion table in the annotation object and also in the RowData (dds).
Thank you for your help. Cheers
Here is my code :
library(GeneTonic) load(file="DESEQ2_dds1_Bov.rda")#dds2
dds2 class: DESeqDataSet dim: 15779 20 metadata(1): version assays(4): counts mu H cooks rownames(15779): ENSBTAG00000000005 ENSBTAG00000000009 ... ENSBTAG00000055312 ENSBTAG00000055315 rowData names(60): ENSEMBLID HGNC ... deviance maxCooks colnames(20): BE1_CTRL BE2_CTRL ... BE14_BMP15 BE16_BMP15 colData names(3): Cow Treatment sizeFactor
load(file="resBta_04juin.rda")
summary(resBta) out of 15779 with nonzero total read count adjusted p-value < 0.05 LFC > 0 (up) : 2323, 15% LFC < 0 (down) : 2460, 16% outliers [1] : 0, 0% low counts [2] : 306, 1.9% (mean count < 2) [1] see 'cooksCutoff' argument of ?results [2] see 'independentFiltering' argument of ?results
resannot_Bov<-read.table(file="annot_bov.txt", header = TRUE, sep="\t")
head(resannot_Bov) gene_id gene_name 1 ENSBTAG00000000005 GRK3 2 ENSBTAG00000000008 KCNJ1 3 ENSBTAG00000000009 FOXF1 4 ENSBTAG00000000010 UBL7 5 ENSBTAG00000000011 TDH 6 ENSBTAG00000000012 TTC33
de_symbol_bov <- deseqresult2df(resBta, FDR = 0.05)$SYMBOL bg_bov_sym <- rowData(dds2)$HGNC[rowSums(counts(dds2)) > 0]
resBta$ensemblid<-(row.names(resBta)) bg_ensembl_bov <- rowData(dds2)$ENSEMBLID[rowSums(counts(dds2)) > 0] de_ensembl_bov <- deseqresult2df(resBta, FDR = 0.05)$ensemblid library("topGO") topgoBP_bov <- pcaExplorer::topGOtable(de_symbol_bov, bg_bov_sym, ontology = "BP", mapping = "org.Bt.eg.db", geneID = "symbol", topTablerows = 500)
topgoBPensembl_bov <- pcaExplorer::topGOtable(de_ensembl_bov, bg_ensembl_bov, ontology = "BP", mapping = "org.Bt.eg.db", geneID = "ENSEMBL", topTablerows = 500) res_enrich_bov<-shake_topGOtableResult(topgoBP_bov) res_enrich_bov_ensembl<-shake_topGOtableResult(topgoBPensembl_bov) res_enrich_bov <- get_aggrscores(res_enrich = res_enrich_bov, res_de = resBta, annotation_obj = resannot_Bov, aggrfun = mean) res_enrich_bov_ensembl<-get_aggrscores(res_enrich = res_enrich_bov_ensembl, res_de = resBta,annotation_obj = resannot_Bov,aggrfun = mean) gtl_bov <- GeneTonic_list( dds = dds2, res_de = resBta, res_enrich = res_enrich_bov, annotation_obj = resannot_Bov )
gtl_bov_ensembl<-GeneTonic_list(dds=dds2, res_de = resBta, res_enrich = res_enrich_bov_ensembl, annotation_obj = resannot_Bov) p <- enhance_table(res_enrich_bov, resBta, n_gs = 30, gtl<-gtl_bov, annotation_obj = resannot_Bov, chars_limit = 60) p1<-enhance_table(res_enrich_bov_ensembl, resBta, n_gs=30, gtl = gtl_bov_ensembl, annotation_obj = resannot_Bov, chars_limit = 60) Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : les noms de lignes contiennent des valeurs manquantes
Hi Federico, Thank you so much for your quick reply. First, I didn't catch what you mean by "the way I call the function ". I follow your user guide (step 5.1).
Hi Carine, in theory you can call
enhance_table(gtl = gtl_bov_ensembl, n_gs=30, chars_limit = 60)
as gtl should have all it needs 😉
For the missing values problem : I think that there is two issues : the first one is that for the pig specie there is no mapping between Ensembl and symbol in org.Ss.eg.db (see https://support.bioconductor.org/p/9137574/)... So I used bovine species (I have 2 datasets I applied a treatment in bovine and in pig and want to compare the results obtained in the two species). Now with the bovine species, the enhance_table work with symbol but not with ENSEMBL ids. I do have a conversion handled internally in the annotation object (a two table file with gene_id and gene_symbol (call HGNC in my code) and I have in the RowData in dds object an annotation with Ensemblid and gene symbol (call HGNC in my code see below). What I don't understand is that I have an conversion table in the annotation object and also in the RowData (dds).
Thank you for your help. Cheers
Here is my code :
library(GeneTonic) load(file="DESEQ2_dds1_Bov.rda")#dds2
dds2 class: DESeqDataSet dim: 15779 20 metadata(1): version assays(4): counts mu H cooks rownames(15779): ENSBTAG00000000005 ENSBTAG00000000009 ... ENSBTAG00000055312 ENSBTAG00000055315 rowData names(60): ENSEMBLID HGNC ... deviance maxCooks colnames(20): BE1_CTRL BE2_CTRL ... BE14_BMP15 BE16_BMP15 colData names(3): Cow Treatment sizeFactor
load(file="resBta_04juin.rda")
summary(resBta) out of 15779 with nonzero total read count adjusted p-value < 0.05 LFC > 0 (up) : 2323, 15% LFC < 0 (down) : 2460, 16% outliers [1] : 0, 0% low counts [2] : 306, 1.9% (mean count < 2) [1] see 'cooksCutoff' argument of ?results [2] see 'independentFiltering' argument of ?results
resannot_Bov<-read.table(file="annot_bov.txt", header = TRUE, sep="\t")
head(resannot_Bov) gene_id gene_name 1 ENSBTAG00000000005 GRK3 2 ENSBTAG00000000008 KCNJ1 3 ENSBTAG00000000009 FOXF1 4 ENSBTAG00000000010 UBL7 5 ENSBTAG00000000011 TDH 6 ENSBTAG00000000012 TTC33
de_symbol_bov <- deseqresult2df(resBta, FDR = 0.05)$SYMBOL bg_bov_sym <- rowData(dds2)$HGNC[rowSums(counts(dds2)) > 0]
resBta$ensemblid<-(row.names(resBta)) bg_ensembl_bov <- rowData(dds2)$ENSEMBLID[rowSums(counts(dds2)) > 0] de_ensembl_bov <- deseqresult2df(resBta, FDR = 0.05)$ensemblid library("topGO") topgoBP_bov <- pcaExplorer::topGOtable(de_symbol_bov, bg_bov_sym, ontology = "BP", mapping = "org.Bt.eg.db", geneID = "symbol", topTablerows = 500)
Ok, now that you elaborate a little more on this (and since there's no ensembl->symbol conversion) it can be that the gs_genes
column in the res_enrich
...
topgoBPensembl_bov <- pcaExplorer::topGOtable(de_ensembl_bov, bg_ensembl_bov, ontology = "BP", mapping = "org.Bt.eg.db", geneID = "ENSEMBL", topTablerows = 500) res_enrich_bov<-shake_topGOtableResult(topgoBP_bov) res_enrich_bov_ensembl<-shake_topGOtableResult(topgoBPensembl_bov)
... here ☝️ does not have the gene symbols "as expected".
I am not so familiar with species other than human, mouse & fly, as these are the almost entirety of the catalogue our cooperation partners work with.
If you want to share with me the gtl
object, I can have a look (will of course be handled confidentially).
Federico
res_enrich_bov <- get_aggrscores(res_enrich = res_enrich_bov, res_de = resBta, annotation_obj = resannot_Bov, aggrfun = mean) res_enrich_bov_ensembl<-get_aggrscores(res_enrich = res_enrich_bov_ensembl, res_de = resBta,annotation_obj = resannot_Bov,aggrfun = mean) gtl_bov <- GeneTonic_list( dds = dds2, res_de = resBta, res_enrich = res_enrich_bov, annotation_obj = resannot_Bov )
gtl_bov_ensembl<-GeneTonic_list(dds=dds2, res_de = resBta, res_enrich = res_enrich_bov_ensembl, annotation_obj = resannot_Bov) p <- enhance_table(res_enrich_bov, resBta, n_gs = 30, gtl<-gtl_bov, annotation_obj = resannot_Bov, chars_limit = 60) p1<-enhance_table(res_enrich_bov_ensembl, resBta, n_gs=30, gtl = gtl_bov_ensembl, annotation_obj = resannot_Bov, chars_limit = 60) Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : les noms de lignes contiennent des valeurs manquantes
Hi ;-) First issue: I guess that it was something like that BUT I still obtained the same error ... p<-enhance_table(gtl=gtl_bov_ensembl, n_gs = 30, chars_limit = 60) Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : les noms de lignes contiennent des valeurs manquantes
Second issue : Thank you so much for your help ... Of course I can send you my gtl object ... Will try to import and give you access to gtf object in my repository (GitHub newbie and never used it !) Thank you
Sure, no prob. You can reach me at the email address in the maintainer field of the package (trying to make bots' work to scrape my email a little harder, sorry for the extra step 😛 )
Hi Federico, Thank you for this really nice and helpful package. It is really useful. I used it for agronomical species (pig and bovine). I encountered a problem when trying to use enhance_table function. I have the following error :