najibveto closed this issue 2 years ago
Did you check the objects you created and passed as input to the makeOrgPackage function? makeOrgPackage does not accept data frames that contain duplicated rows. Perhaps it worked for a different species because that species' tables did not have duplicated rows?
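A quick way to run that check is sketched below in base R. The two toy data frames are hypothetical stand-ins for the real inputs; the idea is to count fully duplicated rows in every table passed to makeOrgPackage, not just one or two of them.

```r
# Toy stand-ins for the tables passed to makeOrgPackage (hypothetical data).
gene_info <- data.frame(GID = c("g1", "g2", "g2"),
                        GENENAME = c("a", "b", "b"))
gene2go <- data.frame(GID = c("g1", "g2"),
                      GO = c("GO:0008150", "GO:0003674"),
                      EVIDENCE = "IEA")

# Collect every table that will be handed to makeOrgPackage.
tables <- list(gene_info = gene_info, go = gene2go)

# Count rows that are exact copies of an earlier row in the same table.
dup_counts <- sapply(tables, function(df) sum(duplicated(df)))
print(dup_counts)

# Drop the duplicated rows; unique() on a data frame keeps the first copy.
tables_unique <- lapply(tables, unique)
```

Any table with a non-zero count would need deduplicating before the call.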
Thanks for your reply.
I checked the gene2go and gene2ko tables and didn't find any duplicates, as you can see here:
Previously, I used the package to build the database with tables in a similar form:
and it worked fine. Instead of using the gene name, I used the transcript name in the GID column.
Hi @najibveto,
Please format your code using the Markdown formatting in GitHub issues. As written, your question is not easy to follow.
You'll also have to provide fatheadminnow-annotation.tsv so that we can reproduce the problem.
Sorry for my late reply. To solve the problem, I used the transcript ID instead of the gene name, and it works fine now. Thank you for your help.
Hello, I am working on a non-model organism, so I tried to make the organism package with the makeOrgPackage function as follows. First, I annotated the genes of my organism using eggNOG-mapper; then I loaded the generated table into RStudio and used the following code:

    rm(list = ls())
    options(stringsAsFactors = F)
    library(tidyverse)
    library(clusterProfiler)
    library(AnnotationHub)
    library(AnnotationForge)
    library(stringr)

    egg <- rio::import('fatheadminnow-annotation.tsv')
    egg[egg==""] <- NA
    colnames(egg)

    gene_info <- egg %>%
      dplyr::select(GID = query_name, GENENAME = seed_ortholog) %>%
      na.omit()

    gterms <- egg %>%
      dplyr::select(query_name, GOs) %>%
      na.omit()
    gterms <- gterms[!grepl("-", gterms$GOs),]

    all_go_list = str_split(gterms$GOs, ",")
    gene2go <- data.frame(GID = rep(gterms$query_name, times = sapply(all_go_list, length)),
                          GO = unlist(all_go_list),
                          EVIDENCE = "IEA")
    gene2go <- gene2go[!grepl("-", gene2go$GO),]

    gene2ko <- egg %>%
      dplyr::select(GID = query_name, KO = KEGG_ko) %>%
      na.omit()

    load("kegg_info.RData")
    colnames(ko2pathway) = c("KO", 'Pathway')
    gene2ko$KO = str_replace(gene2ko$KO, "ko:", "")
    gene2ko <- gene2ko[!grepl("-", gene2ko$KO),]

    gene2pathway <- gene2ko %>%
      left_join(ko2pathway, by = "KO") %>%
      dplyr::select(GID, Pathway) %>%
      na.omit()

    makeOrgPackage(gene_info=gene_info,
                   go=gene2go,
                   ko=gene2ko,
                   maintainer='gmail.com>',
                   author='gmail.com>',
                   pathway=gene2pathway,
                   version="0.0.1",
                   outputDir = "C:/Users/Documents",
                   tax_id=90988,
                   genus="Pimephales",
                   species="promelas",
                   goTable="go")

and I got the following error:

    Error in FUN(X[[i]], ...) : data.frames in '...' cannot contain duplicated rows

I have already used the package with the same code for another species, and it worked fine.
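One possible source of duplicated rows in code shaped like this, offered only as a guess and not confirmed anywhere in this thread, is the gene2pathway step. Joining on KO and then dropping the KO column can leave identical (GID, Pathway) rows even when neither input table has duplicates. The sketch below uses made-up toy data, assumes ko2pathway has one row per KO and pathway pair, and uses base R `merge()` as a stand-in for `left_join()`.

```r
# Toy data (hypothetical): one gene annotated with two KOs.
gene2ko <- data.frame(GID = c("g1", "g1"),
                      KO  = c("K00001", "K00002"))

# Assumed shape of ko2pathway: both KOs belong to pathway ko00010.
ko2pathway <- data.frame(KO      = c("K00001", "K00002", "K00002"),
                         Pathway = c("ko00010", "ko00010", "ko00620"))

# Neither input contains duplicated rows.
anyDuplicated(gene2ko)     # 0
anyDuplicated(ko2pathway)  # 0

# Left join on KO, then keep only GID and Pathway, as in the post above.
joined <- merge(gene2ko, ko2pathway, by = "KO", all.x = TRUE)
gene2pathway <- na.omit(joined[, c("GID", "Pathway")])

# Dropping KO makes the two (g1, ko00010) rows identical.
n_dup <- sum(duplicated(gene2pathway))
print(n_dup)

# Deduplicate before passing the table to makeOrgPackage.
gene2pathway_unique <- unique(gene2pathway)
```

The same kind of check applies to gene_info, which is also passed to makeOrgPackage.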