Open guidohooiveld opened 7 months ago
@guidohooiveld thanks, and please test with the GitHub version.
@GuangchuangYu : thanks for having a look at this so quickly; much appreciated.
After updating it works fine; thanks!
I have also edited my answer on the Bioconductor support forum to include a link to this thread.
> BiocManager::install(c('YuLab-SMU/GOSemSim'), force=TRUE)
Bioconductor version 3.18 (BiocManager 1.30.22), R 4.3.0 (2023-04-21 ucrt)
Installing github package(s) 'YuLab-SMU/GOSemSim'
Downloading GitHub repo YuLab-SMU/GOSemSim@HEAD
<<snip>>
>
>
> library(clusterProfiler)
> packageVersion("GOSemSim")
[1] ‘2.29.1.1’
>
> Pa_GO <- read.csv("gene_ontology_csv.csv")
> Pa_GOterms <- Pa_GO[c(5,1)]
>
> ## check;
> ## note that order of columns now aligns with those as in TERM2GENE,
> ## and that names did NOT have to be changed!
> colnames(Pa_GOterms)
[1] "Accession" "Locus.Tag"
> dim(Pa_GOterms)
[1] 15883 2
> head(Pa_GOterms)
Accession Locus.Tag
1 GO:0005524 PA0001
2 GO:0006270 PA0001
3 GO:0006275 PA0001
4 GO:0016887 PA0001
5 GO:0016887 PA0001
6 GO:0006260 PA0001
> tail(Pa_GOterms)
Accession Locus.Tag
15878 GO:0008033 PA5569
15879 GO:0001682 PA5569
15880 GO:0004526 PA5569
15881 GO:0003735 PA5570
15882 GO:0005840 PA5570
15883 GO:0006412 PA5570
>
> Pa_GOMap <- buildGOmap(Pa_GOterms)
> ## check; note list is longer and 'tail' shows additional GO IDs.
> dim(Pa_GOMap)
[1] 119221 2
> head(Pa_GOMap)
Accession Locus.Tag
1 GO:0005524 PA0001
2 GO:0006270 PA0001
3 GO:0006275 PA0001
4 GO:0016887 PA0001
5 GO:0016887 PA0001
6 GO:0006260 PA0001
> tail(Pa_GOMap)
Accession Locus.Tag
183988 GO:0044249 PA5570
183989 GO:0044271 PA5570
183990 GO:0071704 PA5570
183991 GO:1901564 PA5570
183992 GO:1901566 PA5570
183993 GO:1901576 PA5570
>
>
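The growth from 15883 to 119221 rows is consistent with `buildGOmap` adding indirect (ancestor) GO terms to each gene's direct annotations. A minimal base-R sketch of that idea follows. It is not the package's implementation: the `ancestors` list and the `expand_one` helper are hypothetical and hand-written for illustration, whereas the real function derives ancestors from the GO graph.

```r
# Toy direct annotation in TERM2GENE order (term first, gene second).
direct <- data.frame(
  Accession = c("GO:0006412", "GO:0005840"),
  Locus.Tag = c("PA5570", "PA5570"),
  stringsAsFactors = FALSE
)

# Hypothetical, hand-written ancestor table (illustration only).
ancestors <- list(
  "GO:0006412" = c("GO:0044249", "GO:0071704"),
  "GO:0005840" = character(0)
)

# For one direct (term, gene) pair, return the term plus its ancestors,
# each paired with the same gene.
expand_one <- function(term, gene) {
  data.frame(
    Accession = c(term, ancestors[[term]]),
    Locus.Tag = gene,
    stringsAsFactors = FALSE
  )
}

# Expand every pair, stack the pieces, and drop exact duplicates.
pieces <- unname(Map(expand_one, direct$Accession, direct$Locus.Tag))
expanded <- unique(do.call(rbind, pieces))
rownames(expanded) <- NULL

nrow(direct)    # number of direct annotations
nrow(expanded)  # number of rows once ancestor terms are included
expanded
```

The column names are left untouched throughout, which mirrors the behaviour seen above with the GitHub version.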
@GuangchuangYu, @huerqiang

At the Bioconductor support forum an issue/error was reported regarding the function `buildGOmap`: https://support.bioconductor.org/p/9156358/

I had a quick look at it, and it seems this is because a) the required input for `buildGOmap` seems to be counter-intuitive, and b) `buildGOmap` explicitly expects as input a column labelled "GO".

Regarding a): The required input for `buildGOmap` (1st column should be the geneids, 2nd column the GOIDs) seems to be counter-intuitive because for the generic enrichment functions `enricher` and `GSEA` the reverse order is rather required for the input `TERM2GENE` (thus 1st column the GOIDs, and 2nd column the geneids)... Maybe good to align this, or at least explain it better at the help page? Also add an example on the help page?

Regarding b): The function `buildGOmap_internal` has hard-coded the requirement that the column with the GOIDs should be labelled `GO`: https://github.com/YuLab-SMU/GOSemSim/blob/1800f404145ac9788685db07f3d9ad6c70f65cc3/R/buildGOmap.R#L41 and https://github.com/YuLab-SMU/GOSemSim/blob/1800f404145ac9788685db07f3d9ad6c70f65cc3/R/buildGOmap.R#L45

This requirement is not stated at the help page of `buildGOmap`, so could you please add that? It now results in the reported error.

Thanks, G
FWIW: I have attached the GO mapping file that was used on the Bioconductor support forum, and thus gave issues. It was downloaded from https://www.pseudomonas.com/goterms/list (as `csv`).
gene_ontology_csv.csv
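For anyone stuck on a release that predates the fix, points a) and b) above suggest a workaround along these lines. This is only a sketch and has not been tested against the package. The toy `Pa_GOterms` stands in for the attached CSV, `legacy_input` is a made-up name, and the gene column name `Gene` is a placeholder; only the `GO` label is described above as required.

```r
# Toy stand-in for the first rows of the attached CSV after selecting
# the GO accession and locus tag columns.
Pa_GOterms <- data.frame(
  Accession = c("GO:0005524", "GO:0006270", "GO:0016887", "GO:0016887"),
  Locus.Tag = c("PA0001", "PA0001", "PA0001", "PA0001"),
  stringsAsFactors = FALSE
)

# Put gene IDs first and GO IDs second, and label the GO column "GO".
# "Gene" is a placeholder name, not a documented requirement.
legacy_input <- data.frame(
  Gene = Pa_GOterms$Locus.Tag,
  GO   = Pa_GOterms$Accession,
  stringsAsFactors = FALSE
)

# Drop rows with an empty GO ID and exact duplicate annotations.
legacy_input <- unique(legacy_input[legacy_input$GO != "", ])

# Basic sanity check on the GO ID format before handing the table on.
stopifnot(all(grepl("^GO:[0-9]{7}$", legacy_input$GO)))

legacy_input
# Pa_GOMap <- buildGOmap(legacy_input)  # needs clusterProfiler; not run here
```

With the GitHub version none of this reshaping should be needed, since the input can stay in `TERM2GENE` order with its original column names.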