tanghaibao / goatools

Python library to handle Gene Ontology (GO) terms
BSD 2-Clause "Simplified" License

Error in making result file #274

Open Rhia15 opened 1 year ago

Rhia15 commented 1 year ago

Hi,

I have been trying to use find_enrichment.py to perform an enrichment analysis. I originally tried it with my own study, but it did not work, so I tried it with the data provided on the GitHub page and the command-line example, and I receive the same error message. It doesn't produce a results file either. Is there something else I need to do first?

(enrichment) [mbxrs14@login001(Augusta) enrichment]$ find_enrichment.py small_study.txt small_population.txt small_association.txt --pval=0.05 --method=fdr_bh --pval_field=fdr_bh --outfile=results_id2gos.xlsx
go-basic.obo: fmt(1.2) rel(2023-06-11) 46,420 Terms
**WARNING: GO:0000229 NOT FOUND IN DAG
**WARNING: GO:0004871 NOT FOUND IN DAG
**WARNING: GO:0007568 NOT FOUND IN DAG
**WARNING: GO:0008565 NOT FOUND IN DAG
**WARNING: GO:0051186 NOT FOUND IN DAG
HMS:0:00:00.040034 6,267 annotations READ: small_association.txt
Study: 38 vs. Population 2000

WARNING: only 0.39473684210526316 fraction of genes/proteins in study are found in the population background.

ERROR: only 0.39473684210526316 of genes/proteins in the study are found in the background population. Please check.

Any help would be greatly appreciated, thank you!

tanghaibao commented 1 year ago

@Rhia15

Can you check whether most of the genes in your study (38 genes) are a subset of the population (2000 genes)? The population can be the entire protein set in the genome of interest.
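One way to run that check by hand is sketched below. It is a minimal example, not part of goatools: the helper names `read_ids` and `study_overlap` are hypothetical, and it assumes the study and population files list one gene ID per line (first whitespace-separated token), which may not match every input file.

```python
import sys


def read_ids(lines):
    """Collect the first token of every non-empty, non-comment line."""
    ids = set()
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            ids.add(line.split()[0])
    return ids


def study_overlap(study_lines, population_lines):
    """Return (fraction of study IDs found in population, sorted missing IDs)."""
    study = read_ids(study_lines)
    population = read_ids(population_lines)
    missing = sorted(study - population)
    if not study:
        return 0.0, missing
    return (len(study) - len(missing)) / len(study), missing


# Hypothetical usage: python check_overlap.py small_study.txt small_population.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as study_fh, open(sys.argv[2]) as pop_fh:
        fraction, missing = study_overlap(study_fh, pop_fh)
    print(f"{fraction:.3f} of study IDs found in population")
    print("Missing:", ", ".join(missing))
```

A low fraction with a long list of missing IDs would point at mismatched identifiers between the two files (for example different naming schemes or stray whitespace) rather than at the environment.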

Rhia15 commented 1 year ago

Hi, thank you for the reply! The input in the example is the txt files I downloaded from test/data on this GitHub page, to check whether anything was wrong with the environment or any dependencies. I think the study is a subset of the population. It looks like I get the same DAG warnings with any data I try, though. Could there be something wrong with the DAG? I have checked the permissions for go-basic.obo and whether the IDs are present, and everything seems okay, so I am not sure what to do :/
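An ID can be present in go-basic.obo and still be marked obsolete there. The sketch below classifies GO IDs as current, obsolete, or absent using only the standard library. It is an illustration under assumptions: `load_obo_terms` and `classify` are hypothetical helpers, not goatools functions; the parser only looks at `id:` and `is_obsolete: true` lines inside `[Term]` stanzas and ignores `alt_id:` entries; and whether obsolete terms are what trigger the NOT FOUND IN DAG warnings is a guess, not something confirmed in this thread.

```python
def load_obo_terms(lines):
    """Map each [Term] id to True if its stanza is marked obsolete, else False."""
    terms = {}
    current = None
    in_term = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are of interest here.
            in_term = line == "[Term]"
            current = None
        elif in_term and line.startswith("id: "):
            current = line[4:].strip()
            terms[current] = False
        elif in_term and current and line.startswith("is_obsolete: true"):
            terms[current] = True
    return terms


def classify(go_ids, terms):
    """Sort GO IDs into current, obsolete, and absent buckets."""
    result = {"current": [], "obsolete": [], "absent": []}
    for go_id in sorted(set(go_ids)):
        if go_id not in terms:
            result["absent"].append(go_id)
        elif terms[go_id]:
            result["obsolete"].append(go_id)
        else:
            result["current"].append(go_id)
    return result
```

For example, `classify(["GO:0000229", "GO:0004871"], load_obo_terms(open("go-basic.obo")))` would show which bucket each ID from the warnings falls into.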

tanghaibao commented 1 year ago

@Rhia15

I just tested with the test files in this repo, and it seems to run fine. Have you checked whether you have the latest version?

$ find_enrichment.py data/study.txt data/population.txt data/association.txt --pval=0.05 --method=fdr_bh --pval_field=fdr_bh --outfile=results_id2go.xlsx

**WARNING: GO:0000059 NOT FOUND IN DAG
**WARNING: GO:0000060 NOT FOUND IN DAG
...
HMS:0:00:00.263267 127,967 annotations READ: data/association.txt
Study: 276 vs. Population 33239
Load BP Ontology Enrichment Analysis ...
Propagating term counts up: is_a
 87% 28,948 of 33,239 population items found in association

Load CC Ontology Enrichment Analysis ...