churchmanlab / genewalk

GeneWalk identifies relevant gene functions for a biological context using network representation learning
https://churchman.med.harvard.edu/genewalk
BSD 2-Clause "Simplified" License

Trouble Using GeneWalk #44

Closed: SinjaFan closed this issue 3 years ago

SinjaFan commented 3 years ago

Hello Churchman Lab Team,

I recently tried to use your module for an enrichment analysis of genes I obtained from a differential gene expression analysis. However, the same error keeps recurring and I am unsure what the problem is.

My Python version is 3.8.6, which should be able to run GeneWalk, and the module installed without errors. The following lines show up when I try to run it:

$ genewalk --project PMS --genes PMSUpGenesOnly.txt --id_type hgnc_symbol
INFO: [2021-03-01 13:54:37] genewalk.cli - Creating PMS folder at /Users/sinjiafan/genewalk/PMS
INFO: [2021-03-01 13:54:37] genewalk.resources - Using /Users/sinjiafan/genewalk/resources as resource folder.
INFO: [2021-03-01 13:54:37] genewalk.resources - Downloading https://www.genenames.org/cgi-bin/download/custom?col=gd_hgnc_id&col=gd_app_sym&col=gd_app_name&col=gd_prev_sym&col=gd_status&col=md_eg_id&col=md_prot_id&col=md_mgd_id&col=md_rgd_id&col=gd_pub_ensembl_id&status=Approved&status=Entry%20Withdrawn&hgnc_dbtag=on&order_by=gd_app_sym_sort&format=text&submit=submit into /Users/sinjiafan/genewalk/resources/hgnc_entries.tsv
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1350, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1255, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1301, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1250, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1010, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 950, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1424, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1124)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/bin/genewalk", line 8, in <module>
    sys.exit(main())
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/genewalk/cli.py", line 157, in main
    run_main(args)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/genewalk/cli.py", line 195, in run_main
    genes = read_gene_list(args.genes, args.id_type, rm)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/genewalk/gene_lists.py", line 31, in read_gene_list
    gene_mapper = GeneMapper(resource_manager)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/genewalk/gene_lists.py", line 232, in __init__
    self.hgnc_file = self.resource_manager.get_hgnc()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/genewalk/resources.py", line 78, in get_hgnc
    download_url(url, fname)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/genewalk/resources.py", line 125, in download_url
    urllib.request.urlretrieve(url, fname)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1393, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1353, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1124)>

I have attached the gene list I am trying to run as well.

PMSUpGenesOnly.txt

I hope you can help me resolve this issue. Thank you so much in advance!

Best, Sinja (Xuanjia) Fan

ri23 commented 3 years ago

Hi Sinja,

The error reports a self-signed certificate in the certificate chain, so Python cannot verify the HTTPS connection and the download of some resource files fails. I cannot give advice on your SSL certificate settings since that is not a problem caused by GeneWalk, but if you cannot change the SSL settings:

I would recommend first checking that the file hgnc_entries.tsv is not already present in your /Users/sinjiafan/genewalk/resources/ folder. Then download the file manually from the link below and add it to the resource folder under the name hgnc_entries.tsv.

https://www.genenames.org/cgi-bin/download/custom?col=gd_hgnc_id&col=gd_app_sym&col=gd_app_name&col=gd_prev_sym&col=gd_status&col=md_eg_id&col=md_prot_id&col=md_mgd_id&col=md_rgd_id&col=gd_pub_ensembl_id&status=Approved&status=Entry%20Withdrawn&hgnc_dbtag=on&order_by=gd_app_sym_sort&format=text&submit=submit
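Once the file has been saved through a browser, the check-and-copy step can be sketched in Python as below. This is only an illustration: place_resource is a hypothetical helper, not part of GeneWalk, and the download location in the usage comment is an assumption. The target name hgnc_entries.tsv is taken from the log above.

```python
import shutil
from pathlib import Path


def place_resource(downloaded: Path, resource_dir: Path,
                   name: str = "hgnc_entries.tsv") -> Path:
    """Copy a manually downloaded file into the resource folder.

    Hypothetical helper (not part of GeneWalk). It refuses to overwrite
    a file that is already there, so an existing copy is noticed rather
    than silently replaced.
    """
    target = resource_dir / name
    if target.exists():
        raise FileExistsError(f"{target} already exists; check or remove it first")
    resource_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy(downloaded, target)
    return target


# Example call (both paths are assumptions; adjust to your machine):
# place_resource(Path.home() / "Downloads" / "custom.txt",
#                Path.home() / "genewalk" / "resources")
```

Doing the same thing by hand in a file manager is just as good; the only points that matter are the folder and the file name.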

You can then run python -m genewalk.resources. This should download the other files and prepare them for GeneWalk.

If the other file downloads give similar SSL certificate errors, download those files manually as well, using the URLs and target paths shown in the error log.
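To pull the URL and target path out of an error log, a small sketch like the following may help. It assumes the log lines keep the "Downloading <url> into <path>" shape seen in the output above; find_downloads is a hypothetical helper name, not a GeneWalk function.

```python
import re
from typing import List, Tuple

# Matches lines such as:
# "INFO: [<timestamp>] genewalk.resources - Downloading <url> into <path>"
# (format assumed from the log in this thread)
DOWNLOAD_LINE = re.compile(r"genewalk\.resources - Downloading (\S+) into (.+)$")


def find_downloads(log_text: str) -> List[Tuple[str, str]]:
    """Return (url, target_path) pairs for every download attempt in the log."""
    found = []
    for line in log_text.splitlines():
        match = DOWNLOAD_LINE.search(line)
        if match:
            found.append((match.group(1), match.group(2).strip()))
    return found
```

The last pair before a traceback is the download that failed; saving that URL to that path by hand and rerunning should move the process on to the next file.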

Once all the downloaded files are present in the resource folder, you should be able to rerun GeneWalk with the first command you used: genewalk --project PMS --genes PMSUpGenesOnly.txt --id_type hgnc_symbol
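Separately, for anyone who wants to look at the certificate side before falling back to manual downloads, the standard library can report where this Python installation looks for trusted CA certificates. This is a diagnostic sketch only, and describe_ca_paths is a hypothetical helper name; it does not change any settings and is not specific to GeneWalk.

```python
import ssl


def describe_ca_paths() -> dict:
    """Report the CA file and directory locations Python's ssl module uses by default."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,                  # resolved CA bundle file, or None if not found
        "capath": paths.capath,                  # resolved CA directory, or None if not found
        "openssl_cafile": paths.openssl_cafile,  # compiled-in default file location
        "openssl_capath": paths.openssl_capath,  # compiled-in default directory location
    }


# Example: print each entry to see whether a CA bundle was found.
# for key, value in describe_ca_paths().items():
#     print(key, value)
```

If both cafile and capath come back as None, Python found no CA bundle at its default locations, which is one possible cause of a verification failure; a network proxy that re-signs HTTPS traffic is another.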

SinjaFan commented 3 years ago

Hi!

Your solution of downloading the .tsv file worked! I downloaded the rest of the resources without a problem and the program is running as expected. Thank you very much for your help!

Best, Sinja