rivasiker / autocoalhmm

Automated pipeline for running CoalHMM
11 stars · 2 forks

autocoalhmm on HPC #2

Closed · cafecate closed this issue 10 months ago

cafecate commented 1 year ago

Hi, @rivasiker! I tried the autocoalhmm pipeline on my HPC cluster and got these error messages:

```
Traceback (most recent call last):
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/urllib/request.py", line 1317, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/http/client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/http/client.py", line 1275, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/http/client.py", line 1224, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/http/client.py", line 1016, in _send_output
    self.send(msg)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/http/client.py", line 956, in send
    self.connect()
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/http/client.py", line 1384, in connect
    super().connect()
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/http/client.py", line 928, in connect
    (self.host, self.port), self.timeout, self.source_address)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/socket.py", line 707, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/socket.py", line 748, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
```

During handling of the above exception, another exception occurred:

```
Traceback (most recent call last):
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/bin/gwf", line 8, in <module>
    sys.exit(main())
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/site-packages/click/core.py", line 1656, in invoke
    super().invoke(ctx)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/site-packages/gwf/cli.py", line 129, in main
    latest_version = get_latest_version()
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/site-packages/gwf/utils.py", line 203, in get_latest_version
    with urlopen(UPDATE_CHECK_URL, timeout=1) as resp:
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/urllib/request.py", line 543, in _open
    '_open', req)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/urllib/request.py", line 1360, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/urllib/request.py", line 1319, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>

Error: Local backend could not connect to workers on port 12345. Workers can be started by running "gwf workers". You can read more in the documentation: https://gwf.app/reference/backends/#gwf.backends.local.LocalBackend
```

I don't know how to resolve this problem. Could you help me or give me some advice?
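(Editor's note: the last line of the output is the actual failure — gwf fell back to its default local backend and could not reach any workers on port 12345; everything above it is just the harmless update check failing because the compute node has no internet access. A quick way to see what is configured, assuming your gwf version provides the "gwf config" subcommand:)

```shell
# Show the currently configured backend; an empty value or "local"
# means gwf will use the default local backend, which requires
# separately started workers ("gwf workers").
gwf config get backend
```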

rivasiker commented 1 year ago

Hi, @cafecate! What kind of queue management system does your cluster have?

cafecate commented 1 year ago

Hi, @rivasiker! The cluster uses the Torque management system, and I use the "qsub" command to submit tasks.

rivasiker commented 1 year ago

I think your issue is that you need to configure the backend of the gwf workflow. I run all of my workflows on slurm, so I have to run "gwf config set backend slurm" the first time I use gwf. I am not familiar with torque, but I suspect you should configure the backend similarly. I think there is a gwf plugin that lets you do so (https://github.com/micknudsen/gwf-torque-backend), but maybe the creator of the plugin (@micknudsen) can help you better than I can?
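(Editor's note: a sketch of the commands described above; the backend name "torque" is an assumption based on the plugin's name — check the plugin's README for the name it actually registers:)

```shell
# Install the torque backend plugin from its author's conda channel
conda install -c micknudsen gwf-torque-backend

# Point gwf at the cluster's scheduler instead of the local backend
gwf config set backend torque

# On a slurm cluster the equivalent would be:
# gwf config set backend slurm
```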

micknudsen commented 1 year ago

@cafecate When you try to run gwf "out of the box", it defaults to a simulated local backend, which one has to configure first. I have never really experimented with that, since I'm always working on an HPC.

As suggested by @rivasiker, you could try my plugin for enabling torque support in gwf. I have limited experience with torque, but I recently found myself in a position where I had to develop a pipeline on a torque system, and I desperately wanted to use gwf, so I made the plugin. It works on that specific HPC, and I am curious to know whether it works on others, too.

cafecate commented 1 year ago


Hi, @micknudsen! Thanks for the help! I installed gwf-torque-backend with "conda install -c micknudsen gwf-torque-backend" and it installed fine. However, when I type "gwf", it reports an error:

```
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/bin/gwf", line 7, in <module>
    from gwf.cli import main
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/site-packages/gwf/cli.py", line 11, in <module>
    from .backends import guess_backend, list_backends
  File "/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/site-packages/gwf/backends/__init__.py", line 1, in <module>
    from .base import Backend, Status
ImportError: cannot import name 'Backend' from 'gwf.backends.base' (/gpfs/home/CF/miniconda3/envs/autocoalhmm/lib/python3.7/site-packages/gwf/backends/base.py)
```

Did I do something wrong?

micknudsen commented 1 year ago

@cafecate Hmmm. It could be an issue with the version of gwf. I developed gwf-torque-backend to work with the version 1.7.3. Some things have changed in newer versions, which e.g. broke my other plugin, gwf-utilization.

Which version of gwf are you using?
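(Editor's note: two quick ways to answer this question; "autocoalhmm" is the environment name used in the paths above:)

```shell
# Report the installed gwf version
gwf --version

# Or ask conda which gwf package is in the environment
conda list -n autocoalhmm gwf
```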

cafecate commented 1 year ago

@micknudsen I'm using gwf v2.0.2, and I couldn't find older versions of gwf on the internet. Only v2.0.2, v2.0.1, v2.0.0, and v1.8.5 are released on https://github.com/gwforg/gwf/, and only gwf v2.0.2 is on Anaconda. When I use conda to install "gwf-torque-backend", gwf v2.0.2 gets installed.

micknudsen commented 1 year ago

@cafecate Earlier versions of gwf are available in the gwforg conda channel. This works for me:

conda create -c gwforg -c micknudsen -n foo gwf==1.7.2 gwf-torque-backend

The version 1.7.3 in my previous post was a mistake. I meant 1.7.2.
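(Editor's note: the remaining steps after creating the pinned environment would presumably look like this; the backend name "torque" is an assumption from the plugin's name, and "foo" is the environment name from the command above:)

```shell
# Switch to the environment with the compatible gwf version
conda activate foo

# Register the torque backend once, then run the workflow
gwf config set backend torque
gwf run
```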