reanahub / reana-workflow-engine-snakemake

REANA Workflow Engine Snakemake
MIT License

report: offline-friendly report generation #54

Open tiborsimko opened 1 year ago

tiborsimko commented 1 year ago

Observed locally.

A Snakemake demo workflow and the corresponding batch pod finish successfully, but the final report generation actively fetches style assets from a CDN, and this fetch fails, probably due to transient Kind Docker network bridge troubles:

$ reana-client logs -w root6-roofit-snakemake-kubernetes
...
2023-01-13 08:17:27,973 | snakemake.logging | MainThread | WARNING | Creating report...
2023-01-13 08:18:30,836 | snakemake.logging | MainThread | WARNING | Downloading resources and rendering HTML.
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/usr/local/lib/python3.8/dist-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib/python3.8/socket.py", line 918, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.8/dist-packages/urllib3/connection.py", line 358, in connect
    self.sock = conn = self._new_conn()
  File "/usr/local/lib/python3.8/dist-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f758940cca0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.8/dist-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.8/dist-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='stackpath.bootstrapcdn.com', port=443): Max retries exceeded with url: /bootstrap/4.1.1/css/bootstrap.min.css (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f758940cca0>: Failed to establish a new connection: [Errno -3] Tempo>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/snakemake/__init__.py", line 699, in snakemake
    success = workflow.execute(
  File "/usr/local/lib/python3.8/dist-packages/snakemake/workflow.py", line 875, in execute
    auto_report(dag, report, stylesheet=report_stylesheet)
  File "/usr/local/lib/python3.8/dist-packages/snakemake/report/__init__.py", line 871, in auto_report
    rendered = template.render(
  File "/usr/local/lib/python3.8/dist-packages/jinja2/environment.py", line 1301, in render
    self.environment.handle_exception()
  File "/usr/local/lib/python3.8/dist-packages/jinja2/environment.py", line 936, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "/usr/local/lib/python3.8/dist-packages/snakemake/report/report.html.jinja2", line 12, in top-level template code
    <style>{{ "https://stackpath.bootstrapcdn.com/bootstrap/4.1.1/css/bootstrap.min.css"|get_resource_as_string }}</style>
  File "/usr/local/lib/python3.8/dist-packages/snakemake/report/__init__.py", line 588, in get_resource_as_string
    r = requests.get(url)
  File "/usr/local/lib/python3.8/dist-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/requests/adapters.py", line 565, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='stackpath.bootstrapcdn.com', port=443): Max retries exceeded with url: /bootstrap/4.1.1/css/bootstrap.min.css (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f758940cca0>: Failed to establish a new connection: [Errno -3] Te>
2023-01-13 08:18:30,871 | reana-workflow-engine-snakemake | MainThread | ERROR | Error generating workflow HTML report.
2023-01-13 08:18:30,873 | root | MainThread | INFO | Workflow aec63c81-2b31-4f17-bc0e-de12f30d6436 finished. Files available at /var/reana/users/00000000-0000-0000-0000-000000000000/workflows/aec63c81-2b31-4f17-bc0e-de12f30d6436.

It may be good to investigate whether we could make the report generation more "offline-friendly", i.e. not access the CDN during report generation, only during report viewing.
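As a rough illustration of the idea above (not Snakemake's actual API), here is a minimal sketch of a helper that inlines a stylesheet when the CDN is reachable and otherwise emits a `<link>` tag, so that the asset is fetched only when the report is viewed. The names `fetch_text` and `stylesheet_tag` are hypothetical; the traceback only tells us that Snakemake's template calls a `get_resource_as_string` filter that uses `requests.get`.

```python
import html
from typing import Callable
from urllib.request import urlopen


def fetch_text(url: str, timeout: float = 5.0) -> str:
    """Download a text resource (hypothetical helper, stdlib only).

    Network failures such as DNS errors surface as URLError,
    which is a subclass of OSError.
    """
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def stylesheet_tag(url: str, fetch: Callable[[str], str] = fetch_text) -> str:
    """Inline the stylesheet if it can be fetched now; otherwise
    reference it so that the browser loads it at viewing time."""
    try:
        return f"<style>{fetch(url)}</style>"
    except OSError:
        # Offline during generation: defer the CDN access to the viewer.
        return f'<link rel="stylesheet" href="{html.escape(url, quote=True)}">'
```

Whether something like this can be plugged into the Snakemake report template without patching Snakemake itself is exactly what would need to be investigated.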

This could solve possible problems for those REANA installations where worker nodes do not have full Internet access.

Let's look at what options the Snakemake report generation package offers. If we can influence it easily, then good. Otherwise this issue can sleep, since most installations would not be affected. (E.g. we don't have to go as far as hosting assets locally.)

tiborsimko commented 1 year ago

I can reproduce this relatively easily by pausing the cluster (docker pause kind-control-plane), suspending the laptop for several days and resuming it, then bringing the cluster back up (docker unpause kind-control-plane). Waking the cluster up this way leads to the above troubles.
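To confirm after unpausing that the cluster is in the same state as in the traceback (a `socket.gaierror` on name resolution), a small check along these lines could be run from inside the affected container. This is only a sketch; `can_resolve` is a hypothetical helper, and the `resolver` parameter exists purely so the function can be exercised without network access.

```python
import socket


def can_resolve(host: str, port: int = 443, resolver=socket.getaddrinfo) -> bool:
    """Return True if the host name resolves, False on a DNS failure."""
    try:
        resolver(host, port)
        return True
    except socket.gaierror:
        # Same error class as in the traceback above, e.g. [Errno -3].
        return False


if __name__ == "__main__":
    # Host taken from the failing request in the log.
    print(can_resolve("stackpath.bootstrapcdn.com"))
```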

giuseppe-steduto commented 10 months ago

Note that Snakemake has an ongoing discussion about the possibility of generating the report offline by bundling all the online resources into the Snakemake package. Apparently there is no easy way to generate it offline at the moment.