scrapinghub / spidermon

Scrapy Extension for monitoring spiders execution.
https://spidermon.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Default email template fails with latest `scrapinghub` client version #429

Open curita opened 8 months ago

curita commented 8 months ago

Issue

data.job.metadata (an instance of scrapinghub.client.jobs:JobMeta) is not subscriptable. Spidermon's templates access job meta keys via data.job.metadata[x] in multiple places, and every one of those accesses fails as a result.

It's unclear when job.metadata stopped being subscriptable, or whether this is a change in recent Jinja versions (see: https://jinja.palletsprojects.com/en/3.0.x/templates/#variables), but it doesn't work now.

This can be partially reproduced locally:

In [1]: from scrapinghub import ScrapinghubClient

In [2]: job = ScrapinghubClient(SHUB_APIKEY).get_job(JOB_ID)

In [3]: job.metadata["spider"]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[3], line 1
----> 1 job.metadata["spider"]

TypeError: 'JobMeta' object is not subscriptable

In [4]: from jinja2 import Template

In [5]: Template("{% set is_script = job.metadata['spider'].startswith('py:') %}{{ is_script }}").render(job=job)
---------------------------------------------------------------------------
UndefinedError                            Traceback (most recent call last)
Cell In[5], line 1
----> 1 Template("{% set is_script = job.metadata['spider'].startswith('py:') %}{{ is_script }}").render(job=job)

File ~/src/project/.venv/lib/python3.10/site-packages/jinja2/environment.py:1301, in Template.render(self, *args, **kwargs)
   1299     return self.environment.concat(self.root_render_func(ctx))  # type: ignore
   1300 except Exception:
-> 1301     self.environment.handle_exception()

File ~/src/project/.venv/lib/python3.10/site-packages/jinja2/environment.py:936, in Environment.handle_exception(self, source)
    931 """Exception handling helper.  This is used internally to either raise
    932 rewritten exceptions or return a rendered traceback for the template.
    933 """
    934 from .debug import rewrite_traceback_stack
--> 936 raise rewrite_traceback_stack(source=source)

File <template>:1, in top-level template code()

File ~/src/project/.venv/lib/python3.10/site-packages/jinja2/environment.py:485, in Environment.getattr(self, obj, attribute)
    481 """Get an item or attribute of an object but prefer the attribute.
    482 Unlike :meth:`getitem` the attribute *must* be a string.
    483 """
    484 try:
--> 485     return getattr(obj, attribute)
    486 except AttributeError:
    487     pass

UndefinedError: 'scrapinghub.client.jobs.JobMeta object' has no attribute 'spider'

In [6]: Template("{% set is_script = job.metadata.get('spider').startswith('py:') %}{{ is_script }}").render(job=job)
Out[6]: 'False'

Proposal

Replace all instances of data.job.metadata[x] in the templates with data.job.metadata.get(x).
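
As a rough illustration of that replacement, here is a minimal sketch that rewrites quoted-key subscripts into .get() calls in a template's source text. The helper name use_get and the regular expression are hypothetical and not part of Spidermon. The sketch assumes every access uses a quoted string literal as the key, so any other form would still need to be changed by hand.

```python
import re

# Hypothetical pattern: matches data.job.metadata['key'] or data.job.metadata["key"].
# Group 1 captures the quote character, group 2 the key itself.
SUBSCRIPT_RE = re.compile(r"""data\.job\.metadata\[(['"])(.+?)\1\]""")


def use_get(template_source: str) -> str:
    """Rewrite data.job.metadata['x'] as data.job.metadata.get('x')."""
    # The original quote style is preserved by reusing group 1 on both sides.
    return SUBSCRIPT_RE.sub(r"data.job.metadata.get(\1\2\1)", template_source)


# The failing line from result.jinja, as quoted in this issue.
line = "{% set is_script = data.job.metadata['spider'].startswith('py:') %}"
print(use_get(line))
```

Accesses that already use .get() are left untouched, so running the helper twice over the same source is harmless.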

ptsonev commented 1 month ago

I'm hitting the same exception. It works fine locally, but not when deployed on Zyte's cloud.

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/spidermon/core/actions.py", line 39, in run
    self.run_action()
  File "/usr/local/lib/python3.11/site-packages/spidermon/contrib/actions/email/__init__.py", line 110, in run_action
    message = self.get_message()
              ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/spidermon/contrib/actions/email/__init__.py", line 143, in get_message
    body_html = self.get_body_html()
                ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/spidermon/contrib/actions/email/__init__.py", line 137, in get_body_html
    html = transform(self.render_template(self.body_html_template))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/spidermon/contrib/actions/templates.py", line 57, in render_template
    return template.render(self.get_template_context())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/jinja2/environment.py", line 1301, in render
    self.environment.handle_exception()
  File "/usr/local/lib/python3.11/site-packages/jinja2/environment.py", line 936, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "/usr/local/lib/python3.11/site-packages/spidermon/contrib/actions/reports/templates/reports/email/monitors/result.jinja", line 81, in top-level template code
    {% macro render_header_data_separator() %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/spidermon/contrib/actions/reports/templates/reports/email/bases/report/medium.jinja", line 1, in top-level template code
    {% extends 'reports/email/bases/report/base.jinja' %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/spidermon/contrib/actions/reports/templates/reports/email/bases/report/base.jinja", line 16, in top-level template code
    {% block page_content %}{% endblock %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/spidermon/contrib/actions/reports/templates/reports/email/monitors/result.jinja", line 136, in block 'page_content'
    {% set is_script = data.job.metadata['spider'].startswith('py:') %}
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/jinja2/environment.py", line 485, in getattr
    return getattr(obj, attribute)
           ^^^^^^^^^^^^^^^^^^^^^^^
jinja2.exceptions.UndefinedError: 'scrapinghub.client.jobs.JobMeta object' has no attribute 'spider'