jhuckaby / Cronicle

A simple, distributed task scheduler and runner with a web based UI.
http://cronicle.net

How can I configure Cronicle so that if my Python code breaks it shows "Job Failed" instead of "Job Completed Successfully"? #639

Closed. juanfrilla closed this issue 1 year ago

juanfrilla commented 1 year ago

Summary

I have some Python web-scraping scripts scheduled, and the Last Run column only shows Error if the job runs out of memory or takes longer than expected. If my Python code itself breaks, the full error trace appears in the logs, but the job is marked Success instead of Error.

Steps to reproduce the problem

Schedule a Python script that raises an exception: the job is marked Success instead of Error.

Your Setup

Operating system and version?

CentOS Linux 7 (Core)

Node.js version?

v16.18.1

Cronicle software version?

0.9.27

Are you using a multi-server setup, or just a single server?

Single server

Are you using the filesystem as back-end storage, or S3/Couchbase?

Can you reproduce the crash consistently?

Log Excerpts

Here my code breaks, yet Cronicle says the job completed successfully:

2023-09-17 17:42:48 [scrapy.core.scraper] ERROR: Spider error processing <GET https://www.example.com/> (referer: None)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 824, in adapt
    extracted = result.result()
  File "/home/tirant/.local/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 298, in process_callback_output
    return await self._process_callback_output(response, spider, result)
  File "/home/tirant/.local/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 278, in _process_callback_output
    result = await maybe_deferred_to_future(
  File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/usr/local/lib/python3.8/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/tirant/.local/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 231, in _process_spider_output
    result = yield deferred_from_coro(collect_asyncgen(result))
  File "/usr/local/lib/python3.8/site-packages/twisted/internet/defer.py", line 824, in adapt
    extracted = result.result()
  File "/home/tirant/.local/lib/python3.8/site-packages/scrapy/utils/asyncgen.py", line 6, in collect_asyncgen
    async for x in result:
  File "/home/tirant/.local/lib/python3.8/site-packages/scrapy/core/spidermw.py", line 118, in process_async
    async for r in iterable:
  File "/home/tirant/git/modulos/lib/modulos/pipelines/executor.py", line 23, in _process_pipelines
    async for item in parse(spider, *args, **kargs):
  File "/home/tirant/git/modulos/scrapy/latam/jurisprudenciapy3/jurisprudenciapy3/spiders/jPeruIndecopi.py", line 487, in parse
    j_id = self.extract_jid(soup)
  File "/home/tirant/git/modulos/scrapy/latam/jurisprudenciapy3/jurisprudenciapy3/spiders/jPeruIndecopi.py", line 266, in extract_jid
    j_id = soup.select(
IndexError: list index out of range
2023-09-17 17:42:48 [scrapy.core.engine] INFO: Closing spider (finished)
2023-09-17 17:42:48 [modulos.middlewares.middlewares] DEBUG: close_spider
2023-09-17 17:42:48 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 264,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'downloader/response_bytes': 1005,
 'downloader/response_count': 1,
 'downloader/response_status_count/200': 1,
 'elapsed_time_seconds': 1359.507475,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2023, 9, 17, 15, 42, 48, 451787),
 'httpcompression/response_bytes': 1256,
 'httpcompression/response_count': 1,
 'log_count/DEBUG': 16,
 'log_count/ERROR': 1,
 'log_count/INFO': 28,
 'log_count/WARNING': 2,
 'memusage/max': 281133056,
 'memusage/startup': 281133056,
 'response_received_count': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'spider_exceptions/IndexError': 1,
 'start_time': datetime.datetime(2023, 9, 17, 15, 20, 8, 944312)}
2023-09-17 17:42:48 [scrapy.core.engine] INFO: Spider closed (finished)
[2023/09/17 17:42:48] --- TRABAJANDO EN SERVIDOR ---
mrkcmo commented 1 year ago

I was wondering the same thing. I have a Python script that calls sys.exit() under specific conditions that should count as failures, but the Cronicle job thinks it succeeded; it doesn't pick up the break/sys.exit() as an error. @jhuckaby, is there something we need to send to the Shell Plugin to signify a failure so it gets picked up by the job?

jhuckaby commented 1 year ago

If you are using a custom Plugin, then you have to tell Cronicle if the job succeeded or failed using JSON. Instructions here:

https://github.com/jhuckaby/Cronicle/blob/master/docs/Plugins.md#json-output
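
For example, a minimal Python sketch of a custom Plugin following that JSON convention; run_scraper here is a hypothetical stand-in for the actual work:

#!/usr/bin/env python3
import json
import sys

def run_scraper():
    # hypothetical placeholder: the real scraping logic goes here,
    # raising an exception on failure
    ...

try:
    run_scraper()
    # "complete": 1 with code 0 marks the job as successful
    print(json.dumps({"complete": 1, "code": 0}))
except Exception as err:
    # any non-zero "code" marks the job as failed; "description" appears in the UI
    print(json.dumps({"complete": 1, "code": 1, "description": str(err)}))
    sys.exit(1)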

However, if you are using the built-in Shell Plugin, then the last command's exit code dictates the job success/fail status.

https://github.com/jhuckaby/Cronicle/blob/master/docs/Plugins.md#built-in-shell-plugin
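
With the Shell Plugin, that means making sure the script itself exits non-zero on failure. Note that sys.exit() with no argument exits with status 0, which Cronicle will read as success. A minimal sketch, with main as a hypothetical stand-in for the real script:

#!/usr/bin/env python3
import sys

def main():
    # hypothetical placeholder for the real script; raise on failure
    ...

if __name__ == "__main__":
    try:
        main()
    except Exception as err:
        print(f"FATAL: {err}", file=sys.stderr)
        sys.exit(1)  # non-zero exit -> Cronicle marks the job as failed
    sys.exit(0)      # exit code 0 -> "Job Completed Successfully"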

mrkcmo commented 1 year ago

Thanks @jhuckaby, that really helps. I couldn't find that documentation when searching. So as long as the script exits with a non-zero status, it will be seen as a failure. So @juanfrilla, when you exit, use sys.exit(1) or some other non-zero code.
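
One caveat for the Scrapy case in this issue: Scrapy catches spider exceptions internally (note 'spider_exceptions/IndexError': 1 and 'log_count/ERROR': 1 in the stats dump above), so the process can still exit 0 even when parsing fails. One way to turn those stats into a non-zero exit is to run the crawl programmatically and check the collected stats afterwards; a sketch, where MySpider and its import path are hypothetical:

#!/usr/bin/env python3
import sys

from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

from myproject.spiders.my_spider import MySpider  # hypothetical import path

process = CrawlerProcess(get_project_settings())
crawler = process.create_crawler(MySpider)
process.crawl(crawler)
process.start()  # blocks until the crawl finishes

# any logged errors (including swallowed spider exceptions) fail the job
if crawler.stats.get_value("log_count/ERROR", 0) > 0:
    sys.exit(1)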

juanfrilla commented 1 year ago

Thanks @jhuckaby, your answer was really useful for me. And thanks @mrkcmo, I'll try that.