openzim / zimfarm

Farm operated by bots to grow and harvest new zim files
https://farm.openzim.org
GNU General Public License v3.0
81 stars 25 forks source link

Again \u0000 cannot be converted to text #955

Closed benoit74 closed 3 months ago

benoit74 commented 5 months ago

Problem

https://api.farm.zimit.kiwix.org/v1/tasks/5dff1ad5-d75d-4bee-a94d-502e7cbe2959

2024-04-28 08:16:30.205 ERROR:uwsgi_file__app_main:Exception on /v1/tasks/5dff1ad5-d75d-4bee-a94d-502e7cbe2959 [PATCH]
2024-04-28 08:16:30.205 Traceback (most recent call last):
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1971, in _exec_single_context
2024-04-28 08:16:30.205     self.dialect.do_execute(
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 919, in do_execute
2024-04-28 08:16:30.205     cursor.execute(statement, parameters)
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/psycopg/cursor.py", line 732, in execute
2024-04-28 08:16:30.205     raise ex.with_traceback(None)
2024-04-28 08:16:30.205 psycopg.errors.UntranslatableCharacter: unsupported Unicode escape sequence
2024-04-28 08:16:30.205 DETAIL:  \u0000 cannot be converted to text.
2024-04-28 08:16:30.205 CONTEXT:  JSON data, line 1: ...tats": {"memory": {"max_usage": null}}, "stderr":...
2024-04-28 08:16:30.205 unnamed portal parameter $5 = '...'
2024-04-28 08:16:30.205 
2024-04-28 08:16:30.205 The above exception was the direct cause of the following exception:
2024-04-28 08:16:30.205 
2024-04-28 08:16:30.205 Traceback (most recent call last):
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2190, in wsgi_app
2024-04-28 08:16:30.205     response = self.full_dispatch_request()
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1486, in full_dispatch_request
2024-04-28 08:16:30.205     rv = self.handle_user_exception(e)
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/flask_cors/extension.py", line 165, in wrapped_function
2024-04-28 08:16:30.205     return cors_after_request(app.make_response(f(*args, **kwargs)))
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1484, in full_dispatch_request
2024-04-28 08:16:30.205     rv = self.dispatch_request()
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1469, in dispatch_request
2024-04-28 08:16:30.205     return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
2024-04-28 08:16:30.205   File "/app/./routes/base.py", line 20, in __call__
2024-04-28 08:16:30.205     return handler(*args, **kwargs)
2024-04-28 08:16:30.205   File "/app/./routes/__init__.py", line 35, in wrapper
2024-04-28 08:16:30.205     return f(*args, **kwargs)
2024-04-28 08:16:30.205   File "/app/./routes/__init__.py", line 53, in wrapper
2024-04-28 08:16:30.205     return f(*args, **kwargs)
2024-04-28 08:16:30.205   File "/app/./routes/__init__.py", line 87, in wrapper
2024-04-28 08:16:30.205     return f(*args, **kwargs)
2024-04-28 08:16:30.205   File "/app/./db/__init__.py", line 50, in inner
2024-04-28 08:16:30.205     return func(*args, **kwargs)
2024-04-28 08:16:30.205   File "/app/./routes/tasks/task.py", line 223, in patch
2024-04-28 08:16:30.205     task_event_handler(
2024-04-28 08:16:30.205   File "/app/./common/utils.py", line 48, in task_event_handler
2024-04-28 08:16:30.205     ret = func(session, task_id, payload)
2024-04-28 08:16:30.205   File "/app/./common/utils.py", line 287, in task_scraper_completed_event_handler
2024-04-28 08:16:30.205     save_event(
2024-04-28 08:16:30.205   File "/app/./common/utils.py", line 158, in save_event
2024-04-28 08:16:30.205     session.flush()  # we have to flush first to avoid circular dependency
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 4296, in flush
2024-04-28 08:16:30.205     self._flush(objects)
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 4432, in _flush
2024-04-28 08:16:30.205     transaction.rollback(_capture_exception=True)
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 146, in __exit__
2024-04-28 08:16:30.205     raise exc_value.with_traceback(exc_tb)
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 4392, in _flush
2024-04-28 08:16:30.205     flush_context.execute()
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 466, in execute
2024-04-28 08:16:30.205     rec.execute(self)
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 642, in execute
2024-04-28 08:16:30.205     util.preloaded.orm_persistence.save_obj(
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 85, in save_obj
2024-04-28 08:16:30.205     _emit_update_statements(
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 912, in _emit_update_statements
2024-04-28 08:16:30.205     c = connection.execute(
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1422, in execute
2024-04-28 08:16:30.205     return meth(
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 514, in _execute_on_connection
2024-04-28 08:16:30.205     return connection._execute_clauseelement(
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1644, in _execute_clauseelement
2024-04-28 08:16:30.205     ret = self._execute_context(
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1850, in _execute_context
2024-04-28 08:16:30.205     return self._exec_single_context(
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1990, in _exec_single_context
2024-04-28 08:16:30.205     self._handle_dbapi_exception(
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2357, in _handle_dbapi_exception
2024-04-28 08:16:30.205     raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1971, in _exec_single_context
2024-04-28 08:16:30.205     self.dialect.do_execute(
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 919, in do_execute
2024-04-28 08:16:30.205     cursor.execute(statement, parameters)
2024-04-28 08:16:30.205   File "/usr/local/lib/python3.8/site-packages/psycopg/cursor.py", line 732, in execute
2024-04-28 08:16:30.205     raise ex.with_traceback(None)
2024-04-28 08:16:30.205 sqlalchemy.exc.DataError: (psycopg.errors.UntranslatableCharacter) unsupported Unicode escape sequence
2024-04-28 08:16:30.205 DETAIL:  \u0000 cannot be converted to text.
2024-04-28 08:16:30.205 CONTEXT:  JSON data, line 1: ...tats": {"memory": {"max_usage": null}}, "stderr":...
2024-04-28 08:16:30.205 unnamed portal parameter $5 = '...'
2024-04-28 08:16:30.205 [SQL: UPDATE task SET updated_at=%(updated_at)s::TIMESTAMP WITHOUT TIME ZONE, events=%(events)s::JSONB, status=%(status)s::VARCHAR, timestamp=%(timestamp)s::JSONB, container=%(container)s::JSONB WHERE task.id = %(task_id)s::UUID]
2024-04-28 08:16:30.205 [parameters: {'updated_at': datetime.datetime(2024, 4, 28, 6, 16, 30, 178836), 'events': Jsonb([{'code': 'requested', 'timestamp': ... (447 chars)), 'status': 'scraper_completed', 'timestamp': Jsonb({'started': datetime.datetime(2024, ... (332 chars)), 'container': Jsonb({'image': 'ghcr.io/openzim/zimit:de ... (20168 chars)), 'task_id': UUID('5dff1ad5-d75d-4bee-a94d-502e7cbe2959')}]
2024-04-28 08:16:30.205 (Background on this error at: https://sqlalche.me/e/20/9h9h)
2024-04-28 08:16:30.206 [pid: 23|app: 0|req: 32088/231621] 100.64.6.25 () {56 vars in 1344 bytes} [Sun Apr 28 06:16:30 2024] PATCH /v1/tasks/5dff1ad5-d75d-4bee-a94d-502e7cbe2959 => generated 265 bytes in 48 msecs (HTTP/1.1 500) 3 headers in 131 bytes (1 switches on core 0)
2024-04-28 08:16:30.206 100.64.6.25 - - [28/Apr/2024:06:16:30 +0000] "PATCH /v1/tasks/5dff1ad5-d75d-4bee-a94d-502e7cbe2959 HTTP/1.1" 500 265 "-" "python-requests/2.31.0" "51.159.4.131"

Looks like a different occurence than https://github.com/openzim/zimfarm/issues/819 since the task itself seems ok. Probably linked to incoming PATCH payload but it is lost

benoit74 commented 3 months ago

Happens again and again, quite often today