Tested on https://ecolex.edw.ro using
./manage.py import_xml --input_url https://ecolex.edw.ro/static/faolex_202107.zip
It ran for about 1h and ended abruptly with an unhandled exception:
...
File "/home/web/ecolex/ecolex/legislation.py", line 278, in add_legislations
doc.save()
...
django.db.utils.OperationalError: (2006, 'MySQL server has gone away')
The legislation_import.log file wasn't really useful: I can't tell which records were processed. Strangely, there are no "INFO" lines in legislation_import.log (only DEBUG messages), but there are a lot of messages in django_errors.log (it looks like django.db.backends logs everything there):
-rw-r--r-- 1 root root 546K Dec 5 16:12 legislation_import.log
-rw-r--r-- 1 root root 2.8G Dec 5 17:04 django_errors.log
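One possible way to keep django_errors.log from growing like this is to raise the level of the django.db.backends logger. Django sends SQL statements to that logger at DEBUG level only when query logging is active (normally when settings.DEBUG is True), and the CursorDebugWrapper frame in the stack trace suggests that is the case here. A rough sketch of such a logging configuration, not checked against the project's actual settings; the "console" handler is just a placeholder for whatever handler the project really uses:

```python
import logging
import logging.config

# Hypothetical LOGGING setting; in a Django project this dict would live in
# settings.py and Django would apply it with dictConfig on startup.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        # Placeholder handler; the real project writes to django_errors.log.
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        # SQL statements are emitted at DEBUG, so INFO filters them out.
        "django.db.backends": {
            "handlers": ["console"],
            "level": "INFO",
            "propagate": False,
        },
    },
}

# Applied by hand here only so the snippet runs outside Django.
logging.config.dictConfig(LOGGING)
```

Running the import with DEBUG turned off should have a similar effect, since the debug cursor wrapper is not used then.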
LEX-FAOC050711 is indeed a very large PDF file (3146 pages).
But even when such errors occur, they must be caught and the import command needs to continue.
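Something along these lines is what I have in mind: a minimal sketch, not the actual add_legislations code. The names save_each, save, recoverable and on_error are made up for illustration, and the snippet uses only the standard library so it can be run on its own:

```python
import logging

logger = logging.getLogger("legislation_import")  # hypothetical logger name


def save_each(docs, save, recoverable=(Exception,), on_error=None):
    """Save every document; log and skip the ones that fail.

    Returns (saved, failed), where failed is a list of (doc, exception).
    """
    saved, failed = [], []
    for index, doc in enumerate(docs, start=1):
        try:
            save(doc)
        except recoverable as exc:
            failed.append((doc, exc))
            # logger.exception records the traceback together with the
            # record that caused it, so the log stays useful afterwards.
            logger.exception("record %d (%s) could not be saved", index, doc)
            if on_error is not None:
                on_error(exc)  # e.g. drop the broken DB connection
        else:
            saved.append(doc)
    return saved, failed
```

In the import command, recoverable would presumably be django.db.OperationalError and save would call doc.save(). Passing an on_error that calls django.db.connection.close() should make Django open a fresh connection on the next query, which is probably needed after a "server has gone away" error.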
See also a few other comments in the PR.
And please improve the logging (add more context to those messages, so they are meaningful even after the import has finished).
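To illustrate the kind of context I mean, here is a small sketch using only the standard library. The function names, the outcome labels and the message layout are assumptions, not something taken from the current code:

```python
import logging
from collections import Counter

logger = logging.getLogger("legislation_import")  # hypothetical logger name


def record_outcome(stats, source, index, total, doc_id, outcome, detail=""):
    """Log one record's outcome at INFO, with the input file and position."""
    stats[outcome] += 1
    suffix = f" ({detail})" if detail else ""
    logger.info("[%s] %d/%d %s: %s%s",
                source, index, total, doc_id, outcome, suffix)


def summary(stats, source):
    """Log and return a one-line summary of the whole run."""
    line = ", ".join(f"{name}={count}" for name, count in sorted(stats.items()))
    logger.info("[%s] finished: %s", source, line)
    return line


# Example run with two records from the log excerpt.
stats = Counter()
record_outcome(stats, "faolex_202107.zip", 1, 2, "LEX-FAOC042993", "updated")
record_outcome(stats, "faolex_202107.zip", 2, 2, "LEX-FAOC050711", "failed",
               "OperationalError 2006")
print(summary(stats, "faolex_202107.zip"))
```

With per-record lines and a final summary like this at INFO level, it would be possible to tell from legislation_import.log alone which records were processed and how the run ended.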
Full stack trace:
Traceback (most recent call last):
File "./manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/usr/local/lib/python3.6/site-packages/django/core/management/__init__.py", line 353, in execute_from_command_line
utility.execute()
File "/usr/local/lib/python3.6/site-packages/django/core/management/__init__.py", line 345, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/usr/local/lib/python3.6/site-packages/django/core/management/base.py", line 348, in run_from_argv
self.execute(*args, **cmd_options)
File "/usr/local/lib/python3.6/site-packages/django/core/management/base.py", line 399, in execute
output = self.handle(*args, **options)
File "/home/web/ecolex/ecolex/management/commands/import_xml.py", line 24, in handle
response = harvest_file(legislation_file.read())
File "/home/web/ecolex/ecolex/legislation.py", line 228, in harvest_file
add_legislations(legislations, count_ignored)
File "/home/web/ecolex/ecolex/legislation.py", line 278, in add_legislations
doc.save()
File "/usr/local/lib/python3.6/site-packages/django/db/models/base.py", line 708, in save
force_update=force_update, update_fields=update_fields)
File "/usr/local/lib/python3.6/site-packages/django/db/models/base.py", line 736, in save_base
updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
File "/usr/local/lib/python3.6/site-packages/django/db/models/base.py", line 801, in _save_table
forced_update)
File "/usr/local/lib/python3.6/site-packages/django/db/models/base.py", line 851, in _do_update
return filtered._update(values) > 0
File "/usr/local/lib/python3.6/site-packages/django/db/models/query.py", line 645, in _update
return query.get_compiler(self.db).execute_sql(CURSOR)
File "/usr/local/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1149, in execute_sql
cursor = super(SQLUpdateCompiler, self).execute_sql(result_type)
File "/usr/local/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 848, in execute_sql
cursor.execute(sql, params)
File "/usr/local/lib/python3.6/site-packages/django/db/backends/utils.py", line 79, in execute
return super(CursorDebugWrapper, self).execute(sql, params)
File "/usr/local/lib/python3.6/site-packages/sentry_sdk/integrations/django/__init__.py", line 500, in execute
return real_execute(self, sql, params)
File "/usr/local/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
File "/usr/local/lib/python3.6/site-packages/django/db/utils.py", line 95, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/usr/local/lib/python3.6/site-packages/django/utils/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.6/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
File "/usr/local/lib/python3.6/site-packages/django/db/backends/mysql/base.py", line 112, in execute
return self.cursor.execute(query, args)
File "/usr/local/lib/python3.6/site-packages/MySQLdb/cursors.py", line 255, in execute
self.errorhandler(self, exc, value)
File "/usr/local/lib/python3.6/site-packages/MySQLdb/connections.py", line 50, in defaulterrorhandler
raise errorvalue
File "/usr/local/lib/python3.6/site-packages/MySQLdb/cursors.py", line 252, in execute
res = self._query(query)
File "/usr/local/lib/python3.6/site-packages/MySQLdb/cursors.py", line 378, in _query
db.query(q)
File "/usr/local/lib/python3.6/site-packages/MySQLdb/connections.py", line 280, in query
_mysql.connection.query(self, query)
django.db.utils.OperationalError: (2006, 'MySQL server has gone away')
The overall number of legislation records hasn't changed (https://ecolex.edw.ro/result/?type=legislation), so I can't tell if anything has been updated or not.
Last rows in django_errors.log were:
```
[05/Dec/2021 17:04:30] DEBUG [django.db.backends:89] (0.001) None; args=('LEX-FAOC039005', 'legislation', 'http://extwprlegs1.fao.org/docs/pdf/eur39005.pdf', '<?xml version="1.0" encoding="UTF-8"?>\n<
[05/Dec/2021 17:04:30] DEBUG [django.db.backends:89] (0.006) None; args=('LEX-FAOC042993', '2021-12-05 15:12:35.169401')
[05/Dec/2021 17:04:30] DEBUG [django.db.backends:89] (0.001) None; args=('LEX-FAOC042993', 'legislation', 'http://extwprlegs1.fao.org/docs/texts/par42993.doc', '<?xml version="1.0" encoding="UTF-8"?>\
[05/Dec/2021 17:04:31] DEBUG [django.db.backends:89] (0.845) None; args=('LEX-FAOC050711', '2021-12-05 15:12:35.169401')
[05/Dec/2021 17:04:31] DEBUG [django.db.backends:89] (0.648) None; args=('LEX-FAOC050711', 'legislation', 'http://extwprlegs1.fao.org/docs/pdf/eur50711.pdf', '<?xml version="1.0" encoding="UTF-8"?>\n<
[05/Dec/2021 17:04:32] DEBUG [django.db.backends:89] (0.011) None; args=None
```