Open jarrodharvey opened 6 years ago
I attempted to reproduce this issue but was not able to in my one test. I tried both Archivematica 1.7.1 and a deployment from qa/1.x.
I edited the XMP metadata of a sample PDF file and added an ampersand to the Contributor metadata field (that field was previously blank in my test PDF). In the stdout shown in the task details for the characterize and extract job, I can see the ampersand displayed:
Producer Adobe PDF Library 9.0
Title Technology responsiveness for digital preservation: a model
Contributor Nance & friends
Creator N.Y. McGovern
PageLayout OneColumn
PageCount 306
Perhaps I need to insert the ampersand into a custom metadata field? I am not sure how to do that. @jarrodharvey are you able to share a sample file that reproduces this problem?
Thank you for helping test this, jhsimpson.
Were you testing using '&' or '＆'? The latter is a 'different' version of the ampersand symbol (not sure of the proper terminology here) and that's the one that both of my 1.7.1 testing environments break on in the exact same way. I should have specified.
Expected behaviour Archivematica workflow should run until completion.
Current behaviour Archivematica hangs at this point:
MCPServer.debug.log contains the following error message:
<field name="Correspondence To - iMIS Number">\r\n \r\n \r\n \r\n 542facba-391c-4236-9f41-509c46f20258 \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n D:20160726004730 \r\n \r\n \r\n 3;#Form|e2231b15-9433-4e50-9278-a5702bf4dd62 \r\n \r\n \r\n Elizabeth Milford \r\n \r\n \r\n Adobe PDF Library 10.0 \r\n \r\n \r\n \r\n \r\n\r\n\r\n\n', 'exitCode': 1, 'stdError': '\nTraceback (most recent call last):\n File "/usr/lib/archivematica/MCPClient/clientScripts/characterizeFile.py", line 101, in <module>\n sys.exit(main(file_path, file_uuid, sip_uuid))\n File "/usr/lib/archivematica/MCPClient/clientScripts/characterizeFile.py", line 81, in main\n insertIntoFPCommandOutput(file_uuid, stdout, rule.uuid)\n File "/usr/lib/archivematica/archivematicaCommon/databaseFunctions.py", line 211, in insertIntoFPCommandOutput\n rule_id=ruleUUID)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/django/db/models/manager.py", line 127, in manager_method\n return getattr(self.get_queryset(), name)(*args, **kwargs)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/django/db/models/query.py", line 348, in create\n obj.save(force_insert=True, using=self.db)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/django/db/models/base.py", line 734, in save\n force_update=force_update, update_fields=update_fields)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/django/db/models/base.py", line 762, in save_base\n updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/django/db/models/base.py", line 846, in _save_table\n result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/django/db/models/base.py", line 885, in _do_insert\n using=using, raw=raw)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/django/db/models/manager.py", line 127, in manager_method\n return getattr(self.get_queryset(), name)(*args, **kwargs)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/django/db/models/query.py", line 920, in _insert\n return query.get_compiler(using=using).execute_sql(return_id)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 974, in execute_sql\n cursor.execute(sql, params)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/django/db/backends/utils.py", line 64, in execute\n return self.cursor.execute(sql, params)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/django/db/utils.py", line 98, in __exit__\n six.reraise(dj_exc_type, dj_exc_value, traceback)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/django/db/backends/utils.py", line 64, in execute\n return self.cursor.execute(sql, params)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/django/db/backends/mysql/base.py", line 124, in execute\n return self.cursor.execute(query, args)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/MySQLdb/cursors.py", line 250, in execute\n self.errorhandler(self, exc, value)\n File "/usr/share/archivematica/virtualenvs/archivematica-mcp-client/local/lib/python2.7/site-packages/MySQLdb/connections.py", line 42, in defaulterrorhandler\n raise errorvalue\ndjango.db.utils.OperationalError: (1366, "Incorrect string value: \'\\xEF\\xBC\\x86 Re...\' for column \'content\' at row 1")\n'}
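The byte sequence quoted in the error message identifies the offending character. As a minimal sketch in Python 3 (the variable names are illustrative and not taken from Archivematica), those bytes can be decoded and compared with the ordinary ASCII ampersand:

```python
import unicodedata

# Bytes quoted in the OperationalError above: \xEF\xBC\x86
raw = b"\xef\xbc\x86"
char = raw.decode("utf-8")

# Code point and Unicode name of the decoded character
print(hex(ord(char)))
print(unicodedata.name(char))

# The ordinary ampersand, for comparison
print(unicodedata.name("&"))
print(len("&".encode("utf-8")), "byte vs", len(raw), "bytes in UTF-8")
```

The bytes decode to U+FF06, FULLWIDTH AMPERSAND, a three-byte UTF-8 character, whereas the ASCII ampersand (U+0026) is a single byte. That would be consistent with the plain '&' passing through while this character is rejected on insert into the 'content' column.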
Steps to reproduce Put a '＆' symbol into a PDF file's embedded metadata. Here is a live example from our environment:
Removing the ＆ symbol allows the workflow to run normally.
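As a possible stopgap while the underlying problem is open, the fullwidth character could be folded to its ASCII equivalent before ingest rather than removed. The following is only a sketch under assumptions: `fold_compat` is a hypothetical helper, not part of Archivematica, and the sample string is made up for illustration. It relies on Unicode NFKC normalization, which maps U+FF06 to U+0026.

```python
import unicodedata


def fold_compat(text):
    """Return text with compatibility characters folded via NFKC.

    Hypothetical helper for illustration; not an Archivematica function.
    """
    return unicodedata.normalize("NFKC", text)


# Made-up metadata value containing the fullwidth ampersand (U+FF06)
sample = "Nance \uff06 friends"
folded = fold_compat(sample)

print(folded)
print("\uff06" in folded)
```

Note that NFKC also rewrites other compatibility characters (ligatures, fullwidth letters and digits, and so on), so this changes the embedded metadata and may not be acceptable where the original values must be preserved exactly.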
Your environment (version of Archivematica, OS version, etc) Archivematica version 1.7.1, Ubuntu Xenial.