MetPX / sarracenia

https://MetPX.github.io/sarracenia
GNU General Public License v2.0

`ComputeIdentity` crashing instance when xattr identity not found #1151

Closed by andreleblanc11 2 months ago

andreleblanc11 commented 2 months ago

The instance crashes when xattr cannot find the identity while performing a `listxattr`; in that case the identity is a null value.

Traceback (most recent call last):
  File "/net/local/home/leblanca/.local/lib/python3.10/site-packages/sarracenia/instance.py", line 249, in <module>
    i.start()
  File "/net/local/home/leblanca/.local/lib/python3.10/site-packages/sarracenia/instance.py", line 240, in start
    self.running_instance.run()
  File "/net/local/home/leblanca/.local/lib/python3.10/site-packages/sarracenia/flow/__init__.py", line 600, in run
    self.work()
  File "/net/local/home/leblanca/.local/lib/python3.10/site-packages/sarracenia/flow/__init__.py", line 1228, in work
    self._runCallbacksWorklist('after_work')
  File "/net/local/home/leblanca/.local/lib/python3.10/site-packages/sarracenia/flow/__init__.py", line 300, in _runCallbacksWorklist
    p(self.worklist)
  File "/net/local/home/leblanca/.config/sr3/plugins/rename/sumstr_ext.py", line 65, in after_work
    msg.computeIdentity(old_path,self.o)
  File "/net/local/home/leblanca/.local/lib/python3.10/site-packages/sarracenia/__init__.py", line 431, in computeIdentity
    if fxainteg['method'] == o.identity_method:
TypeError: string indices must be integers

https://github.com/MetPX/sarracenia/blob/7c6313cc60da9f9765ca1d16abe21fae03025fbd/sarracenia/filemetadata.py#L137-L151
https://github.com/MetPX/sarracenia/blob/7c6313cc60da9f9765ca1d16abe21fae03025fbd/sarracenia/__init__.py#L416-L435
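
To illustrate the failure mode in the traceback, here is a minimal sketch of a defensive check. It is not the sarracenia fix: `identity_matches` is a hypothetical helper, and the assumption is only that the stored identity may come back as something other than a dict (for example `None` or a plain string) when the attribute is missing.

```python
def identity_matches(stored_identity, wanted_method):
    """Return True only if stored_identity is a dict whose 'method' equals wanted_method.

    Hypothetical helper: anything that is not a dict (None, a str, ...) is
    treated as "no usable identity" instead of being indexed with ['method'].
    """
    if not isinstance(stored_identity, dict):
        return False
    return stored_identity.get('method') == wanted_method


# Indexing a string with a string key is what raises the TypeError in the traceback.
try:
    "not-a-dict"['method']
except TypeError:
    reproduced = True
else:
    reproduced = False

print(reproduced)
print(identity_matches({'method': 'sha512', 'value': 'abc'}, 'sha512'))
print(identity_matches(None, 'sha512'))
```

With a guard like this, a caller could fall back to recomputing the checksum instead of crashing the instance.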

andreleblanc11 commented 2 months ago

The problem was deeper than originally anticipated.

The `cod,sha512` identity was not calculated on download, leaving the checksum empty in the sarracenia message. As a result, the extended attributes (from xattr) that sarracenia generates for the file also carried an incorrect identity.
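
As a rough sketch of what resolving such an identity could look like: once the downloaded bytes are available, a `cod` identity is replaced by the named algorithm and its digest. `resolve_cod_identity` is a hypothetical function, not sarracenia's API, and the base64 encoding of the digest is an assumption made for this illustration.

```python
import base64
import hashlib


def resolve_cod_identity(identity, data):
    """Replace a 'cod' identity with a computed one (hypothetical helper).

    identity: dict such as {'method': 'cod', 'value': 'sha512'}
    data:     the downloaded file content as bytes
    """
    if identity.get('method') != 'cod':
        # Nothing to compute: the identity already names a concrete method.
        return identity
    algorithm = identity['value']  # e.g. 'sha512'
    digest = hashlib.new(algorithm, data).digest()
    # Assumption: the digest is stored base64-encoded in the message.
    return {'method': algorithm, 'value': base64.b64encode(digest).decode('ascii')}


resolved = resolve_cod_identity({'method': 'cod', 'value': 'sha512'}, b'example payload')
print(resolved['method'])
```

The same resolved dict could then be written to the file's extended attributes, so the message and the xattr agree.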

See the example below. Before the patch, the message did not carry the correct identity fields.

2024-08-02 19:26:32,516 [CRITICAL] rename.sumstr_ext after_work This is is the message {'_format': 'v02', '_deleteOnPost': {'old_relPath', 'new_inflight_path', 'old_subtopic', 'new_baseUrl', 'new_path',
'data_checksum', 'new_relPath', 'new_subtopic', 'onfly_checksum', 'old_format', 'report', '_matches', 'new_file', 'subtopic', 'post_format', '_format', 'exchange', 'local_offset', 'old_baseUrl', 'source', 'new_dir'}, 'to_clusters': 'ALL', 'sundew_extension': 'AQ:ONT:AIRNOW:ASCII:', 'from_cluster': 'DDSR.CMC-DEV', 'source': 'PROVINCIAL', 'mtime': '20240802T181908', 'atime': '20240802T181941', 'pubTime': '20240802T192630.8837955', 'baseUrl': 'http://ddsr-cmc-dev02.cmc.ec.gc.ca/', 'relPath': '20240802/PROVINCIAL/AIRNOW/ASCII/ONT/19/080213.ont', 'subtopic': ['20240802', 'PROVINCIAL', 'AIRNOW', 'ASCII', 'ONT', '19'], 'identity': {'method': 'cod', 'value': 'sha512'}, 'size': 46609, 'exchange': 'xs_PROVINCIAL', 'local_offset': 0, '_matches': <_sre.SRE_Match object; span=(0, 77), match='sftp://pds@my-destinateion/data/depot/pds/on>, 'new_dir': '/apps/sarra/public_data/20240802/PROVINCIAL/AIRNOW/ASCII/ONT/19', 'new_file': '080213.ont', 'post_format': 'v02', 'new_baseUrl': 'http://my-host.cmc.ec.gc.ca/', 'new_relPath': '20240802/PROVINCIAL/AIRNOW/ASCII/ONT/19/080213.ont', 'new_subtopic': ['20240802', 'PROVINCIAL', 'AIRNOW', 'ASCII', 'ONT', '19'], 'new_inflight_path': '080213.ont', 'new_path': '/apps/sarra/public_data/20240802/PROVINCIAL/AIRNOW/ASCII/ONT/19/080213.ont', 'contentType': 'text/plain', 'onfly_checksum': None, 'data_checksum': None, 'report': {'code': 201, 'timeCompleted': '20240802T192632.515742064', 'message': 'Download successful'}, 'old_baseUrl': 'sftp://pds@px-paz3.cmc.ec.gc.ca/', 'old_relPath': 'data/depot/pds/on_airqual/incoming/080213.ont', 'old_subtopic': ['data', 'depot', 'pds', 'on_airqual', 'incoming'], 'old_format': 'v02'}