Closed ericzinnikas closed 5 years ago
Does this happen only when there are failures? or all the time? FWIW, I haven't seen this happen before when using PSQ, though I can't think of why that would affect it.
Happens all the time. Inserted some prints to debug and even after the first job finishes (plaso) there's no request id in the result object that is returned. Maybe I broke something with the serialization code I added. Will test again before that diff and report back.
On Mon, Dec 3, 2018, 18:44 Aaron Peterson <notifications@github.com> wrote:
> Does this happen only when there are failures? or all the time? FWIW, I haven't seen this happen before when using PSQ, though I can't think of why that would affect it.
Does this still happen?
Working as expected now.
I haven't had a chance to dig into this further, but I'm noticing that request_id is not passed on to child evidence objects or tasks. I'm not sure if this is specific to Celery or affects PSQ as well.
For example (the following logs are from some debug statements on the server), sending a RawDisk evidence for processing we see the following PlasoTask / RawDisk objects created:
{'result': None, 'tmp_dir': None, 'name': 'PlasoTask', 'run_local': False, '_evidence_config': {}, 'base_output_dir': u'/evidence/output', 'state_key': u'TurbiniaTask:5216aea82a2b48b4bfb1601e84c16451', 'last_update': '2018-11-30 15:54:30.436344', 'stub': None, 'user': 'ericwz', 'request_id': u'731123b983434cb59514caf327cd3129', 'output_manager': {'_output_writers': None, 'is_setup': False}, 'id': '5216aea82a2b48b4bfb1601e84c16451', 'output_dir': None}
{u'mount_partition': 1, u'type': u'RawDisk', u'mount_path': None, u'tags': {}, u'processed_by': [], u'copyable': False, u'saved_path_type': None, u'name': u'test', u'source': u'example', u'saved_path': None, u'loopdevice_path': None, u'request_id': u'731123b983434cb59514caf327cd3129', u'local_path': u'/Users/ericwz/SCHARDT.dd', u'size': None, u'config': {}, u'cloud_only': False, u'description': None}
But then once that completes, the following PsortTask/PlasoFile objects now have request_id: None:
{'result': None, 'tmp_dir': None, 'name': 'PsortTask', 'run_local': False, '_evidence_config': {}, 'base_output_dir': u'/evidence/output', 'state_key': u'TurbiniaTask:92bef4ad649d48cd91b575692588c880', 'last_update': '2018-11-30 15:57:32.114775', 'stub': None, 'user': 'ericwz', 'request_id': None, 'output_manager': {'_output_writers': None, 'is_setup': False}, 'id': '92bef4ad649d48cd91b575692588c880', 'output_dir': None}
{u'plaso_version': None, u'description': None, u'tags': {}, u'type': u'PlasoFile', u'copyable': True, u'source': None, u'saved_path': None, u'saved_path_type': None, u'request_id': None, u'local_path': u'/evidence/output/1543622070-6306431d0f8a4a8d830af57e4b502652-PlasoTask/6306431d0f8a4a8d830af57e4b502652.plaso', u'config': {}, u'processed_by': [], u'cloud_only': False, u'name': u'PlasoFile'}
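To make the pattern in these dumps easier to spot, here is a minimal sketch of a check over serialized task and evidence dictionaries. The helper name find_missing_request_id is hypothetical and not part of Turbinia; the sample data is a trimmed copy of the four dumps above with only the relevant keys kept.

```python
def find_missing_request_id(objects):
    """Return labels of serialized objects whose request_id is unset.

    Hypothetical debugging helper, not part of Turbinia.
    """
    missing = []
    for obj in objects:
        if obj.get('request_id') is None:
            # Evidence dumps carry a 'type' key; task dumps only have 'name'.
            missing.append(obj.get('type') or obj.get('name'))
    return missing


# Trimmed copies of the four dumps above (only the relevant keys kept).
dumps = [
    {'name': 'PlasoTask', 'request_id': '731123b983434cb59514caf327cd3129'},
    {'type': 'RawDisk', 'name': 'test',
     'request_id': '731123b983434cb59514caf327cd3129'},
    {'name': 'PsortTask', 'request_id': None},
    {'type': 'PlasoFile', 'name': 'PlasoFile', 'request_id': None},
]

print(find_missing_request_id(dumps))  # ['PsortTask', 'PlasoFile']
```

Running a check like this on the server's debug output after each task completes would show at which step the id first goes missing.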
I see that in workers/__init__.py, line 139, we do copy over the request_id when closing a task, but apparently at this point the TurbiniaTaskResult object's request_id is already None as well. So I'm not sure where it is getting lost. If I have time later today I'll try to track it down and update here.
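As an illustration of why copying at close time does not help once the result's own id is None, here is a rough sketch of one way to propagate the id and fail loudly when it is missing. The Evidence and TaskResult classes below are simplified stand-ins written for this example; they are assumptions and do not reflect Turbinia's actual classes or the code at the line mentioned above.

```python
class Evidence:
    """Simplified stand-in for an evidence object (hypothetical)."""

    def __init__(self, name, request_id=None):
        self.name = name
        self.request_id = request_id


class TaskResult:
    """Simplified stand-in for a task result (hypothetical)."""

    def __init__(self, request_id=None):
        self.request_id = request_id
        self.evidence = []

    def add_evidence(self, evidence):
        # Stamp the child as soon as it is attached, rather than
        # relying only on a copy made when the task is closed.
        if evidence.request_id is None:
            evidence.request_id = self.request_id
        self.evidence.append(evidence)

    def close(self):
        # If the result itself has lost its id, copying it to the
        # children silently propagates None, so raise instead.
        if self.request_id is None:
            raise ValueError('result has no request_id to propagate')
        for evidence in self.evidence:
            evidence.request_id = self.request_id
        return self.evidence


result = TaskResult(request_id='731123b983434cb59514caf327cd3129')
child = Evidence('PlasoFile')
result.add_evidence(child)
print(child.request_id)
```

A guard like the one in close() would turn the silent None seen in the logs into an error at the point where the id was first missing, which narrows down where it is lost.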