golemfactory / concent

Repository for Concent Service sources
8 stars 7 forks source link

Concent responds with `ForceGetTaskResultRejected` to a valid `ForceGetTaskResult` #519

Closed shadeofblue closed 6 years ago

shadeofblue commented 6 years ago

When sending a ForceGetTaskResultRejected message to the Concent, Golem Client receives:

ForceGetTaskResultRejected(header=MessageHeader(type_=4012, timestamp=1529330495, encrypted=True), sig=b'V\x8c\x1d\xc9\x
f4\xcd}\xb4\xa9\xfe\x1c\xc0\x9d\xb3\xccq\x8f\x9f\x15\x05\xdb@\xc0jyD\x9e\x03\x8b\x19`Z$Z\xef\xe6\xa1\x08\xe4\x05\x87\x8a\x05\xce\x9e\x1f\xf88Q\xc4n*\x84\x1b\xe5\xef\x05\xc3\t\xb5L\xb7\xa9?\x01', slots=[['force_g
et_task_result', None], ['reason', 'acceptance_time_limit_exceeded']])

The ForceGetTaskResult message:

ForceGetTaskResult(header=MessageHeader(type_=4009, timestamp=1529330495, encrypted=False), sig=None, slots=
[['report_computed_task', ReportComputedTask(header=MessageHeader(type_=2004, timestamp=1529330495, encrypted=False), sig=b'!\xaf\xf8\xcfn\xa4\xf9\xab\n1\x1b<\x83\xfe\xf0=OO\xfc\xd3\x87}\x11#8\x99#^>\xe6\x11Ggl\
x04o\xe4\x8e\xb1\x00\xbf\x7fnT\x12D\x17\x99w\n\xe5\x1e\xfe$4k\xb9\xfc\xe8\xdcs#\x13\x88\x00', slots=[['result_type', 1], ['node_name', ''], ['address', '10.30.8.141'], ['node_info', {'hyperdrive_prv_port': 3282,
 'hyperdrive_pub_port': 3282, 'key': '627f6a8fdd03bb97dd901c97cd3fb6ff797b17dff8164e0b21a4f1b0e238d97bb120eed3eeffa82b38ca559935be0787172222535a472e1c362706128ae527c6', 'nat_type': 'Symmetric NAT', 'node_name': 
'', 'p2p_prv_port': 40104, 'p2p_pub_port': 40104, 'port_statuses': {3282: 'timeout', 40104: 'timeout', 40105: 'timeout'}, 'prv_addr': '10.30.8.141', 'prv_addresses': ['10.30.8.141', '192.168.246.1', '192.168.107
.1', '172.17.0.1'], 'prv_port': 40105, 'pub_addr': '194.181.80.91', 'pub_port': 40105}], ['port', 40105], ['key_id', '627f6a8fdd03bb97dd901c97cd3fb6ff797b17dff8164e0b21a4f1b0e238d97bb120eed3eeffa82b38ca559935be0
787172222535a472e1c362706128ae527c6'], ['extra_data', ['test task_10001.png', 'stdout.log', 'stderr.log']], ['eth_account', '0x2bFa2e5D25557f9d0f965EDAb8d2ceC1Ab2f97Cc'], ['task_to_compute', TaskToCompute(header
=MessageHeader(type_=2002, timestamp=1529330484, encrypted=False), sig=b'\xdf\xa0\xed\xd8-\x94!M\xe9\xa5\x96Lc\xeeH\xe2\x07/\xb3\xc8d\x98g\x9d\xf5\xdd\xa4`\xf6\xefoLb>8;"\xee\xbc*\x10\x02\xbb7\x9c)"\x04\x81\x1d\
xd1x\xed\xb8\xa8>\xf4\xfd\x9a\xc3\x0c\xc8\xfc\xc6\x00', slots=[['requestor_id', '31f528988577db5c34355fc94719d2ff8f1747452c6ba5d26d3966a52ddbe2e1f8128952ce0dcd02ca744063803a4d38b99c59e42c368b91eb42ba7d59997241']
, ['requestor_public_key', '31f528988577db5c34355fc94719d2ff8f1747452c6ba5d26d3966a52ddbe2e1f8128952ce0dcd02ca744063803a4d38b99c59e42c368b91eb42ba7d59997241'], ['requestor_ethereum_public_key', '31f528988577db5c
34355fc94719d2ff8f1747452c6ba5d26d3966a52ddbe2e1f8128952ce0dcd02ca744063803a4d38b99c59e42c368b91eb42ba7d59997241'], ['provider_id', '627f6a8fdd03bb97dd901c97cd3fb6ff797b17dff8164e0b21a4f1b0e238d97bb120eed3eeffa8
2b38ca559935be0787172222535a472e1c362706128ae527c6'], ['provider_public_key', '627f6a8fdd03bb97dd901c97cd3fb6ff797b17dff8164e0b21a4f1b0e238d97bb120eed3eeffa82b38ca559935be0787172222535a472e1c362706128ae527c6'], 
['provider_ethereum_public_key', '627f6a8fdd03bb97dd901c97cd3fb6ff797b17dff8164e0b21a4f1b0e238d97bb120eed3eeffa82b38ca559935be0787172222535a472e1c362706128ae527c6'], ['compute_task_def', {'deadline': 1529331015.
534439, 'docker_images': [{'image_id': None, 'repository': 'golemfactory/blender', 'tag': '1.4'}], 'extra_data': {'end_task': 1, 'frames': [1], 'outfilebasename': 'test task', 'output_format': 'PNG', 'path_root'
: '/home/blue/PycharmProjects/golem/scripts/concent_node_tests/tasks/test_task_1', 'scene_file': '/golem/resources/wlochaty3.blend', 'script_src': '# This template is rendered by\n# apps.blender.resources.scenef
ileeditor.generate_blender_crop_file(),\n# written to tempfile and passed as arg to blender.\nimport bpy\n\nclass EngineWarning(bpy.types.Operator):\n    bl_idname = "wm.engine_warning"\n    bl_label = "Inform a
bout not supported rendering engine"\n\n    def execute(self, context):\n        self.report({"ERROR"}, "Engine " + bpy.context.scene.render.engine + \\\n                               " not supported by Golem")
\n        return {"FINISHED"}\n\nclass ShowInformation(bpy.types.Operator):\n    bl_idname = "wm.scene_information"\n    bl_label = "Inform user about scene settings"\n\n\n    def execute(self, context):\n      
  self.report({"INFO"}, "Engine: " +\n                              str(bpy.context.scene.render.engine))\n        if bpy.context.scene.render.engine == "CYCLES":\n            self.report({"INFO"}, "Samples: " +
 str(bpy.context.scene.cycles.samples))\n        self.report({"INFO"}, "Resolution: " +\n                              str(bpy.context.scene.render.resolution_x) +\n                               " x " +\n      
                         str(bpy.context.scene.render.resolution_y))\n        self.report({"INFO"}, "File format: " +\n                               str(bpy.context.scene.render.file_extension))\n        self.r
eport({"INFO"}, "Filepath: " +\n                              str(bpy.context.scene.render.filepath))\n        self.report({"INFO"}, "Frames: " +\n                              str(bpy.context.scene.frame_start)
 + "-" +\n                              str(bpy.context.scene.frame_end) + ";" +\n                              str(bpy.context.scene.frame_step))\n\n        return {"FINISHED"}\n\n\nbpy.utils.register_class(Eng
ineWarning)\nengine = bpy.context.scene.render.engine\nif engine not in ("BLENDER_RENDER", "CYCLES"):\n    bpy.ops.wm.engine_warning()\n\nbpy.utils.register_class(ShowInformation)\nbpy.ops.wm.scene_information()
\n\nbpy.context.scene.render.tile_x = 0\nbpy.context.scene.render.tile_y = 0\nbpy.context.scene.render.resolution_x = 320\nbpy.context.scene.render.resolution_y = 240\nbpy.context.scene.render.resolution_percent
age = 100\nbpy.context.scene.render.use_border = True\nbpy.context.scene.render.use_crop_to_border = True\nbpy.context.scene.render.border_max_x = 1.0\nbpy.context.scene.render.border_min_x = 0.0\nbpy.context.sc
ene.render.border_min_y = 0.0\nbpy.context.scene.render.border_max_y = 1.0\nbpy.context.scene.render.use_compositing = bool(False)\nif engine == "CYCLES":\n    samples = 0\n    if samples != 0:\n        bpy.cont
ext.scene.cycles.samples = samples\n\n#and check if additional files aren\'t missing\nbpy.ops.file.report_missing_files()\n', 'start_task': 1, 'total_tasks': 1}, 'performance': 476.9116316324916, 'short_descript
ion': 'path_root: /home/blue/PycharmProjects/golem/scripts/concent_node_tests/tasks/test_task_1, start_task: 1, end_task: 1, total_tasks: 1, outfilebasename: test task, scene_file: /golem/resources/wlochaty3.ble
nd', 'src_code': 'from __future__ import print_function\n\nimport os\nimport subprocess\nimport sys\nfrom multiprocessing import cpu_count\n\nimport params  # This module is generated before this script is run\n
\nBLENDER_COMMAND = "blender"\nWORK_DIR = "/golem/work"\nOUTPUT_DIR = "/golem/output"\n\n\ndef exec_cmd(cmd):\n    pc = subprocess.Popen(cmd)\n    return pc.wait()\n\n\ndef format_blender_render_cmd(outfilebasen
ame, scene_file, script_file,\n                              start_task, frame, output_format):\n    cmd = [\n        "{}".format(BLENDER_COMMAND),\n        "-b", "{}".format(scene_file),\n        "-y",  # enabl
e scripting by default\n        "-P", "{}".format(script_file),\n        "-o", "{}/{}_{}".format(OUTPUT_DIR, outfilebasename, start_task),\n        "-noaudio",\n        "-F", "{}".format(output_format.upper()),\
n        "-t", "{}".format(cpu_count()),\n        "-f", "{}".format(frame)\n    ]\n    return cmd\n\n\ndef run_blender_task(outfilebasename, scene_file, script_src, start_task,\n                     frames, outp
ut_format):\n    scene_file = os.path.normpath(scene_file)\n    if not os.path.exists(scene_file):\n        print("Scene file \'{}\' does not exist".format(scene_file),\n              file=sys.stderr)\n        s
ys.exit(1)\n\n    blender_script_path = WORK_DIR + "/blenderscript.py"\n    with open(blender_script_path, "w") as script_file:\n        script_file.write(script_src)\n\n    for frame in frames:\n        cmd = f
ormat_blender_render_cmd(outfilebasename, scene_file,\n                                        script_file.name, start_task, frame, output_format)\n        print(cmd, file=sys.stderr)\n        exit_code = exec_c
md(cmd)\n        if exit_code is not 0:\n            sys.exit(exit_code)\n\n\nrun_blender_task(params.outfilebasename, params.scene_file, params.script_src, params.start_task, params.frames,\n                 pa
rams.output_format)\n\n', 'subtask_id': '161a5edc-7300-11e8-b8c7-31f528988577', 'task_id': 'ed203d18-72ff-11e8-ad1e-31f528988577', 'working_directory': '.'}], ['package_hash', 'sha1:245ad6ad2264a31bcb37db8352f6c
5d0a319ded0'], ['size', 44720896], ['concent_enabled', True], ['price', 163888888888888889]])], ['size', 261056], ['package_hash', 'sha1:42ff7c7d43c6865e3ad589fade4b2a46d1b42bca'], ['multihash', '2df2c39f408df3e
6f72f8ad50900f09d51f18bfedaf31dbeb3664e06d482f162'], ['secret', b'\x15W\xc6\r^\xcb\xf0\xc2yy\x82\x13\xd7\xf4\x11\xdfIO\xca1\xf2Q\x85\x86\x06#J\xd1'], ['options', {'client_id': 'hyperg', 'options': {'peers': [{'T
CP': ['10.30.8.141', 3282]}, {'TCP': ['10.30.8.141', 3282]}, {'TCP': ['192.168.246.1', 3282]}, {'TCP': ['192.168.107.1', 3282]}, {'TCP': ['194.181.80.91', 3282]}, {'TCP': ['172.17.0.1', 3282]}]}, 'version': 1.1}
]])]])

and the SVT here is ~4 hours whereas the message was sent after a few seconds from the reception of the RCT, after the Requestor node failed to download the results from the Provider

shadeofblue commented 6 years ago

to reproduce, you can run: ./tests/test_force_download.py from the concent_node_tests suite

shadeofblue commented 6 years ago

ok, I believe it's because the timestamp of FRCT equals the timestamp of RCT ... I think it should still be legal to send the FRCT (almost) immediately after having received RCT (after some critical download failure) but as far as tests are concerned, adding a short wait in the test fixes the issue

cameel commented 6 years ago

@dybi this may be of interest to you. The inequalities are back :)

I think it should still be legal to send the FRCT (almost) immediately after having received RCT (after some critical download failure)

You mean FGTR, right?

The spec does not say anything about it so we had to make assumptions. In Concent the time intervals are left-open, right-closed. So the provider is safe from FGRT until RCT.timestamp. Only after that point we start accepting FGTR.

It's not set in stone and we can change it but let's do it consistently everywhere.

shadeofblue commented 6 years ago

@cameel yes, FGTR, sorry

shadeofblue commented 6 years ago

@cameel would it result in any negative consequences for the time intervals to to be closed on both ends? ...

cameel commented 6 years ago

I wanted to avoid the possibility of overlap. The one negative consequence would be that if we had two intervals next to each other we would have a point of overlap where the result would not be deterministic (it would depend on which message comes first). For example interval [1, 2] where we expect message A and interval [2, 3] where we expect B if we did not receive A. Receiving both A and B at 2 would be a problem.

But after looking at the timing diagram I don't see any case of this so changing time intervals to closed on both sides is doable. But there's still a small risk that something like this will come up due to a future changes in the spec.

cameel commented 6 years ago

I'm closing this one since we now know the cause.

After discussing this on Rocket the consensus is that we can safely remove the lower time limit in the result transfer use case. I have created a separate issue for that: #532.