golemfactory / clay

Golem is creating a global market for computing power.
https://golem.network
GNU General Public License v3.0

Cannot create docker volume, requesting on Windows #4246

Closed ederenn closed 5 years ago

ederenn commented 5 years ago

Description

Golem Version: 0.19.2+dev518.gd81005c

Golem-Messages version (leave empty if unsure): 3.4.0

Electron version (if used): 0.20

OS [e.g. Windows 10 Pro]: Windows 10 Pro

Branch (if launched from source): develop

Mainnet/Testnet: mainnet

Description of the issue:

When requesting on a Windows machine, Golem does not accept subtask results during verification. Every subtask ends in failure.

(golem-env) C:\Users\ederenn\projects\golem>python golemcli.py --mainnet subtasks show 77dc7c7a-8141-11e9-ab61-845cd9f61ab5
deadline: 1559045967
extra_data: {'scene_file': '/golem/resources/scene-Helicopter-27-cycles.blend', 'resolution': [400, 400], 'use_compositing': False, 'samples': 0, 'frames': [1], 'output_format': 'PNG', 'path_root': 'D:\\test files', 'start_task': 1, 'total_tasks': 2, 'crops': [{'outfilebasename': 'scene-Helicopter-2_1', 'borders_x': [0.0, 1.0], 'borders_y': [0.5, 1.0]}], 'entrypoint': 'python3 /golem/entrypoints/render_entrypoint.py'}
node_id: 6c15fa96c1253ac738f327b853669cfaa0b4d75a77be25830c70b48a6493fef842bfbd8d8ccc2372bdabd775a1384f93df0a379ac03ad9520d051e09d6a66c61
node_name: oncoming storm
price: 100000000000000000
progress: 1.0
results: ['C:\\Users\\ederenn\\AppData\\Local\\golem\\golem\\default\\mainnet\\ComputerRes\\6ae216d2-8141-11e9-ba98-845cd9f61ab5\\tmp\\scene-Helicopter-2_10001.png']
status: Failure
stderr: [GOLEM] Not accepted
stdout: C:\Users\ederenn\AppData\Local\golem\golem\default\mainnet\ComputerRes\6ae216d2-8141-11e9-ba98-845cd9f61ab5\tmp\77dc7c7a-8141-11e9-ab61-845cd9f61ab5\stdout.log
subtask_id: 77dc7c7a-8141-11e9-ab61-845cd9f61ab5
time_started: 1559045379

Actual result:

Golem won't finish any task.


Steps To Reproduce

  1. Launch two nodes on a private network in debug mode: Linux for providing, Windows for requesting.
  2. Start a task on the Windows node: 400x400, 1 frame, two subtasks.
  3. After a subtask fails, check it from the CLI.
  4. Check the logs.

Expected behavior

Subtasks should pass verification.

Logs and any additional context

from golem.log:

2019-05-27 16:37:38 INFO     apps.core.verification_queue        Running verification of subtask 'f0d4e1fe-808c-11e9-abcd-845cd9f61ab5'
2019-05-27 16:37:38 WARNING  golem.task.taskthread               Task computing error Cannot create docker volume
2019-05-27 16:37:38 INFO     apps.core.verification_queue        Running verification of subtask 'f0d4e1fe-808c-11e9-abcd-845cd9f61ab5'
2019-05-27 16:37:38 WARNING  golem.task.taskthread               Task computing error Cannot create docker volume
2019-05-28 14:33:28 DEBUG    apps.core.verification_queue        Verification Queue submit: (verifier_class: functools.partial(<class 'golem.verificator.blender_verifier.BlenderVerifier'>, docker_task_cls=<class 'golem.docker.task_thread.DockerTaskThread'>), subtask: aed8dccc-8144-11e9-a54c-845cd9f61ab5, deadline: 1559047325, kwargs: {'subtask_info': {'scene_file': '/golem/resources/scene-Helicopter-27-cycles.blend', 'resolution': [400, 400], 'use_compositing': False, 'samples': 0, 'frames': [1], 'output_format': 'PNG', 'path_root': 'D:\\test files', 'start_task': 1, 'total_tasks': 2, 'crops': [{'outfilebasename': 'scene-Helicopter-3_1', 'borders_x': [0.0, 1.0], 'borders_y': [0.5, 1.0]}], 'entrypoint': 'python3 /golem/entrypoints/render_entrypoint.py', 'subtask_id': 'aed8dccc-8144-11e9-a54c-845cd9f61ab5', 'status': <SubtaskStatus.verifying: 'Verifying'>, 'node_id': '6c15fa96c1253ac738f327b853669cfaa0b4d75a77be25830c70b48a6493fef842bfbd8d8ccc2372bdabd775a1384f93df0a379ac03ad9520d051e09d6a66c61', 'parts': 1, 'res_x': 400, 'res_y': 400, 'use_frames': False, 'all_frames': [1], 'crop_window': (0.0, 1.0, 0.5, 1.0), 'subtask_timeout': 600, 'tmp_dir': 'C:\\Users\\ederenn\\AppData\\Local\\golem\\golem\\default\\mainnet\\ComputerRes\\9a5dd1ba-8144-11e9-816f-845cd9f61ab5\\tmp', 'ctd': {'task_id': '9a5dd1ba-8144-11e9-816f-845cd9f61ab5', 'subtask_id': 'aed8dccc-8144-11e9-a54c-845cd9f61ab5', 'deadline': 1559047325, 'src_code': '', 'extra_data': {'scene_file': '/golem/resources/scene-Helicopter-27-cycles.blend', 'resolution': [400, 400], 'use_compositing': False, 'samples': 0, 'frames': [1], 'output_format': 'PNG', 'path_root': 'D:\\test files', 'start_task': 1, 'total_tasks': 2, 'crops': [{'outfilebasename': 'scene-Helicopter-3_1', 'borders_x': [0.0, 1.0], 'borders_y': [0.5, 1.0]}], 'entrypoint': 'python3 /golem/entrypoints/render_entrypoint.py'}, 'performance': 486.1714084290134, 'docker_images': [{'repository': 'golemfactory/blender', 'image_id': None, 'tag': '1.9'}], 
'resources': [['66d06098773043bac9b1958e8f79fdbe3ecdfea64543f09307391e8941b5c079', [['aed8dccc-8144-11e9-a54c-845cd9f61ab5']]]]}, 'owner': '845cd9f61ab53a9a2bc02dca752b565d0e277c8b25b6aa1c148c4cae76f7682cfdbd980689314a8def37e299d4cf9fb76c0b78f99fd27c607fc6404cfb9fe861'}, 'results': ['C:\\Users\\ederenn\\AppData\\Local\\golem\\golem\\default\\mainnet\\ComputerRes\\9a5dd1ba-8144-11e9-816f-845cd9f61ab5\\tmp\\scene-Helicopter-3_10001.png'], 'resources': ['D:\\test files\\scene-Helicopter-27-cycles.blend']})


maaktweluit commented 5 years ago

It seems to mount the original location of the scene file directly into Docker:

path_root is set to main_scene_dir: https://github.com/golemfactory/golem/blob/d81005cdfeea28c3ba2f4a94f17f8c925a3766ba/apps/blender/task/blenderrendertask.py#L395

path_root is then mounted as /golem/resources: https://github.com/golemfactory/golem/blob/d81005cdfeea28c3ba2f4a94f17f8c925a3766ba/golem/verificator/blender_verifier.py#L77

Previously the resources were mounted from inside the work_dir, as they should be: https://github.com/golemfactory/golem/pull/3844/files#diff-187d6b2b5c0334f8119154f3d302c14cL141

Wiezzel commented 5 years ago

I came to the same conclusion as @maaktweluit. Only subdirectories of "golem\default\mainnet\ComputerRes" can be mounted.
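To make the constraint concrete, here is a minimal sketch of a check for whether a host path lies under the mountable root. It is not Golem's code: `is_mountable` and `COMPUTER_RES` are hypothetical names, and the paths are the ones from the logs in this issue. `PureWindowsPath` is used so the comparison follows Windows rules (case-insensitive, backslash separators) on any OS.

```python
from pathlib import PureWindowsPath


def is_mountable(path: str, mountable_root: str) -> bool:
    """Return True if `path` is a strict subdirectory of `mountable_root`.

    Hypothetical helper for illustration; it only compares path
    components and never touches the filesystem.
    """
    candidate = PureWindowsPath(path)
    root = PureWindowsPath(mountable_root)
    return root in candidate.parents


# Paths taken from the logs above.
COMPUTER_RES = (
    r"C:\Users\ederenn\AppData\Local\golem\golem\default\mainnet\ComputerRes"
)

# path_root from the failing subtask: outside ComputerRes.
print(is_mountable(r"D:\test files", COMPUTER_RES))

# tmp_dir from the same log line: inside ComputerRes.
print(is_mountable(
    COMPUTER_RES + r"\9a5dd1ba-8144-11e9-816f-845cd9f61ab5\tmp",
    COMPUTER_RES,
))
```

With the values from this report, the first call corresponds to the `path_root` that failed to mount and the second to a directory that can be mounted.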

maaktweluit commented 5 years ago

@Wiezzel thanks for the formatting fix, was wondering how that worked, now i know :)

maaktweluit commented 5 years ago

Found one more thing when checking the golem\default\mainnet\ComputerRes\<subtask_id> folder.

It contains a resources\<subtask_id> file with the same size as the Blender file; maybe verification can or should use this file.

weaselix commented 5 years ago

It looks like someone tried to introduce the following optimization:

Avoid copying resources before sending them to the provider and just pack them into the ZIP in place.

However, this does not work for the verification process: the verifier assumes that resources have been copied to ComputerRes.

The issue was resolved by copying the needed resources to ComputerRes, but in the future we need to solve this in another way. On the other hand, mounting user directories directly into the virtual machine / Docker filesystem does not seem reasonable.

Thanks @prekucki for your help
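The idea behind the fix (copy the needed resources into a directory under ComputerRes before verification, rather than mounting the user's directory) can be sketched as follows. This is an illustration only, not the code from the actual pull request: `stage_resources` is a hypothetical helper, and temporary directories stand in for the user's scene directory and the ComputerRes work directory.

```python
import shutil
import tempfile
from pathlib import Path


def stage_resources(resources, work_dir):
    """Copy resource files into <work_dir>/resources and return the copies.

    Hypothetical helper: files are flattened by name, so two sources
    with the same file name would overwrite each other.
    """
    target = Path(work_dir) / "resources"
    target.mkdir(parents=True, exist_ok=True)
    staged = []
    for src in resources:
        dst = target / Path(src).name
        shutil.copy2(src, dst)  # copies file contents and metadata
        staged.append(dst)
    return staged


# Temporary stand-ins for the user's scene directory and a ComputerRes work dir.
with tempfile.TemporaryDirectory() as user_dir, \
        tempfile.TemporaryDirectory() as work_dir:
    scene = Path(user_dir) / "scene-Helicopter-27-cycles.blend"
    scene.write_bytes(b"placeholder scene data")

    staged = stage_resources([scene], work_dir)
    # The staged copy now lives under the work dir, which is what gets mounted.
    print(staged[0].relative_to(work_dir))
```

The trade-off is the one described above: the copy costs time and disk space, but only a directory Golem controls is ever exposed to the container.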

weaselix commented 5 years ago

https://github.com/golemfactory/golem/pull/4251