golemfactory / ray-on-golem


Information about wrong yaml file is not well presented to the end user #215

Open lucekdudek opened 6 months ago

lucekdudek commented 6 months ago

Example of an invalid YAML config (the `demand:` key is missing under `node_config`):

  ...
  ray.worker.default:
    min_workers: 2
    resources: {}
    node_config: 
      # missing `demand:`
      min_mem_gib: 2
      min_cpu_threads: 4
      min_storage_gib: 2
      max_cpu_threads: 32
  ...

`ray up` STDOUT/STDERR:

2024-03-15 10:33:05,895 INFO node_provider.py:184 -- Requesting 1 nodes...
Traceback (most recent call last):
  File "/usr/local/bin/ray", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/ray/scripts/scripts.py", line 2498, in main
    return cli()
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/ray/autoscaler/_private/cli_logger.py", line 856, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/ray/scripts/scripts.py", line 1298, in up
    create_or_update_cluster(
  File "/usr/local/lib/python3.10/site-packages/ray/autoscaler/_private/commands.py", line 317, in create_or_update_cluster
    get_or_create_head_node(
  File "/usr/local/lib/python3.10/site-packages/ray/autoscaler/_private/commands.py", line 763, in get_or_create_head_node
    provider.create_node(head_node_config, head_node_tags, 1)
  File "/usr/local/lib/python3.10/site-packages/ray_on_golem/provider/node_provider.py", line 186, in create_node
    requested_node_ids = self._ray_on_golem_client.request_nodes(
  File "/usr/local/lib/python3.10/site-packages/ray_on_golem/client/client.py", line 39, in request_nodes
    response = self._make_request(
  File "/usr/local/lib/python3.10/site-packages/ray_on_golem/client/client.py", line 228, in _make_request
    raise RayOnGolemClientError(
ray_on_golem.client.exceptions.RayOnGolemClientError: 500: Couldn't request nodes: 500 Internal Server Error

Server got itself in trouble

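The ValidationError that explains the actual problem is only visible in the webserver_debug logs; `ray up` reports just a generic 500 Internal Server Error.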
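
One possible direction is to check the config on the client side and print a readable message before any request reaches the webserver. The snippet below is only a rough sketch of that idea, not ray-on-golem code: `find_misplaced_demand_keys` and `DEMAND_ONLY_KEYS` are hypothetical names, and the assumption that these four keys belong under `demand:` is taken solely from the YAML example in this issue.

```python
from typing import Any

# Assumption: these keys are only valid when nested under `demand:`.
# The list comes from the example in this issue, not from the project schema.
DEMAND_ONLY_KEYS = {
    "min_mem_gib",
    "min_cpu_threads",
    "min_storage_gib",
    "max_cpu_threads",
}


def find_misplaced_demand_keys(node_type: str, node_config: dict[str, Any]) -> list[str]:
    """Return readable messages for demand keys placed directly in node_config."""
    misplaced = sorted(key for key in node_config if key in DEMAND_ONLY_KEYS)
    if not misplaced:
        return []
    return [
        f"{node_type}.node_config: {', '.join(misplaced)} "
        "must be nested under `demand:`"
    ]


if __name__ == "__main__":
    # node_config from the invalid example above, loaded as a plain dict.
    broken = {
        "min_mem_gib": 2,
        "min_cpu_threads": 4,
        "min_storage_gib": 2,
        "max_cpu_threads": 32,
    }
    for message in find_misplaced_demand_keys("ray.worker.default", broken):
        print(message)
```

A hand-written check like this would drift from the real schema over time, so passing the existing ValidationError text through to the `ray up` output is probably the better long-term fix; the sketch only shows the kind of message the end user should see.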