BMCV / galaxy-image-analysis

Galaxy tools for image analysis
MIT License

adds tool for superdsm (without ray patches) #62

Closed: kostrykin closed this 1 year ago

kostrykin commented 1 year ago

The error I get when running `planemo test` is the following:

```
Job in error state.. tool_id: ip_superdsm, exit_code: 1, stderr: 2023-06-21 00:18:53,303    INFO worker.py:1544 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265 
(raylet) [2023-06-21 00:19:01,791 E 1327542 1327573] (raylet) agent_manager.cc:135: The raylet exited immediately because the Ray agent failed. The raylet fate shares with the agent. This can happen because the Ray agent was unexpectedly killed or failed. Agent can fail when
(raylet) - The version of `grpcio` doesn't follow Ray's requirement. Agent can segfault with the incorrect `grpcio` version. Check the grpcio version `pip freeze | grep grpcio`.
(raylet) - The agent failed to start because of unexpected error or port conflict. Read the log `cat /tmp/ray/session_latest/dashboard_agent.log`. You can find the log file structure here https://docs.ray.io/en/master/ray-observability/ray-logging.html#logging-directory-structure.
(raylet) - The agent is killed by the OS (e.g., out of memory).
Traceback (most recent call last):
  File "/home/void/Documents/galaxy-image-analysis/tools/superdsm/run-superdsm.py", line 41, in <module>
    data, cfg, _ = superdsm.automation.process_image(pipeline, cfg, img)
  File "/home/void/miniconda3/envs/mulled-v1-f473b96c9754fd351c3d4e956e80fa4d198ea0b04c54321dd7854a4fc50429b5/lib/python3.9/site-packages/superdsm/automation.py", line 117, in process_image
    return pipeline.process_image(g_raw, cfg=cfg, **kwargs)
  File "/home/void/miniconda3/envs/mulled-v1-f473b96c9754fd351c3d4e956e80fa4d198ea0b04c54321dd7854a4fc50429b5/lib/python3.9/site-packages/superdsm/pipeline.py", line 172, in process_image
    dt = stage(data, cfg, out=out, log_root_dir=log_root_dir)
  File "/home/void/miniconda3/envs/mulled-v1-f473b96c9754fd351c3d4e956e80fa4d198ea0b04c54321dd7854a4fc50429b5/lib/python3.9/site-packages/superdsm/pipeline.py", line 61, in __call__
    output_data = self.process(input_data, cfg=cfg, out=out, log_root_dir=log_root_dir)
  File "/home/void/miniconda3/envs/mulled-v1-f473b96c9754fd351c3d4e956e80fa4d198ea0b04c54321dd7854a4fc50429b5/lib/python3.9/site-packages/superdsm/c2freganal.py", line 148, in process
    y_id = ray.put(y)
  File "/home/void/miniconda3/envs/mulled-v1-f473b96c9754fd351c3d4e956e80fa4d198ea0b04c54321dd7854a4fc50429b5/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/void/miniconda3/envs/mulled-v1-f473b96c9754fd351c3d4e956e80fa4d198ea0b04c54321dd7854a4fc50429b5/lib/python3.9/site-packages/ray/_private/worker.py", line 2452, in put
    object_ref = worker.put_object(value, owner_address=serialize_owner_address)
  File "/home/void/miniconda3/envs/mulled-v1-f473b96c9754fd351c3d4e956e80fa4d198ea0b04c54321dd7854a4fc50429b5/lib/python3.9/site-packages/ray/_private/worker.py", line 621, in put_object
    self.core_worker.put_serialized_object_and_increment_local_ref(
  File "python/ray/_raylet.pyx", line 1780, in ray._raylet.CoreWorker.put_serialized_object_and_increment_local_ref
  File "python/ray/_raylet.pyx", line 1669, in ray._raylet.CoreWorker._create_put_buffer
  File "python/ray/_raylet.pyx", line 209, in ray._raylet.check_status
ray.exceptions.RaySystemError: System error: Broken pipe
```

I suspect memory issues are the cause, given the "The agent is killed by the OS (e.g., out of memory)" hint, and because the error occurs when calling `ray.put`, which allocates inter-process shared memory for the object being stored.
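For context, `ray.put` serializes an object and copies it into the raylet's shared-memory object store, so an allocation the OS cannot satisfy surfaces exactly at that call. The mechanism can be illustrated with Python's standard-library `multiprocessing.shared_memory` (a conceptual sketch only, not Ray or SuperDSM code; the `payload` bytes are a made-up stand-in for the serialized image array):

```python
from multiprocessing import shared_memory

# Stand-in for the serialized object that ray.put would copy; in the failing
# run this would be the array `y` from c2freganal.py.
payload = b"image-data" * 1000

# Creating the segment is the step that asks the OS for shared memory.
# When the system is out of memory, the failure happens here, which is
# analogous to ray.put dying mid-allocation in the traceback above.
shm = shared_memory.SharedMemory(create=True, size=len(payload))
try:
    shm.buf[: len(payload)] = payload       # copy the bytes into shared memory
    data = bytes(shm.buf[: len(payload)])   # any process attached to the
                                            # segment would see the same bytes
finally:
    shm.close()
    shm.unlink()  # release the segment so the OS reclaims the memory

assert data == payload
```

Unlike a normal Python object, this memory lives outside the worker process, so an over-large image can exhaust it even when the Python heap itself looks fine.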