ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0
33.52k stars, 5.69k forks

[Data] Arguments for specifying ray_remote_args are not consistent among the APIs #40673

Open raulchen opened 12 months ago

raulchen commented 12 months ago

For example, to specify num_cpus, we currently pass it as a plain keyword argument in the map API, map(num_cpus=1), but wrap it in a dict in the read APIs, read_images(ray_remote_args={"num_cpus": 1}).
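
To make the mismatch concrete, here is a minimal sketch of the two calling conventions. The functions `map_style` and `read_style` are hypothetical stand-ins invented for illustration. They are not Ray functions and do no distributed work. They only mimic how the arguments are shaped.

```python
from typing import Any, Callable, Dict, Optional


# Hypothetical stand-ins (NOT Ray APIs): they only mimic the argument shapes
# of the two conventions described in this issue.
def map_style(fn: Callable, **ray_remote_args: Any) -> Dict[str, Any]:
    """Map-style: remote args arrive as flat keyword arguments."""
    return dict(ray_remote_args)


def read_style(
    path: str, ray_remote_args: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Read-style: remote args arrive wrapped in a single dict."""
    return dict(ray_remote_args or {})


remote_args = {"num_cpus": 1}

# One settings dict can feed both conventions:
map_opts = map_style(lambda row: row, **remote_args)  # unpack for map-style
read_opts = read_style("images/", ray_remote_args=remote_args)  # pass whole dict
```

Under this sketch, a caller who keeps remote args in one dict has to remember to unpack it with `**` for one family of APIs and pass it whole for the other.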

rbavery commented 3 months ago

Yeah, I get this strange error when using ray_remote_args in map. The error says num_cpus is a valid option, but it is rejected when passed this way.

    ray_dataset = ray.data.from_items(pandas_batch.to_list()).map(
        deserialize_and_convert, ray_remote_args={"num_cpus": 0.25}
    )

  File "/home/jovyan/work/src/wherobots/inference/engine/inference.py", line 220, in predict
    for batch in ray_dataset.iter_batches(batch_size=batch_size):
  File "/opt/conda/lib/python3.10/site-packages/ray/data/iterator.py", line 161, in _create_iterator
    block_iterator, stats, blocks_owned_by_consumer = self._to_block_iterator()
  File "/opt/conda/lib/python3.10/site-packages/ray/data/_internal/iterator/iterator_impl.py", line 33, in _to_block_iterator
    block_iterator, stats, executor = ds._plan.execute_to_iterator()
  File "/opt/conda/lib/python3.10/site-packages/ray/data/exceptions.py", line 86, in handle_trace
    raise e.with_traceback(None) from SystemException()
ValueError: Invalid option keyword ray_remote_args for remote functions. Valid ones are ['accelerator_type', 'memory', 'name', 'num_cpus', 'num_gpus', 'object_store_memory', 'placement_group', 'placement_group_bundle_index', 'placement_group_capture_child_tasks', 'resources', 'runtime_env', 'scheduling_strategy', '_metadata', 'enable_task_events', 'max_calls', 'max_retries', 'num_returns', 'retry_exceptions', '_generator_backpressure_num_objects'].
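
One possible reading of the error above, offered as a guess rather than a confirmed diagnosis: map forwards its extra keyword arguments as remote options, so the literal name ray_remote_args is checked against the valid list and rejected, even though num_cpus inside it would be fine. The sketch below is a simplified model of that check, not Ray's actual implementation. `check_remote_options` is a hypothetical helper, and `VALID_OPTIONS` is only a subset of the names printed in the error message.

```python
# Subset of the option names listed in the ValueError above.
VALID_OPTIONS = {
    "accelerator_type", "memory", "name", "num_cpus", "num_gpus",
    "object_store_memory", "resources", "runtime_env", "scheduling_strategy",
    "max_calls", "max_retries", "num_returns", "retry_exceptions",
}


def check_remote_options(**options):
    """Hypothetical, simplified validator: reject unknown top-level keys."""
    for key in options:
        if key not in VALID_OPTIONS:
            raise ValueError(
                f"Invalid option keyword {key} for remote functions."
            )
    return options


# Nested dict: the top-level key is "ray_remote_args", which is not an option.
try:
    check_remote_options(ray_remote_args={"num_cpus": 0.25})
    nested_error = None
except ValueError as exc:
    nested_error = str(exc)

# Flat keyword arguments: the top-level key is "num_cpus", which is accepted.
flat = check_remote_options(**{"num_cpus": 0.25})
```

If this reading is right, then following the convention described at the top of this issue, passing num_cpus=0.25 directly to map (or unpacking an existing dict with `**`) would avoid the error.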