ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[docs] improve user experience of the API ref #33645

Open · angelinalg opened this issue 1 year ago


Description

I wanted to capture this feedback I received on the API ref.

Looking at the parameters for Ray AIR's BatchMapper (https://docs.ray.io/en/latest/ray-air/api/doc/ray.data.preprocessors.BatchMapper.html), I wanted to know what `batch_size` was, because it's pretty important when doing batch processing. The first thing I see is that `batch_size` has the type `Optional[Union[int, typing_extensions.Literal["default"]]] = 'default'`, which is not very helpful. Since it's optional, that makes me wonder what the default value is. So I scroll down to the bottom of the page, and hidden in a longish paragraph is what I'm looking for: "Defaults to 4096."
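For readers unfamiliar with the pattern, here is a minimal sketch of what a `Literal["default"]` sentinel signature means in practice. This is hypothetical illustration code, not Ray's actual implementation; `resolve_batch_size` and `DEFAULT_BATCH_SIZE` are made-up names.

```python
from typing import Literal, Optional, Union

# The value the docs bury at the bottom of the page.
DEFAULT_BATCH_SIZE = 4096

def resolve_batch_size(
    batch_size: Optional[Union[int, Literal["default"]]] = "default",
) -> Optional[int]:
    """Resolve the user-facing sentinel into a concrete value."""
    if batch_size == "default":
        # Caller didn't specify anything: fall back to the library default.
        return DEFAULT_BATCH_SIZE
    # An explicit int is used as given; None means "no batching".
    return batch_size
```

Nothing in the rendered signature tells the reader that "default" resolves to 4096, which is exactly the problem described above.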

Visually, our API refs are hard to read:

[screenshot: rendered API reference page]

cc: @maxpumperla @bveeramani @simran-2797 @emmyscode

Link

No response

justinvyu commented 1 year ago

Some more comments here: https://github.com/ray-project/ray/issues/32824

bveeramani commented 1 year ago

I think ideally we'd want something simple and readable like

batch_size: int = 4096

For context, the type hint used to be

batch_size: int | None = None

But with https://github.com/ray-project/ray/pull/29971 and the "Decide batch behavior for Ray AIR" discussion, we changed the type hint to

batch_size: int | "default" | None = "default"

cc @c21 @amogkam would it be possible to simplify the batch_size types? Not sure if we needed the "default" to avoid breaking changes.
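One common way to keep a rendered signature short while still distinguishing "unspecified" from an explicit `None` is a module-level sentinel object whose `repr` is what autodoc shows. This is a hedged sketch of the general technique, not a proposal from this thread; all names here are hypothetical.

```python
from typing import Optional, Union

class _DefaultType:
    """Singleton sentinel; its repr is what appears in the rendered signature."""
    def __repr__(self) -> str:
        return "DEFAULT"

DEFAULT = _DefaultType()
_DEFAULT_BATCH_SIZE = 4096

def resolve(batch_size: Union[int, _DefaultType, None] = DEFAULT) -> Optional[int]:
    # Three states: unspecified (sentinel), explicit None, explicit int.
    if batch_size is DEFAULT:
        return _DEFAULT_BATCH_SIZE
    return batch_size
```

The signature then renders as `batch_size: int | _DefaultType | None = DEFAULT`, which is still not as readable as `batch_size: int = 4096` but avoids the quoted-string literal.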

c21 commented 1 year ago

Hugging Face: https://huggingface.co/docs/datasets/v2.12.0/en/package_reference/main_classes#datasets.Dataset.map
Ray: https://docs.ray.io/en/latest/data/api/doc/ray.data.Dataset.map_batches.html#ray.data.Dataset.map_batches

[screenshots, 2023-05-01: the Hugging Face and Ray API reference pages side by side]