ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[AIR][Data] Remove references to `TENSOR_COLUMN_NAME` #37047

Open bveeramani opened 1 year ago

bveeramani commented 1 year ago

Context

Previously, Ray Data had a concept of "tensor datasets". These datasets represented a collection of arrays, and internally they had a single column named `TENSOR_COLUMN_NAME`. #36472 removed "tensor datasets", so code paths that reference `TENSOR_COLUMN_NAME` are now dead code.

For example, the following code path is never accessed: https://github.com/ray-project/ray/blob/ac588a8132d08298ac91fb1a7c2f514fd18c636f/python/ray/train/xgboost/xgboost_predictor.py#L134-L139
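To illustrate why removing such a branch should not change behavior, here is a minimal sketch. It is not the code from `xgboost_predictor.py`: the function names, the dict-based batch, and the constant's value are all hypothetical stand-ins chosen for the example.

```python
# Placeholder for the constant; the value used in Ray may differ.
TENSOR_COLUMN_NAME = "__value__"


def select_features_before(batch, feature_columns=None):
    """Hypothetical helper that still carries the legacy branch."""
    if TENSOR_COLUMN_NAME in batch:
        # Legacy "tensor dataset" path: unreachable once no dataset
        # produces a column with this name.
        rows = batch[TENSOR_COLUMN_NAME]
        if feature_columns:
            rows = [[row[i] for i in feature_columns] for row in rows]
        return rows
    if feature_columns:
        return {name: batch[name] for name in feature_columns}
    return batch


def select_features_after(batch, feature_columns=None):
    """Same hypothetical helper with the dead branch removed."""
    if feature_columns:
        return {name: batch[name] for name in feature_columns}
    return batch


# For any batch without the legacy column, both versions agree.
batch = {"x": [1, 2], "y": [3, 4]}
assert select_features_before(batch, ["x"]) == select_features_after(batch, ["x"])
```

As long as no input contains the legacy column, the two versions return the same result, which is the property a cleanup PR would rely on.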

Bhav00 commented 1 year ago

Hey, can this issue be assigned to me? If yes, do all the code paths referencing `TENSOR_COLUMN_NAME` need to be removed altogether from the code base?

bveeramani commented 1 year ago

@Bhav00 for sure! Tag me for a review once you have a PR up.

> If yes, do all the code paths referencing `TENSOR_COLUMN_NAME` need to be removed altogether from the code base?

That's right.

anyscalesam commented 5 months ago

There are still a bunch of occurrences of this variable. @Bhav00 can you dig into the search results below and cut the PR? https://github.com/search?q=repo%3Aray-project%2Fray%20TENSOR_COLUMN_NAME&type=code
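The same check can be run against a local checkout. The sketch below is one possible way to do it, using only the Python standard library; `find_references` is a hypothetical helper written for this example, not something that exists in the Ray repository.

```python
from pathlib import Path


def find_references(root, name="TENSOR_COLUMN_NAME", suffix=".py"):
    """Return (path, line number, stripped line) for every line mentioning `name`."""
    hits = []
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            # Skip files that cannot be read as UTF-8 text.
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            if name in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_references("python/ray")` from the repository root (path assumed) returns the remaining references as a list; an empty list would suggest the cleanup is complete for `.py` files. Documentation and other file types would need a different `suffix`.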