ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Support for versions of Ray 2.4+ #3806

Open chainlink opened 12 months ago

chainlink commented 12 months ago

Is your feature request related to a problem? Please describe.

Hi there, just curious when you're planning to support more recent versions of Ray. It looks like the Ray version has been explicitly limited in the code.

chainlink commented 12 months ago

It looks like this is being worked on

Could you give me an idea of priority / when this would land?

arnavgarg1 commented 12 months ago

Hi @chainlink! Thanks for raising this issue. You are right that this has been explicitly limited in Ludwig code to be Ray 2.3.1 or lower.

The reason we haven't supported later versions of Ray is that Ray's APIs were very unstable between Ray 2.1 and 2.4, and it took quite a few engineering hours to make each Ray release compatible with Ludwig. This is because Ludwig makes use of four Ray libraries (Ray Core, Ray Data, Ray Train, and Ray Tune), all of which had changing APIs that Ludwig's integration code had to be updated to match. We were waiting for their APIs to stabilize, which may now be the case.

I'm happy to pick this up in the coming weeks. From your perspective, are there particular features or enhancements we should be aware of that would make it more compelling to upgrade to Ray 2.8.1 sooner rather than later?
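For readers unfamiliar with how such a cap works, the version limit described above can be pictured as a simple guard. The sketch below is illustrative only and is not Ludwig's actual code: the names `MAX_SUPPORTED_RAY`, `parse_version`, and `is_supported_ray` are hypothetical, and the only detail taken from this thread is the "Ray 2.3.1 or lower" bound.

```python
import re

# Assumed cap, taken from the comment above ("Ray 2.3.1 or lower").
# The constant name is hypothetical, not a Ludwig identifier.
MAX_SUPPORTED_RAY = (2, 3, 1)


def parse_version(version: str) -> tuple:
    """Return the leading numeric (major, minor, patch) parts of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0, so "2.4" compares as (2, 4, 0).
    return (int(major), int(minor), int(patch or 0))


def is_supported_ray(version: str, max_supported: tuple = MAX_SUPPORTED_RAY) -> bool:
    """True when the given Ray version is at or below the assumed cap."""
    # Tuples compare element by element, so (2, 10, 0) > (2, 3, 1) as expected.
    return parse_version(version) <= max_supported
```

Comparing integer tuples rather than raw strings avoids the classic mistake where "2.10.0" sorts before "2.3.1" lexicographically.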

chainlink commented 11 months ago

Makes sense! Maintaining integrations with an external product is always a tricky balance. My biggest ask would be the ability to schedule a Ludwig training run as a Ray job via the CLI (vs. using the old direct-attach method),

i.e.

ray job submit --address="http://10.10.10.10:8265" --runtime-env-json='{"working_dir": ".", "pip": "requirements.txt" }' --entrypoint-num-gpus 1 -- ludwig train --config model.yaml --dataset "ludwig://alpaca"
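If the same submission needs to be scripted, one option is to assemble the command as an argument list so that the runtime environment JSON is generated rather than hand-quoted. This is a minimal sketch using only the Python standard library: the helper name `build_ray_job_submit` is hypothetical, while the flags, address, and entrypoint are copied from the command above. It only builds the command; running it still requires the `ray` CLI and a reachable cluster.

```python
import json
import shlex


def build_ray_job_submit(address, runtime_env, entrypoint, num_gpus=None):
    """Build an argument list equivalent to the `ray job submit` command above."""
    cmd = [
        "ray", "job", "submit",
        "--address", address,
        # json.dumps guarantees valid JSON, avoiding manual shell quoting mistakes.
        "--runtime-env-json", json.dumps(runtime_env),
    ]
    if num_gpus is not None:
        cmd += ["--entrypoint-num-gpus", str(num_gpus)]
    # Everything after the bare "--" is the entrypoint command run on the cluster.
    cmd += ["--", *entrypoint]
    return cmd


cmd = build_ray_job_submit(
    address="http://10.10.10.10:8265",
    runtime_env={"working_dir": ".", "pip": "requirements.txt"},
    entrypoint=["ludwig", "train", "--config", "model.yaml",
                "--dataset", "ludwig://alpaca"],
    num_gpus=1,
)

# Show the shell-quoted form for inspection; pass `cmd` to
# subprocess.run(cmd, check=True) to actually submit the job.
print(shlex.join(cmd))
```

Passing the list directly to `subprocess.run` sidesteps shell quoting entirely, which matters here because the JSON value contains both quotes and spaces.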