MehmetMHY opened this issue 1 year ago

Currently, LLM-VM does not support multi-GPU setups. Using RunPod, I rented a setup with two RTX 3090 GPUs. While running the local Bloom model example from the docs, I ran into this error:

One fix I had for this error was adding an if-else statement to `src/llm_vm/onsite_llm.py`, under the "account for cases where model is wrapped..." comment. This change resolves the error, but it does not fix the core issue: it uses `DataParallel.module` to access the model's `generate()` function, which bypasses the `DataParallel` wrapper and skips the parallelism it provides. I believe this is the case, and we should implement a new solution.
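The exact code change isn't shown above, but a hypothetical reconstruction of the described workaround (names other than `torch.nn.DataParallel` are illustrative, not LLM-VM's actual code) might look like this:

```python
import torch.nn as nn

def run_generate(model, inputs, **generation_kwargs):
    # Account for cases where the model is wrapped in DataParallel:
    # the wrapper does not expose generate(), so calling it raises an
    # AttributeError unless we unwrap first.
    if isinstance(model, nn.DataParallel):
        # .module returns the underlying model, which restores generate() --
        # but it also bypasses the wrapper and the parallelism it provides.
        return model.module.generate(**inputs, **generation_kwargs)
    else:
        return model.generate(**inputs, **generation_kwargs)
```

Note that `DataParallel` only parallelizes `forward()` calls across the batch dimension anyway, so even without the unwrapping a single `generate()` call would not be spread across both GPUs; a proper fix would likely need model sharding or per-GPU replicas instead.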
This issue can be solved using Ray.io. For distributed training: https://docs.ray.io/en/latest/train/train.html. For distributed inference: https://docs.ray.io/en/latest/serve/index.html
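For the inference side, a minimal Ray Serve sketch for a 2-GPU box might look like the following (the model name, replica count, and request format are assumptions for illustration, not LLM-VM code):

```python
# Minimal Ray Serve sketch: one replica per GPU, so both cards serve requests.
# Assumes `pip install "ray[serve]" transformers torch`; model name is illustrative.
from ray import serve
from transformers import pipeline

@serve.deployment(num_replicas=2, ray_actor_options={"num_gpus": 1})
class BloomServer:
    def __init__(self):
        # Ray sets CUDA_VISIBLE_DEVICES per replica, so device 0 is that replica's GPU.
        self.generator = pipeline("text-generation", model="bigscience/bloom-560m", device=0)

    async def __call__(self, request):
        prompt = (await request.json())["prompt"]
        return self.generator(prompt, max_new_tokens=50)[0]["generated_text"]

app = BloomServer.bind()
# serve.run(app)  # then: curl -X POST localhost:8000 -d '{"prompt": "Hello"}'
```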
Also look into: https://huggingface.co/docs/accelerate/index
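Accelerate's big-model support can also shard a single model across both GPUs via `device_map="auto"`; a minimal sketch (model name is illustrative):

```python
# Sketch of multi-GPU inference via Accelerate's device_map="auto",
# which shards the model's layers across all visible GPUs.
# Assumes `pip install accelerate transformers torch`.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"  # illustrative; swap in the model from the docs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```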
Can you please assign this to me?
Is this issue still open?