aws-neuron / aws-neuron-sdk

Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services
https://aws.amazon.com/machine-learning/neuron/

No documentation on how to use multiple cores with XLA #1004

Open greg-8090 opened 2 weeks ago

greg-8090 commented 2 weeks ago

This is the XLA documentation:

https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/torch/torch-neuronx/programming-guide/training/pytorch-neuron-programming-guide.html#neuron-xla-device

And here is the relevant line: "With PyTorch NeuronX the default XLA device is mapped to a NeuronCore. By default, one NeuronCore is configured."

However, there is no guidance on how to use multiple cores.
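For reference, the approach the Neuron training docs describe elsewhere is to launch one worker process per NeuronCore with `torchrun`, with each worker's XLA device mapped to its own core. A minimal launch sketch (assumes a Trn1/Inf2 instance with `torch-neuronx` installed, and a hypothetical `train.py` that calls `torch.distributed.init_process_group("xla")` after `import torch_xla.distributed.xla_backend`):

```shell
# Optionally restrict how many NeuronCores the Neuron runtime exposes
# to each process (environment variable from the Neuron runtime docs).
export NEURON_RT_NUM_CORES=2

# Start two workers; each one's xm.xla_device() is bound to a
# different NeuronCore, giving data-parallel training across cores.
torchrun --nproc_per_node=2 train.py
```

This is a launch-configuration sketch, not a tested recipe; the exact worker count depends on the instance type.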

AWSNB commented 2 weeks ago

@greg-8090 ack and we'll fix the documentation

The guide was written before we released the NxD library, which is now recommended as the default for distributed training: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/nxd-training/index.html

Will follow up with more explicit examples in the next couple of hours.

greg-8090 commented 2 weeks ago

Could you please point me to where the new examples are? Also, I notice NxD applies only to training, but I am trying to use XLA when running inference, and I want to know how to use multiple Neuron cores when doing that. Thanks in advance!

aws-amerrez commented 2 weeks ago

@greg-8090 Here is the guide to NxD Inference. The code examples show how to use multiple NeuronCores with Neuron: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/neuronx-distributed/neuronx_distributed_inference_developer_guide.html
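Separately from NxD tensor parallelism, `torch-neuronx` also provides `torch_neuronx.DataParallel`, which replicates a traced model across NeuronCores and shards inputs along the batch dimension. A minimal sketch (assumes an Inf2/Trn1 instance with the Neuron SDK installed; the toy model is illustrative only):

```python
import torch
import torch_neuronx  # requires a Neuron (Inf2/Trn1) instance

# A toy model; any torch.nn.Module works here.
model = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU())
model.eval()

example = torch.rand(1, 128)

# Compile the model for a single NeuronCore.
traced = torch_neuronx.trace(model, example)

# Replicate the compiled model across the available NeuronCores;
# inputs are split along dim 0 and the shards run in parallel.
dp_model = torch_neuronx.DataParallel(traced)

# A batch of 8 is divided across cores, and the per-core outputs
# are concatenated back into a single tensor.
batch = torch.rand(8, 128)
output = dp_model(batch)
print(output.shape)
```

This is data-parallel replication (one full model copy per core); for models too large for a single core, the NxD Inference guide's tensor-parallel examples are the right path.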