greg-8090 opened 2 weeks ago
@greg-8090 ack and we'll fix the documentation
The guide was written before we released the NxD library, which is now recommended as the default for distributed training: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/nxd-training/index.html
Will follow up with more explicit examples in the next couple of hours.
Could you please point me to where the new examples are? Also, I notice NxD applies only to training, but I am trying to use XLA when running inference, and I want to know how to use multiple Neuron cores when doing that. Thanks in advance!
@greg-8090 Here is the guide to NxD Inference. The code examples show how to use multiple cores with Neuron: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/neuronx-distributed/neuronx_distributed_inference_developer_guide.html
This is the XLA documentation:
https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/torch/torch-neuronx/programming-guide/training/pytorch-neuron-programming-guide.html#neuron-xla-device
And here is the relevant line: "With PyTorch NeuronX the default XLA device is mapped to a NeuronCore. By default, one NeuronCore is configured."
However, there is no guidance on how to use multiple cores.
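For what it's worth, the Neuron runtime itself exposes environment variables that control how many NeuronCores a process claims, which is one way to go beyond the single default core. A minimal sketch; the variable names come from the Neuron runtime configuration docs, while the `torchrun` command and script name are illustrative assumptions, not something from the guide above:

```shell
# Sketch: controlling NeuronCore allocation via Neuron runtime env vars.
# NEURON_RT_NUM_CORES asks the runtime to reserve that many cores for this
# process; NEURON_RT_VISIBLE_CORES pins specific core indices instead.
export NEURON_RT_NUM_CORES=2
echo "NEURON_RT_NUM_CORES=${NEURON_RT_NUM_CORES}"

# Illustrative only (assumes a script of your own): launch one worker per core,
# e.g.  torchrun --nproc_per_node=2 my_inference_script.py
```

If I recall the torch-neuronx inference docs correctly, there is also a `torch_neuronx.DataParallel` wrapper that replicates a traced model across several NeuronCores for batched inference, which may be closer to what you want than spawning separate workers.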