-
### System Info
When I use Optimum Neuron on Trainium with `--gradient_accumulation_steps > 1`, training fails.
I then modified line https://github.com/huggingface/transformers/blob/6d1f545665ac6…
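For context, a minimal reproduction sketch (not the reporter's actual script) of a Trainium run with gradient accumulation via Optimum Neuron's Trainer API; the `gpt2` model, dummy dataset, and hyperparameters below are placeholders, since the issue does not include the original configuration:

```python
# Hypothetical reproduction sketch; model, dataset, and hyperparameters are
# placeholders, not the reporter's actual configuration.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.neuron import NeuronTrainer, NeuronTrainingArguments

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Tiny dummy dataset, just enough to exercise the training loop.
encodings = tokenizer(["hello world"] * 64, padding="max_length",
                      truncation=True, max_length=32)
dataset = Dataset.from_dict(dict(encodings))
dataset = dataset.map(lambda ex: {"labels": ex["input_ids"]})

args = NeuronTrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,  # > 1 is what triggers the reported failure
    max_steps=4,
)
NeuronTrainer(model=model, args=args, train_dataset=dataset).train()
```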
-
### Checklist
- [x] I've prepended issue tag with type of change: [feature]
- [ ] (If applicable) I've documented below the DLC image/dockerfile this relates to
- [ ] (If applicable) I've documented th…
-
This issue outlines the major items planned for Q2 2024. Note that it doesn't include bug fixes, except for major issues.
> [!NOTE]
> **Bold** means priority.
### Core features
- [x] **Multi-…
-
We need to train NeMo models on specialised XLA hardware (Trainium).
Are you planning to make this framework XLA-compatible?
-
## Description
Please provide a clear and concise description of the issue you are encountering, and a reproduction of your configuration.
In the node group [Terraform file](https://github.com/a…
-
### Describe the bug
If the instance type is Trainium, the Neuron device plugin is wrongly not installed.
### Expected Behavior
If the instance type is Trainium, the Neuron device plugin is installed.
### …
-
I usually install `transformers_neuronx` from git, and since the last commit says that it was updated for SDK release 2.12, I assumed it was the same version available from GitHub. However, running `g…
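One quick way to check which version is actually installed and compare it against the git history; the distribution name `transformers-neuronx` is an assumption, as the snippet only shows the import name:

```python
# Print the installed distribution version to compare against the latest
# git commit; the name "transformers-neuronx" is assumed, not confirmed.
from importlib.metadata import version

print(version("transformers-neuronx"))
```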
-
### 🚀 Traceable Collectives!
Collective APIs (e.g. all_reduce, all_gather, ...) are used in distributed PyTorch programs, but do not compose cleanly with compilers.
Specifically, TorchDynamo a…
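For readers unfamiliar with the APIs in question, here is a minimal, standard `torch.distributed` example of an eager-mode collective (`all_reduce` over the `gloo` backend); it illustrates the existing API the proposal refers to, not the proposed traceable variant:

```python
# Minimal eager-mode collective: each rank contributes a tensor and
# all_reduce sums them across all ranks. Runs on CPU via the gloo backend.
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp

def worker(rank: int, world_size: int):
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    t = torch.ones(2) * (rank + 1)
    dist.all_reduce(t, op=dist.ReduceOp.SUM)  # sums tensors across ranks
    print(f"rank {rank}: {t}")  # every rank sees the same reduced result
    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = 2
    mp.spawn(worker, args=(world_size,), nprocs=world_size)
```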
-
Does FlexFlow have the capability to support XLA-based devices (e.g. TPU, Trainium), or is it tied to CUDA?