-
**Link to the notebook**
[pytorch_smdataparallel_mnist_demo](https://github.com/aws/amazon-sagemaker-examples/blob/main/training/distributed_training/pytorch/data_parallel/mnist/pytorch_smdataparalle…
-
I have tried to work out an end-to-end example with the IoT Events sample,
https://github.com/aws-samples/aws-iot-events-accelerators
What I have found is the existing sample for predictive-maintenance …
-
I am trying to run RLHF with my previously trained Actor and Reward models. However, I encounter the following exception:
```
Traceback (most recent call last):
File "/home/ec2-user/SageMaker/deepspeed…
-
Cog's 1) endpoint names (e.g. `/predictions` in URLs) and 2) `predict.py` entry function names (`setup()` and `predict()`) must be customizable, i.e. it must be possible to rename them (names only, however, currentl…
-
### Configuration
- Training script - https://github.com/huggingface/notebooks/blob/master/sagemaker/07_tensorflow_distributed_training_data_parallelism/scripts/train.py
- Launcher script - https://…
-
I would like to use a SageMaker Model with a custom VPC configuration, which is currently not possible with Serverless Inference. Is this feature planned? More generally: Is there a roadmap somewhere …
-
**Describe the bug**
https://github.com/aws-samples/amazon-bedrock-workshop/blob/main/06_OpenSource_examples/03_NVI…
-
@philschmid
Another one: I followed your guide to deploy Llama 3 70B using AWS SageMaker on inf2.48xlarge with the following properties, as suggested in yo…
-
### Description
Because of other deficiencies in the AWS API, I need to be able to enumerate the members of a group so I can find another way to add users to a resource (group assignment only works in the Ma…
-
Hello,
Same as #969
I was training a DistilBERT model on a SageMaker instance using fast-bert. I am using the ml.p2.xlarge instance for GPU processing.
When the function downloads the trainin…