Closed: Electric-Dragon closed this issue 11 months ago
Hi @Electric-Dragon, I am having quite a similar problem. Were you able to find a solution to yours? :)
@pedro-twaice unfortunately no :/
@Electric-Dragon uhm okay!! Thanks for the reply :) In case I find any solution, I'll post it here
@pedro-twaice Okay, thanks a lot!
Hi, thanks for using SageMaker! Unfortunately, this error comes from the backend training service and is not a usability gap in the SageMaker Python SDK. We suggest contacting AWS Support so they can engage the backend service team for more information on the UnexpectedStatusException.
Please feel free to reopen this issue if it turns out there is an underlying problem in the SageMaker Python SDK.
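Before contacting support, it may help to collect what the training service itself reports about the failed job. The sketch below is only an illustration, not something posted in this thread: it assumes boto3 is installed and credentials are configured, and it reads the `TrainingJobStatus`, `FailureReason`, and `SecondaryStatusTransitions` fields of a `describe_training_job` response. The helper names and the job name in the usage note are hypothetical.

```python
from typing import Any, Dict, List


def summarize_training_failure(description: Dict[str, Any]) -> List[str]:
    """Collect the failure-related fields from a DescribeTrainingJob response."""
    lines = [f"Status: {description.get('TrainingJobStatus', 'Unknown')}"]

    # FailureReason is only present when the job has failed.
    reason = description.get("FailureReason")
    if reason:
        lines.append(f"FailureReason: {reason}")

    # Each transition records a phase of the job and an optional message.
    for step in description.get("SecondaryStatusTransitions", []):
        status = step.get("Status", "Unknown")
        message = step.get("StatusMessage", "")
        lines.append(f"{status}: {message}" if message else status)

    return lines


def fetch_training_job_description(job_name: str) -> Dict[str, Any]:
    """Fetch the job description from SageMaker (needs AWS credentials)."""
    # Imported lazily so the summarizer above works without boto3 installed.
    import boto3

    client = boto3.client("sagemaker")
    return client.describe_training_job(TrainingJobName=job_name)
```

For example, `print("\n".join(summarize_training_failure(fetch_training_job_description("my-training-job"))))`, where `my-training-job` is a placeholder for the actual job name. The output can be attached to the support case along with the CloudWatch logs for the job.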
Describe the bug
This training script runs successfully on a small dataset (~11 MB) but fails on a large one (~5.4 GB).
To reproduce
Training script:
Jupyter Notebook code:
Expected behavior
The model is supposed to train (it trains successfully if the dataset size is small).
Logs
System information
Additional context
The script runs successfully on a small dataset.