Open btlorch opened 1 day ago
@btlorch - In my case, I get the validation exception when I pass multiple inputs (a list of ProcessingInput objects).
Thankfully, my colleague found the problem: ml.g5.4xlarge instances only have 600 GB of disk space. One solution is to switch to an instance type with more disk space.
I'll keep this issue here as a reminder to improve the error message.
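Until the error message improves, a client-side check before submission can surface the problem early. The following is only a minimal sketch: `INSTANCE_STORAGE_LIMIT_GB` and `check_volume_size` are hypothetical names, not part of the SageMaker SDK, and the only limit in the table is the 600 GB figure for ml.g5.4xlarge reported in this thread. Limits for other instance types would have to be looked up and added by hand.

```python
# Hypothetical lookup table, not part of the SageMaker SDK.
# The single entry is the 600 GB figure reported in this thread.
INSTANCE_STORAGE_LIMIT_GB = {
    "ml.g5.4xlarge": 600,
}


def check_volume_size(instance_type, volume_size_in_gb, limits=INSTANCE_STORAGE_LIMIT_GB):
    """Raise ValueError if the requested volume exceeds the known limit.

    Instance types that are missing from the table are not checked.
    """
    limit = limits.get(instance_type)
    if limit is None:
        return  # no known limit for this instance type, nothing to verify
    if volume_size_in_gb > limit:
        raise ValueError(
            f"Requested volume size {volume_size_in_gb} GB exceeds the "
            f"{limit} GB limit assumed for {instance_type}."
        )
```

Calling `check_volume_size("ml.g5.4xlarge", 700)` right before submitting the job raises a `ValueError` with a readable message instead of the empty validation error.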
Dear SageMaker team,
I am experiencing issues when trying to submit a SageMaker processing job. Job submission fails with the following error:
Unfortunately, the error message is empty.
After lots of trial and error, I believe that the error is related to the requested volume size. When I request a volume size of 600 GB or below, everything runs smoothly. The issue appears when I request a volume size of 700 GB or above. When I request more than 1024 GB, I receive a different error message:
If 1024 GB is my account's quota, I suppose that 700 GB should be fine and this is not a quota issue.
Is there a limit that I am not aware of? In any case, I would expect a non-empty error message.
Code to reproduce
Here is a toy example (with some placeholders).
This is the job that should be executed, process.py, a simple Python script that counts the number of JPEG files in a given directory.
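The script itself did not survive in this copy of the post. As a stand-in, here is a minimal sketch of what such a counter could look like; it is not the original process.py. The `--input-dir` flag, its default of `/opt/ml/processing/input`, the recursive search, and the case-insensitive `.jpg`/`.jpeg` matching are all assumptions made for illustration.

```python
import argparse
from pathlib import Path

# Assumption: a file counts as JPEG if its extension is .jpg or .jpeg (any case).
JPEG_SUFFIXES = {".jpg", ".jpeg"}


def count_jpeg_files(directory):
    """Count JPEG files below `directory`, including subdirectories.

    Returns 0 if `directory` does not exist or is not a directory.
    """
    root = Path(directory)
    if not root.is_dir():
        return 0
    return sum(
        1
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in JPEG_SUFFIXES
    )


def main(argv=None):
    parser = argparse.ArgumentParser(description="Count JPEG files in a directory.")
    # Assumption: the processing input is mounted at this path inside the container.
    parser.add_argument("--input-dir", default="/opt/ml/processing/input")
    # Unrecognized arguments are ignored so extra job arguments do not abort the script.
    args, _unknown = parser.parse_known_args(argv)
    print(f"Found {count_jpeg_files(args.input_dir)} JPEG files in {args.input_dir}")


if __name__ == "__main__":
    main()
```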
This is how the job is submitted:
Expected behavior
The submit script should produce the following output:
Instead, I am getting the following error message and a stack trace:
The error message is empty.
System information