Description
If the disk fills up during a job, it's no surprise that things can fail quite badly. To finish off the job we then try to save the artifacts, which is probably going to fail too. This change logs as much detail about the failure as we can, salvages whatever results we can still push to the server, and cleans up afterwards so that we fail more gracefully when this happens.
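As a rough illustration of the idea (not the code in this PR), a save step can catch the `OSError`, log what it knows, drop any partial archive, and hand back a status for the job result instead of raising. The names `save_artifacts`, `artifacts_saved`, and `artifact_error` are hypothetical and chosen only for this sketch.

```python
import logging
import shutil
from pathlib import Path

logger = logging.getLogger(__name__)


def _free_bytes(path: Path):
    """Return free bytes on the filesystem holding path, or None if unknown."""
    try:
        return shutil.disk_usage(path).free
    except OSError:
        return None


def save_artifacts(artifact_dir: Path, archive_base: Path) -> dict:
    """Hypothetical helper: archive artifact_dir without raising on OSError.

    Returns a small status dict that a caller could merge into the job result.
    """
    outcome = {"artifacts_saved": False, "artifact_error": None}
    archive_path = Path(f"{archive_base}.tar.gz")
    try:
        shutil.make_archive(str(archive_base), "gztar", root_dir=str(artifact_dir))
        outcome["artifacts_saved"] = True
    except OSError as exc:
        # Log as much as we can: errno, message, and remaining free space.
        logger.error(
            "Saving artifacts failed (errno=%s): %s; free bytes: %s",
            exc.errno,
            exc,
            _free_bytes(archive_base.parent),
        )
        outcome["artifact_error"] = str(exc)
        # Remove any partial archive so it does not take up more space.
        archive_path.unlink(missing_ok=True)
    return outcome
```

The caller can then submit the job outcome together with this status, so the server still gets a result even when the archive could not be written.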
Resolved issues
We've seen this recently with oemscript-provisioned devices when hundreds of them run at once, each trying to download a 4 GB image onto an agent host that has about 300 GB free. We should certainly add space and spread these jobs out a bit better too, but in the meantime this change helps us handle the failure as well as we can.
Documentation
N/A
Web service API changes
N/A
Tests
Tested in staging to ensure it works on the normal path (not with a full disk). Also added unit tests that simulate what happens when saving the artifact fails because we're out of space.
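For readers unfamiliar with the technique, one common way to simulate a full disk in a unit test is to patch the write call so it raises `OSError` with `errno.ENOSPC`. The sketch below is not taken from this PR's tests; `try_save` is a hypothetical stand-in for the code under test.

```python
import errno
import unittest
from pathlib import Path
from typing import Optional
from unittest import mock


def try_save(path: Path, data: bytes) -> Optional[str]:
    """Hypothetical code under test: report a failed write instead of raising."""
    try:
        path.write_bytes(data)
    except OSError as exc:
        return f"artifact save failed: {exc.strerror}"
    return None


class SaveWhenDiskFullTest(unittest.TestCase):
    def test_enospc_is_reported_not_raised(self):
        # Simulate "No space left on device" without touching the real disk.
        full = OSError(errno.ENOSPC, "No space left on device")
        with mock.patch.object(Path, "write_bytes", side_effect=full):
            error = try_save(Path("artifacts.tgz"), b"data")
        self.assertEqual(error, "artifact save failed: No space left on device")
```

Running it with `python -m unittest` exercises the failure path without needing to fill a disk.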