oracle / oci-java-sdk

Oracle Cloud Infrastructure SDK for Java
https://cloud.oracle.com/cloud-infrastructure

Retries for operations that upload binary data without request-level retries do not retry in OCI Java SDK versions 3.0.0 to 3.31.0 #566

Open mricken opened 9 months ago

mricken commented 9 months ago

If you are using any of the OCI Java SDK synchronous clients that upload streams of data, e.g. ObjectStorageClient or DataSafeClient, and you do not define the RetryConfiguration at request level, your requests will not be automatically retried. However, there is no chance of silent data corruption.

Description

When using OCI Java SDK versions 3.0.0 to 3.31.0, operations that have retries enabled by default should be retried automatically when a request fails with a retry-able error, even if you do not define the RetryConfiguration at request level. For requests that upload streams, however, the clients fail to reset the stream position. As a result, retries cannot be attempted, and the operation fails with a BmcException.

The upload of the stream is likely incomplete at this point and needs to be re-attempted. Fortunately, this failure is visible and therefore cannot lead to silent data corruption.
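To illustrate why a retry depends on resetting the stream, here is a minimal sketch using only `java.io`. It is not the SDK's actual retry code: `FlakyUploader` and `uploadWithRetry` are hypothetical names invented for this example, and the "upload" is simulated in memory. The idea is that the stream is marked before the first attempt and reset before each retry, so every attempt sees the full body; a stream that cannot be reset is rejected loudly instead of being partially re-sent.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamRetrySketch {

    /** Hypothetical stand-in for an upload call; it fails on the first attempt. */
    static final class FlakyUploader {
        private int attempts = 0;

        byte[] upload(InputStream body) throws IOException {
            attempts++;
            if (attempts == 1) {
                // Consume part of the body, then simulate a retry-able failure.
                body.read(new byte[4]);
                throw new IOException("simulated transient failure");
            }
            // Later attempts read whatever the stream still has to offer.
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = body.read(buffer)) != -1) {
                received.write(buffer, 0, n);
            }
            return received.toByteArray();
        }
    }

    /** Hypothetical retry loop: mark once, reset before every retry. */
    static byte[] uploadWithRetry(FlakyUploader uploader, InputStream body, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        if (!body.markSupported()) {
            // A visible failure is preferable to re-sending a partially consumed stream.
            throw new IllegalStateException("stream does not support mark/reset; cannot replay it");
        }
        body.mark(Integer.MAX_VALUE);
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (attempt > 1) {
                    body.reset(); // rewind to the marked position before retrying
                }
                return uploader.upload(body);
            } catch (IOException e) {
                last = e;
            }
        }
        throw new UncheckedIOException(last);
    }

    public static void main(String[] args) {
        byte[] payload = "hello, object storage".getBytes(StandardCharsets.UTF_8);
        byte[] received = uploadWithRetry(new FlakyUploader(), new ByteArrayInputStream(payload), 3);
        System.out.println(new String(received, StandardCharsets.UTF_8));
    }
}
```

Without the `reset()` call, the second attempt in this sketch would start four bytes into the payload, which is the kind of silent truncation that a hard failure avoids.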

Affected requests

This happens only for synchronous clients in versions 3.0.0 to 3.31.0:

You are also affected if you use the Object Storage Upload Manager:

Operations that do not upload streams are not affected.

If the stream that is being uploaded is a ByteArrayInputStream, operations are not affected.
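As a small illustration of the distinction, the standard `InputStream.markSupported()` method reports whether a stream can be rewound. The sketch below uses only `java.io`; the class name `MarkSupportCheck` and the helper `canReplay` are made up for this example and are not part of the SDK. `ByteArrayInputStream` supports mark/reset, while `SequenceInputStream` is used here as one example of a standard stream that does not.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;

public class MarkSupportCheck {

    /** Hypothetical helper: true if the stream can be rewound with mark/reset. */
    static boolean canReplay(InputStream in) {
        return in.markSupported();
    }

    public static void main(String[] args) {
        InputStream inMemory = new ByteArrayInputStream(new byte[] {1, 2, 3});
        InputStream chained = new SequenceInputStream(
                new ByteArrayInputStream(new byte[] {1}),
                new ByteArrayInputStream(new byte[] {2}));

        System.out.println("ByteArrayInputStream: " + canReplay(inMemory)); // true
        System.out.println("SequenceInputStream:  " + canReplay(chained));  // false
    }
}
```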

To summarize, you are affected if you

  1. use versions 3.0.0 to 3.31.0,
  2. and upload streams using the above operations,
  3. and do not set a retry configuration at request level.

Workarounds

This problem was fixed in version 3.31.1. If you are using any of the affected versions, we recommend that you upgrade to version 3.31.1 or later.

If, for some reason, you cannot upgrade to version 3.31.1 or later, here are some other possible workarounds:

**Update**

We had previously stated that there might be a potential data corruption issue. Our initial fear was that the first attempt might fail and that subsequent retries would fail to reset the stream and therefore not upload the entire stream again, leading to missing data. After careful evaluation, we determined that data corruption does not occur.

Instead, the first retry fails with an exception:

Caused by: java.lang.RuntimeException: Stream {} does not support mark/reset, retries do not work

This is still far from ideal, but a hard, visible failure is preferable to silent data corruption.

**Update 2**

This problem was fixed in version 3.31.1.

mricken commented 8 months ago

This problem was fixed in version 3.31.1. If you are using any of the affected versions, we recommend that you upgrade to version 3.31.1 or later.