sebastianfeldmann / phpbu

PHP Backup Utility - Creates and encrypts database and file backups, syncs your backups to other servers or cloud services and helps you monitor your backup process
https://phpbu.de

Retry Uploads when one multipart failed #321

Open planetahuevo opened 2 years ago

planetahuevo commented 2 years ago

I am getting some errors with big uploads:

Exception 'phpbu\App\Backup\Sync\Exception' with message 'An exception occurred while uploading parts to a multipart upload. The following parts had errors:
    - Part 249: Error executing "UploadPart" on "https://s3.eu-central-003.backblazeb2.com/filenameedited.gz?partNumber=249&uploadId=editedc003_v0312018_t0025_u01663705241225"; AWS HTTP error: Server error: `PUT https://s3.eu-central-003.backblazeb2.com/filenameedited.gz?partNumber=249&uploadId=editedc003_v0312018_t0025_u01663705241225` resulted in a `500 Internal Server Error` response:
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Error>
        <Code>InternalError</Code>
        <Message>An internal  (truncated...)
     InternalError (server): An internal error occurred.  Please retry your upload. - <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Error>
        <Code>InternalError</Code>
        <Message>An internal error occurred.  Please retry your upload.</Message>
    </Error>

    - Part 253: Error executing "UploadPart" on "https://s3.eu-central-003.backblazeb2.com/filenameedited.gz?partNumber=253&uploadId=editedc003_v0312018_t0025_u01663705241225"; AWS HTTP error: Server error: `PUT https://s3.eu-central-003.backblazeb2.com/filenameedited.gz?partNumber=253&uploadId=editedc003_v0312018_t0025_u01663705241225` resulted in a `500 Internal Server Error` response:
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Error>
        <Code>InternalError</Code>
        <Message>An internal  (truncated...)
     InternalError (server): An internal error occurred.  Please retry your upload. - <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Error>
        <Code>InternalError</Code>
        <Message>An internal error occurred.  Please retry your upload.</Message>
    </Error>

    - Part 271: Error executing "UploadPart" on "https://s3.eu-central-003.backblazeb2.com/filenameedited.gz?partNumber=271&uploadId=editedc003_v0312018_t0025_u01663705241225"; AWS HTTP error: Server error: `PUT https://s3.eu-central-003.backblazeb2.com/filenameedited.gz?partNumber=271&uploadId=editedc003_v0312018_t0025_u01663705241225` resulted in a `500 Internal Server Error` response:
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Error>
        <Code>InternalError</Code>
        <Message>An internal  (truncated...)
     InternalError (server): An internal error occurred.  Please retry your upload. - <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Error>
        <Code>InternalError</Code>
        <Message>An internal error occurred.  Please retry your upload.</Message>
    </Error>

    - Part 297: Error executing "UploadPart" on "https://s3.eu-central-003.backblazeb2.com/filenameedited.gz?partNumber=297&uploadId=editedc003_v0312018_t0025_u01663705241225"; AWS HTTP error: Server error: `PUT https://s3.eu-central-003.backblazeb2.com/filenameedited.gz?partNumber=297&uploadId=editedc003_v0312018_t0025_u01663705241225` resulted in a `500 Internal Server Error` response:
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Error>
        <Code>InternalError</Code>
        <Message>An internal  (truncated...)
     InternalError (server): An internal error occurred.  Please retry your upload. - <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Error>
        <Code>InternalError</Code>
        <Message>An internal error occurred.  Please retry your upload.</Message>
    </Error>

    - Part 333: Error executing "UploadPart" on "https://s3.eu-central-003.backblazeb2.com/filenameedited.gz?partNumber=333&uploadId=editedc003_v0312018_t0025_u01663705241225"; AWS HTTP error: Server error: `PUT https://s3.eu-central-003.backblazeb2.com/filenameedited.gz?partNumber=333&uploadId=editedc003_v0312018_t0025_u01663705241225` resulted in a `500 Internal Server Error` response:
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Error>
        <Code>InternalError</Code>
        <Message>An internal  (truncated...)
     InternalError (server): An internal error occurred.  Please retry your upload. - <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Error>
        <Code>InternalError</Code>
        <Message>An internal error occurred.  Please retry your upload.</Message>
    </Error>

    - Part 377: Error executing "UploadPart" on "https://s3.eu-central-003.backblazeb2.com/filenameedited.gz?partNumber=377&uploadId=editedc003_v0312018_t0025_u01663705241225"; AWS HTTP error: Server error: `PUT https://s3.eu-central-003.backblazeb2.com/filenameedited.gz?partNumber=377&uploadId=editedc003_v0312018_t0025_u01663705241225` resulted in a `503 Service Unavailable` response:
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Error>
        <Code>ServiceUnavailable</Code>
        <Message>no tome (truncated...)
     ServiceUnavailable (server): no tomes available - <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Error>
        <Code>ServiceUnavailable</Code>
        <Message>no tomes available</Message>
    </Error>

    '
    in phar:///usr/local/bin/phpbu/Backup/Sync/AmazonS3v3.php:72

I don't think this is phpbu's fault, but it would be great if uploads could be retried when the error is a 503 or a 500. At the moment the whole backup fails because of this, and I am sure a retry after 5-10 seconds would succeed.

I think this is critical for working with medium-to-large backups. In my case this was a 3GB backup, so not that huge.
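
The retry described above could look roughly like the following. This is a minimal Python sketch, not phpbu's code or the AWS SDK's API: `UploadError`, `upload_part_with_retry`, and the `upload_part` callable are hypothetical names introduced for illustration, and the 5-second base delay simply follows the "5-10 seconds" suggested in this report.

```python
import random
import time

# Status codes treated as transient, per the request above (500 and 503).
RETRYABLE_STATUS = {500, 503}


class UploadError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed part upload."""

    def __init__(self, status, message=""):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status


def upload_part_with_retry(upload_part, part_number, max_attempts=5,
                           base_delay=5.0, sleep=time.sleep):
    """Call upload_part(part_number), retrying transient server errors.

    upload_part is any callable that uploads one part and raises
    UploadError on failure (an assumption of this sketch).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return upload_part(part_number)
        except UploadError as err:
            # Give up on non-transient errors or once attempts are exhausted.
            if err.status not in RETRYABLE_STATUS or attempt == max_attempts:
                raise
            # Exponential backoff (5s, 10s, 20s, ...) plus up to 1s of jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, 1))
```

Only the failed part is retried, so the parts that already uploaded are not sent again. The `sleep` parameter exists so the waiting can be replaced in tests.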

planetahuevo commented 2 years ago

Related question: will multipart upload be selected automatically when the file is bigger than 5GB? What is the size of the parts?
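
For a sense of scale on the part-size question, here is a small Python sketch of the arithmetic. It assumes the documented S3 multipart limits (parts of at least 5 MiB except the last, at most 10,000 parts per upload). The thread does not say which part size phpbu actually uses, so the 5 MiB default below is only an assumption, and `part_count` is a hypothetical helper.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

MIN_PART_SIZE = 5 * MIB   # S3 minimum size for every part except the last
MAX_PARTS = 10_000        # S3 maximum number of parts per multipart upload


def part_count(file_size, part_size=MIN_PART_SIZE):
    """Return how many parts a file of file_size bytes splits into."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size is below the S3 minimum of 5 MiB")
    # Integer ceiling division: the last part may be smaller than part_size.
    parts = (file_size + part_size - 1) // part_size
    if parts > MAX_PARTS:
        raise ValueError("too many parts; a larger part size is needed")
    return parts


print(part_count(3 * GIB))  # 615 parts at 5 MiB each
```

With these assumptions, a 3 GiB backup is split into several hundred parts, so a single transient 500 or 503 on any one of them is enough to fail the whole upload unless that part is retried.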

planetahuevo commented 2 years ago

This is what B2 recommends: https://www.backblaze.com/blog/b2-503-500-server-error/ And this is what S3 recommends: https://aws.amazon.com/premiumsupport/knowledge-center/http-5xx-errors-s3/#:~:text=The%20error%20code%20500%20Internal,high%2C%20exceeding%20the%20request%20rate.

I think customers using the S3 API won't have to worry about this discussion. :-)

When using the S3 API, Backblaze will not return that particular error code for that situation. That code asks customer code to get a new upload URL; since Amazon S3 has no other upload URL, the client side could not possibly "fix" anything, so the Backblaze S3 layer doesn't have the ability to return this particular HTTP response code. It is an anomaly of implementing the lower cost (to Backblaze) B2 API that didn't have the load balancer layer in it.

This is a side note (doesn't affect the answer): I am kind of concerned about the quality of client side customer code out there that cannot handle intermittent failures. The most conservative, best code uploading to Amazon S3 in the Amazon S3 datacenters (Backblaze not involved in any way, shape, or form) has to be able to handle rare failures where it retries the transmission when it gets an HTTP response that isn't a "200 - ok". And the difference between "rare" and "common" escapes me from a code standpoint. Put differently, it is more terrifying to me that the code MIGHT fail or corrupt data if the network experiences over 1% network failure than if the code always fails or always works. I hope that made sense.

I guess what I'm worried about is this scenario: some program out there does not handle any errors at all in its upload code, and what its authors are banking on is that if tonight's backup is totally corrupted due to 1 network error, tomorrow's backup will work because it will have zero network errors. And then what happens is that a bad network cable causing 50 network errors per day prevents them from ever getting an uncorrupted backup. I hope that made sense as a concern.

planetahuevo commented 2 years ago

@sebastianfeldmann any ideas?