elementor / wp2static-addon-s3

S3 deployment Add-on for WP2Static

Retry logic and more error reporting #41

Open petewilcock opened 3 years ago

petewilcock commented 3 years ago

Having real struggles with the addon at the moment... the deploy job keeps stuttering to a halt, but WP2Static shows it as processing forever. There's nothing in the WP2Static logs or PHP logs, and deleting and re-adding the deploy job will eventually process everything, so it can't be a fatal block. I think some retry logic or more verbose error logging would really help pin this down - currently I'm at a loss as to the ultimate cause.

leonstafford commented 3 years ago

How are you triggering the jobs in this case?

I'd expect that running via WP-CLI would throw some errors that aren't surfacing via the web. Sometimes I've found timeout logs in the web server/load balancer/system logs.

We could really use some better job handling, like SimplerStatic's WP Async Task usage, but even that has some issues.

Definitely need that kind of thing sorted before re-attempting async/parallel crawling.

petewilcock commented 3 years ago

Sadly I don't have SSH into my Fargate container, so I can't trigger the CLI from an actual command line - it's all via the UI.

Just experimenting with some try/catch around the S3->Put:

try {
    $result = $s3->putObject( $put_data );

    if ( $result['@metadata']['statusCode'] === 200 ) {
        // Record the successful upload in the deploy cache
        \WP2Static\DeployCache::addFile( $cache_key, $namespace, $hash );

        // Queue the path for CloudFront invalidation while under the path limit
        if ( $cf_max_paths >= count( $cf_stale_paths ) ) {
            $cf_key = $cache_key;
            // Strip a trailing 'index.html' but keep the trailing slash
            if ( 0 === substr_compare( $cf_key, '/index.html', -11 ) ) {
                $cf_key = substr( $cf_key, 0, -10 );
            }
            $cf_key = str_replace( ' ', '%20', $cf_key );
            array_push( $cf_stale_paths, $cf_key );
        }
    }
} catch ( S3Exception $e ) {
    // Assumes `use Aws\S3\Exception\S3Exception;` is imported in this file
    \WP2Static\WsLog::l( $e->getMessage() );
}

Should at least log out S3 errors and continue instead of dying silently. Of course, since implementing that I haven't seen it die :( I'm also keen to look at implementing s3 sync as an alternative to 'reupload everything not in the cache' - although just reading here I see v3 of the PHP SDK changes the behaviour and just uploads everything... will keep looking into it.

john-shaffer commented 3 years ago

We should definitely catch the exception. We can do some basic backoff-retry in the case of network errors, because failing outright can be messy.
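Something along these lines, maybe (a sketch only - the wrapper function name and the attempt/delay numbers are made up, not existing plugin code; if I remember right, the v3 SDK also has a 'retries' client option that may already cover some transient failures):

use Aws\S3\Exception\S3Exception;

// Sketch: retry the put a few times with exponential backoff before giving up.
// $s3 and $put_data are the same as in the snippet above; the function name
// and its defaults are hypothetical.
function put_object_with_retry( $s3, array $put_data, int $max_attempts = 3 ) {
    $attempt = 0;
    while ( true ) {
        try {
            return $s3->putObject( $put_data );
        } catch ( S3Exception $e ) {
            $attempt++;
            if ( $attempt >= $max_attempts ) {
                \WP2Static\WsLog::l(
                    'S3 put failed after ' . $attempt . ' attempts: ' . $e->getMessage()
                );
                throw $e;
            }
            sleep( 2 ** ( $attempt - 1 ) ); // back off 1s, 2s, 4s, ...
        }
    }
}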

In the case that we do fail entirely, how do we want to handle that? We can't leave the task "processing", but marking it "complete" is misleading. Should we introduce a "failed" state?

I think all sync does is compare modified times and MD5 hashes and upload any mismatches. It should be pretty straightforward to implement. You could probably leave the existing deploy logic mostly as-is, but before it runs, list your current objects and update the deploy cache with that data. Metadata and redirects might be trickier, as the current implementation includes those in the hash, but IIRC S3's ETag only covers the object body. You might need another column in the deploy cache table.
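For the listing step, something like this might work (rough sketch, untested; $s3, $bucket_name and $remote_path are placeholders for however the add-on already builds its client and options, and as noted the ETag won't line up with WP2Static's own hash, so exactly what gets written into the cache still needs deciding):

// Sketch: page through everything currently in the bucket before the
// normal deploy loop runs.
$paginator = $s3->getPaginator(
    'ListObjectsV2',
    [
        'Bucket' => $bucket_name,
        'Prefix' => $remote_path,
    ]
);

foreach ( $paginator as $page ) {
    foreach ( $page['Contents'] ?? [] as $object ) {
        $key  = $object['Key'];
        $etag = trim( $object['ETag'], '"' ); // MD5 of the body for non-multipart uploads
        // Record $key / $etag somewhere the deploy loop can check against,
        // e.g. an extra column in the deploy cache table as suggested above.
    }
}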