EMCECS / ecs-sync

ecs-sync is a bulk copy utility that can move data between various systems in parallel
Apache License 2.0

After the Put md5sum mismatch #42

Closed rpulikool closed 5 years ago

rpulikool commented 5 years ago

I am using ecs-sync to copy files from one object store to another and am getting the error below:

java.lang.RuntimeException: MD5 sum mismatch (CB0B4A92FE6025C55EAC8A7990329E0F != 5EAF476670396AA3829ECC9AF05650E1)
    at com.emc.ecs.sync.Md5Verifier.verify(Md5Verifier.java:67)
    at com.emc.ecs.sync.SyncTask.run(SyncTask.java:131)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

The object has versioned files in it. Even though I was able to copy the files manually, I still see this error when I verify the files.
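For readers unfamiliar with this kind of check, the error above comes from comparing an MD5 digest of the source data with one of the target data. The following is a minimal sketch of that idea in Python, not the actual `Md5Verifier` code from ecs-sync; the function names `md5_hex` and `verify` are hypothetical and chosen only for illustration.

```python
import hashlib
import io


def md5_hex(stream, chunk_size=1024 * 1024):
    """Return the upper-case hex MD5 of a binary stream, read in chunks."""
    digest = hashlib.md5()
    # Read fixed-size chunks so large objects are never held fully in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest().upper()


def verify(source_stream, target_stream):
    """Raise RuntimeError if the two streams do not share the same MD5."""
    source_md5 = md5_hex(source_stream)
    target_md5 = md5_hex(target_stream)
    if source_md5 != target_md5:
        # Message format mirrors the error in this report (an assumption,
        # not a copy of the tool's code).
        raise RuntimeError(f"MD5 sum mismatch ({source_md5} != {target_md5})")
    return source_md5


if __name__ == "__main__":
    # In-memory streams stand in for the source and target objects.
    print(verify(io.BytesIO(b"same bytes"), io.BytesIO(b"same bytes")))
```

A mismatch from a check like this only says that the two byte sequences being hashed differ; it does not say which side is wrong or why.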

twincitiesguy commented 5 years ago

This is difficult to answer without more information. I would need to see the configuration used (please scrub to remove any sensitive information or passwords) and understand the bucket configuration on both sides as well (is versioning enabled? what is the storage system? Is there any encryption involved?)

I assume you're using the AWS S3 plugin on both sides, or perhaps in combination with the ECS S3 plugin. Note that if you check the "Include Versions" box on both source and target, this will copy all versions of all objects and if any don't match in the target, it will overwrite them. If you are manually copying versions and verifying them with ecs-sync, this may not work, unless all of the versions and deletion markers are in the exact same order (ecs-sync uses an aggregated checksum to compare all of the versions at once).
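To illustrate why version order matters for an aggregated checksum, here is a minimal sketch in Python. It is not the ecs-sync implementation: the `Version` class, the `aggregate_checksum` function, and the way delete markers are folded into the digest are all assumptions made for illustration. The only point is that a digest computed over an ordered list of versions changes when the order changes, even if the set of versions is identical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """Hypothetical stand-in for one entry in an object's version history."""
    etag: str                    # per-version content checksum
    delete_marker: bool = False  # True if this entry is a deletion marker


def aggregate_checksum(versions):
    """Fold every version, in order, into a single MD5 digest."""
    digest = hashlib.md5()
    for version in versions:
        # Delete markers have no content, so a fixed token represents them.
        token = "DELETE-MARKER" if version.delete_marker else version.etag
        digest.update(token.encode("utf-8"))
        digest.update(b"\n")  # separator so adjacent tokens cannot merge
    return digest.hexdigest()


if __name__ == "__main__":
    source = [Version("aaa"), Version("", delete_marker=True), Version("bbb")]
    # Same entries, but the delete marker ends up last on the target.
    target = [Version("aaa"), Version("bbb"), Version("", delete_marker=True)]

    print(aggregate_checksum(source) == aggregate_checksum(list(source)))  # True
    print(aggregate_checksum(source) == aggregate_checksum(target))        # False
```

Under this model, versions copied by hand in a different sequence would fail verification even though every individual version matches.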

rpulikool commented 5 years ago

Thanks for the reply.

Below is a snippet of the configuration. I am copying the object from ECS-S3 to S3 (non-Amazon object storage). Versioning is enabled on both source and target, and no encryption is involved.

SyncOptions

2018-12-23 13:17:18 INFO [sync-pool-1-t-6] AwsS3Storage: [BNDLG_17_PRODUCTION/BNDLG_17_splty_PROD/MOV_splty/XMAS SIZZLE/FINAL/BUND_Splty_XMAS_Sizzle_BG.mov]: version history differs between source and target; re-placing target version history with that from source.

2018-12-23 13:17:27 INFO [sync-pool-1-t-6] EcsSync: O--R object BNDLG_17_PRODUCTION/BNDLG_17_splty_PROD/MOV_splty/XMAS SIZZLE/FINAL/BUND_Splty_XMAS_Sizzle_BG.mov failed 1 time (queuing for retry)
com.amazonaws.SdkClientException: Upload canceled
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:159)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

2018-12-23 13:17:34 INFO [sync-pool-1-t-2] EcsSync: O--R object BNDLG_17_PRODUCTION/BNDLG_17_splty_PROD/MOV_splty/XMAS SIZZLE/FINAL/BUND_Splty_XMAS_Sizzle_BG.mov failed 2 times (queuing for retry)
com.amazonaws.SdkClientException: Upload canceled
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:159)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

2018-12-23 13:17:42 WARN [sync-pool-1-t-10] SyncTask: O--! object BNDLG_17_PRODUCTION/BNDLG_17_splty_PROD/MOV_splty/XMAS SIZZLE/FINAL/BUND_Splty_XMAS_Sizzle_BG.mov failed
com.amazonaws.SdkClientException: Upload canceled
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:159)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

On the source, there is an object and a deleted object, both of which show up in the version listing.

rpulikool commented 5 years ago

I also found one more issue with copying: when there is a "%" in the filename, ecs-sync is not able to copy that file. Can you please check that as well?

twincitiesguy commented 5 years ago

Unfortunately, there is not enough information here to determine what the problem is. It looks like the upload was cancelled for some reason, but there is no additional info coming from the AWS SDK. You might try using the ECS S3 plugin for the target and turn off the smart client. That client should provide more info as to what exactly failed.

twincitiesguy commented 5 years ago

As for the percent in the filename, you should open a separate issue for this and include the (scrubbed) log. You might try using legacy signatures in the AWS plugin, or switching to the ECS S3 plugin as mentioned above.
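One common way a "%" in an object key causes trouble is inconsistent URL encoding or decoding of the key somewhere between client and server. Whether that is what happened here is not established in this thread; the sketch below only illustrates the general failure mode using Python's standard `urllib.parse`. The key name and the `encode_key` helper are hypothetical.

```python
from urllib.parse import quote, unquote


def encode_key(key):
    """Percent-encode an object key for use in a URL path, keeping '/'."""
    return quote(key, safe="/")


if __name__ == "__main__":
    # Hypothetical key whose name contains a literal "%20".
    key = "reports/100%20complete.mov"

    encoded = encode_key(key)
    print(encoded)  # the literal "%" is escaped as "%25"

    # Decoding exactly once recovers the original key.
    print(unquote(encoded) == key)

    # Decoding twice silently turns the literal "%20" into a space,
    # so the request would refer to a different key.
    print(unquote(unquote(encoded)))
```

If a key survives a round trip only when encoded and decoded the same number of times, a mismatch on either side can make an object appear missing or fail signature checks.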

twincitiesguy commented 5 years ago

Please try this with 3.3; it includes fixes to key decoding and version ordering that may affect this use case. Reopen this issue if the problem remains.