EMCECS / ecs-sync

ecs-sync is a bulk copy utility that can move data between various systems in parallel
Apache License 2.0

Difference in ECS S3 and S3 behaviour #23

Closed: holgerjakob closed this issue 6 years ago

holgerjakob commented 6 years ago

Hi all, I know this may not make much sense, but I figured one could use an S3 target, such as ECS or some generic S3 store, to back up a Centera.

A CAS to ECS S3 setup saves only the CDF information to ECS, not the files themselves. When using CAS to S3 (ECS or some other S3 provider), this results in size mismatch errors.

Since ecs-sync is a kind of generic migration platform, I thought I would ask for your insights. Is it normal that generic S3 and ECS S3 behave differently? I still like the idea of being able to make one last copy of the Centera content when someone moves away from the CAS protocol to a native S3 integration. The application takes all its data and rewrites it using S3, but many of our customers would like to keep a copy of the Centera content around just in case something happens. We could then install a small system and restore to that one. After a year, the bucket containing the CAS S3 backup could be deleted. This would avoid some of the CAS limitations that currently exist.

Best regards, Holger

holgerjakob commented 6 years ago

Hi all, any insight on the different behaviour, or on why only the metadata is being copied? Thanks, Holger

twincitiesguy commented 6 years ago

@holgerjakob, ecs-sync does not currently support migrating CAS to any protocol other than CAS. The only exceptions to this are when using the DX or CUA extraction filters (which require additional tools).

The reason for this limitation is that CAS is a multi-blob protocol, whereas every other supported protocol is single-blob. There is no natural way to map multiple blobs into a single blob.

We are working on a way to support CAS to S3, but at first, it will only support single-blob clips (which we believe covers 90% of the clips in the wild). Stay tuned for more updates.
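To illustrate the multi-blob versus single-blob distinction described above, here is a minimal Python sketch. The `Clip` class and the `to_single_object` helper are hypothetical and are not part of ecs-sync or the Centera SDK; they only model the idea that a clip carries a CDF plus any number of blobs, while an S3-style target stores exactly one payload per object.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Clip:
    """Hypothetical model of a CAS clip: one CDF (metadata) plus zero or more blobs."""
    clip_id: str
    cdf: bytes                                   # clip descriptor file (metadata only)
    blobs: List[bytes] = field(default_factory=list)


def to_single_object(clip: Clip) -> Optional[bytes]:
    """Return the payload for a one-object target, or None when no natural mapping exists."""
    if len(clip.blobs) == 1:
        return clip.blobs[0]                     # single-blob clip: the blob is the object
    return None                                  # zero or many blobs: ambiguous mapping


clips = [
    Clip("clip-a", b"<cdf/>", [b"report bytes"]),
    Clip("clip-b", b"<cdf/>", [b"mail body", b"attachment"]),
]

# Map each clip ID to its object payload, or None if it cannot be mapped.
mapped = {c.clip_id: to_single_object(c) for c in clips}
```

In this sketch `clip-a` maps cleanly to one object, while `clip-b` has two blobs and is left unmapped, which is the case the planned single-blob support would not cover.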

croaking commented 6 years ago

@holgerjakob, there are commercial products on the market that can do CAS to S3. I'm biased, as I work for Datadobi, but our founders were some of the original Centera developers, so we have a ton of experience doing it. Certainly reach out if you want to discuss it more. From a technical standpoint, it comes down to two questions: (1) does the application support S3, and (2) how will you update the application's database to replace the old ClipID with the new S3 object ID? We provide a CSV file that you can use for that. Anyway, that's just one approach to help get what you're looking for.
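As a rough sketch of the second point, the following Python example applies a ClipID-to-S3-key mapping to an application database. Everything here is an assumption for illustration: the CSV column names (`clip_id`, `s3_key`), the `documents` table, and its `location` column are hypothetical, since the thread does not describe the actual CSV layout or any particular application schema.

```python
import csv
import io
import sqlite3

# Hypothetical mapping file; the real column names and layout are not given in the thread.
MAPPING_CSV = """clip_id,s3_key
CLIPAAA111,archive/0001
CLIPBBB222,archive/0002
"""


def apply_mapping(conn, mapping_file):
    """Replace stored ClipIDs with S3 keys; return the number of rows updated."""
    updated = 0
    for row in csv.DictReader(mapping_file):
        cur = conn.execute(
            "UPDATE documents SET location = ? WHERE location = ?",
            (row["s3_key"], row["clip_id"]),
        )
        updated += cur.rowcount
    conn.commit()
    return updated


# Stand-in for the application's database (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, location TEXT)")
conn.executemany(
    "INSERT INTO documents (location) VALUES (?)",
    [("CLIPAAA111",), ("CLIPBBB222",), ("CLIPCCC333",)],
)

updated = apply_mapping(conn, io.StringIO(MAPPING_CSV))
locations = [r[0] for r in conn.execute("SELECT location FROM documents ORDER BY id")]
```

Rows whose ClipID does not appear in the mapping (here `CLIPCCC333`) are left untouched, so they can be found and reviewed afterwards.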