clyso / chorus

s3 multi provider data lifecycle management
Apache License 2.0

Command to check an already synced bucket? #38

Open PC-Admin opened 3 months ago

PC-Admin commented 3 months ago

I'm not sure how to get this working. I'm trying to check whether one of my buckets was synced. (In the dashboard it does seem to be.)

But I am running into the following errors:

mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 check main follower -b test-bucket5
Checking files in bucket test-bucket5 ...
🪣 BUCKET       | Match  | MissSrc       | MissDst       | Differ        | Error
FATA[0000] unable to check bucket                        error="rpc error: code = InvalidArgument desc = InvalidArg"
mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 check main follower --check-bucket test-bucket5
Checking files in bucket test-bucket5 ...
🪣 BUCKET       | Match  | MissSrc       | MissDst       | Differ        | Error
FATA[0000] unable to check bucket                        error="rpc error: code = InvalidArgument desc = InvalidArg"

What's the correct syntax I'm looking for here?

PC-Admin commented 3 months ago

Specifying the user account as well seems to get me a little further, although it times out:

mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 check main follower --check-bucket test-user:test-bucket5
Checking files in bucket test-user:test-bucket5 ...
🪣 BUCKET                 | Match        | MissSrc       | MissDst       | Differ        | Error
FATA[0020] unable to check bucket                        error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 146.118.58.214:9670: i/o timeout\""

arttor commented 3 months ago

Probably the bucket is too big and chorus was not able to list all objects in time.

chorctl check is just a wrapper around the rclone check command. I think it was a bad idea to add this command to chorctl. I will have to rewrite it or remove it, because rclone check consumes a lot of RAM, takes a long time on big buckets, and can crash the chorus worker instance.

Please try using rclone check directly instead of chorctl check to avoid the timeout error.
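
For anyone following the suggestion above, here is a minimal sketch of a helper that builds the `rclone check` command line for one bucket. It assumes rclone remotes named `main` and `follower` are already configured and point at the same two storages chorus uses; those remote names and the function name `rclone_check_cmd` are assumptions for illustration, not something chorus or chorctl provides. The script only prints the commands so you can review them before running anything.

```python
import shlex


def rclone_check_cmd(bucket, src_remote="main", dst_remote="follower", one_way=False):
    """Build the argv for `rclone check` on a single bucket.

    src_remote and dst_remote are ASSUMED rclone remote names; replace them
    with whatever is defined in your own rclone configuration.
    """
    cmd = ["rclone", "check", f"{src_remote}:{bucket}", f"{dst_remote}:{bucket}"]
    if one_way:
        # --one-way only verifies that source objects exist and match on the
        # destination; extra objects on the destination are not reported.
        cmd.append("--one-way")
    return cmd


if __name__ == "__main__":
    # Bucket name taken from the thread; extend the list as needed.
    for bucket in ["test-bucket5"]:
        print(shlex.join(rclone_check_cmd(bucket, one_way=True)))
```

Running the printed command by hand (or passing the list to `subprocess.run`) keeps the RAM and time cost of the check on your own machine instead of on the chorus worker.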

PC-Admin commented 3 months ago

Thanks for getting back to me so quickly. It's strange as these are quite small buckets with only 50-100 objects in them...

I've created a list of "different" buckets using rclone check, but now I'm wondering: what is the most efficient way to "continue" replication on these buckets?

When I try to re-add a user-level rule, it seems to replicate only buckets that are new, not buckets that have new objects in them:

mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 repl delete-user -u test-user -f main -t follower
mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 repl add-user -u test-user -f main -t follower
mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 repl
NAME                                      PROGRESS                 SIZE                  OBJECTS     EVENTS     PAUSED     LAG             AGE
...
test-user:test-bucket3:main->follower     [##########] 100.0 %     4.9 GiB/4.9 GiB       50/50       0/0        false      11.607007ms     23h9m

Here we see that test-bucket3, to which I've added 50 objects (making 100 in total), doesn't actually get updated this way. test-bucket7, which was new, did get updated, however.

When I try to re-add a bucket-level rule, it seems to start from scratch and re-transmit every single object in that bucket. :S

mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 repl delete -u test-user -b "test-bucket4" -f main -t follower
mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 repl add -u test-user -b "test-bucket4" -f main -t follower
mcollins1@storage-13-09002:~/chorus$ ./tools/chorctl/chorctl --address=storage-13-09004:9670 repl
NAME                                      PROGRESS                 SIZE                  OBJECTS     EVENTS     PAUSED     LAG             AGE
test-user:test-bucket4:main->follower     [#         ]  19.1 %     1.9 GiB/9.7 GiB       19/100      0/0        false      45.017207ms     7s
...

Here we see it starting again with test-bucket4, which already had 50 objects in it that I would have liked to skip...

arttor commented 3 months ago

Sorry, I didn't get your question. But when you start replication, chorus lists all objects from the source and then tries to sync each object to the destination. If an object already exists in the destination with the same size and ETag, it will not be copied.
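
The skip rule described above can be sketched roughly as follows. This is an illustration of the idea only, not chorus's actual code: `ObjectInfo` and `objects_to_copy` are hypothetical names, and the listings are plain in-memory lists rather than real S3 responses. If the rule works as described, a restarted replication would still count every listed object in its progress, while only the missing or changed ones would actually be transferred.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectInfo:
    """Hypothetical record for one listed object (not a chorus type)."""
    key: str
    size: int
    etag: str


def objects_to_copy(source, destination):
    """Return the keys of source objects that still need to be copied.

    An object is skipped only when the destination already holds the same
    key with an identical size and ETag.
    """
    dst_by_key = {obj.key: obj for obj in destination}
    to_copy = []
    for src in source:
        dst = dst_by_key.get(src.key)
        if dst is not None and dst.size == src.size and dst.etag == src.etag:
            continue  # already in sync, nothing to transfer
        to_copy.append(src.key)
    return to_copy


if __name__ == "__main__":
    source = [ObjectInfo("a", 1, "x"), ObjectInfo("b", 2, "y"), ObjectInfo("c", 3, "z")]
    destination = [ObjectInfo("a", 1, "x"), ObjectInfo("b", 2, "stale")]
    # "a" matches and is skipped; "b" differs by ETag; "c" is missing.
    print(objects_to_copy(source, destination))
```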

PC-Admin commented 3 months ago

That's good to know, thank you. I'll leave this one open for when you manage to remove chorctl check.