juicedata / juicefs

JuiceFS is a distributed POSIX file system built on top of Redis and S3.
https://juicefs.com
Apache License 2.0

Metadata backup failed on large volume #5276

Open polyrabbit opened 3 weeks ago

polyrabbit commented 3 weeks ago

We have a volume with 500M+ inodes, and the metadata backup always fails with the following error:

2024/11/04 18:02:41.794812 juicefs[43] <WARNING>: backup metadata failed: GC life time is shorter than transaction duration, transaction starts at 2024-11-04 17:46:12.324 +0800 CST, GC safe point is 2024-11-04 17:52:34.174 +0800 CST [backup.go:84]

dongjiang1989 commented 3 weeks ago

Which metadata engine is used: TiKV, Redis, or SQL?

polyrabbit commented 3 weeks ago

TiKV. I suppose the GC safe point shows up a lot with the TiKV engine.

davies commented 2 weeks ago

@polyrabbit Can you try #5080?

polyrabbit commented 2 weeks ago

Unfortunately, #5080 still fails with:

2024/11/08 11:01:39.893074 juicefs[50149] <FATAL>: GC life time is shorter than transaction duration, transaction starts at 2024-11-08 10:50:43.874 +0800 CST, GC safe point is 2024-11-08 10:51:34.174 +0800 CST [main.go:31]

But this time it ran longer (13+ min) than before. I suppose there is another transaction that stays open too long?

polyrabbit commented 2 weeks ago

Update: a second test works now. The progress shows it needs 10+ hours to finish, so I'll wait to see if it succeeds tomorrow.

The difference between the two tests is that for the first I rebased #5080 this morning, while for the second I cherry-picked #5080. I suppose there are some conflicts between those commits.

Also, I noticed that the backup spends a lot of time sorting large directories: https://github.com/juicedata/juicefs/blob/c90a175d323a4f4593b3eaff7c750d3611417b28/pkg/meta/tkv.go#L2897. Is that necessary?

polyrabbit commented 2 weeks ago

It took 7+ hours to back up 318039153 files.

davies commented 1 week ago

We are working on a faster dump into a binary format and will let you know when it's ready.

polyrabbit commented 1 week ago

Why not consider merging #5080? Does it have any critical drawbacks? I suppose stream scan also benefits other cases.