archiecobbs / s3backer

FUSE/NBD single file backing store via Amazon S3

Feature: treat 1st level folders as S3 prefixes (prefix automation) #166

Closed · solracsf closed this 2 years ago

solracsf commented 2 years ago

As one knows, splitting files between prefixes is a good practice: https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html

This is already possible in s3backer thanks to the --prefix option, or (recommended) by specifying the bucket as mybucket/foo/bar. But this is a manual process, and one must create several mounts (correct me if I'm wrong), one per prefix.

Instead, my proposal is some kind of prefix automation:

1 / by default, keep the same behavior as described above

2 / introduce a new option like --autoFirstLevelPrefix that would activate the following behavior: treat every folder at the 1st level of the filesystem mount (suppose the mount is /mnt) as a prefix; so, for /mnt/photos, /mnt/movies and /mnt/docs, these would be passed to S3 as prefixes in requests: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-prefixes.html

I can't really evaluate the work needed to accomplish this, or whether the idea sounds odd, but on large filesystems this (or some better form of) prefix automation would definitely be a very good option to optimize AWS S3 performance.

The reason is that one can send 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix in an Amazon S3 bucket, and there is no limit to the number of prefixes a bucket can have. https://aws.amazon.com/premiumsupport/knowledge-center/s3-request-limit-avoid-throttling/

archiecobbs commented 2 years ago

Just my opinion here...

I think this is out of scope for s3backer.

Besides, this is something that could easily be scripted, e.g.:

#!/bin/bash

# Mount each immediate subdirectory of $1 via s3backer, one prefix per subdir
find "${1}" -mindepth 1 -maxdepth 1 -type d -print | while read -r DIR; do
    PREFIX=`basename "${DIR}"`
    s3backer -F /etc/s3backer.conf mybucket/"${PREFIX}" "${DIR}"
done
solracsf commented 2 years ago

Yes, but this requires several mounts. Imagine 20+ mounts, each one requiring its own resources and each needing to be managed separately... this could be a lot. Hence my suggestion, where one and only one instance would manage it, like one can do programmatically with some PHP libraries for prefixes or even multi-bucket distribution.

archiecobbs commented 2 years ago

OK, it sounds like two different things are being conflated here...

Correct me if I'm wrong: The main problem you are trying to solve is to get around the limit of 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second.

OK, so one way to do that is to spread s3backer's data over multiple prefixes in the same bucket.

One way to do that is this "prefix automation" idea which you are suggesting.

Fine... but there are probably much simpler ways.

Let's go back to the original problem: the limits on requests per second.

We already have a similar "spreading" mechanism in place to reduce contention, namely the --blockHashPrefix flag.

Today the --blockHashPrefix flag prepends a "random" 8-digit hex prefix like 3e09fab2- to the block number, so the full object name ends up being something like 3e09fab2-00000001.

If instead of using a dash to separate the hash and the block number, it used a slash, then this would spread the blocks across a bunch of different prefixes. E.g. the full object name would be something like 3e09fab2/00000001.

This would be an incompatible change and so would require a new flag.
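
To make the naming concrete, here is a minimal illustrative sketch (assumed formatting code, not the actual s3backer implementation) of how such an object name could be built; only the separator differs between the two schemes:

/*
 * Illustrative sketch only -- not s3backer's actual code.
 * With SEP "-" the name is "3e09fab2-00000001"; with SEP "/" the same
 * blocks end up spread across many different S3 prefixes.
 */
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

#define SEP "/"                     /* "-" today; "/" per the patch below */

int
main(void)
{
    uint32_t hash = 0x3e09fab2;     /* hash prefix derived from the block number */
    uint64_t block_num = 1;
    char name[64];

    snprintf(name, sizeof(name), "%08" PRIx32 "%s%08" PRIx64, hash, SEP, block_num);
    printf("%s\n", name);           /* prints "3e09fab2/00000001" */
    return 0;
}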

But if you'd like to just test this, try checking out the master branch and then applying this patch:

diff --git a/http_io.c b/http_io.c
index fe96e23..344497e 100644
--- a/http_io.c
+++ b/http_io.c
@@ -96,7 +96,7 @@
                                       + S3B_BLOCK_NUM_DIGITS + 2)

 /* Separator string used when "--blockHashPrefix" is in effect */
-#define BLOCK_HASH_PREFIX_SEPARATOR "-"
+#define BLOCK_HASH_PREFIX_SEPARATOR "/"

 /* Bucket listing API constants */
 #define LIST_PARAM_MARKER           "marker"
solracsf commented 2 years ago

Awesome! I'll give it a try, but Delimiter and Prefix should be passed along with the query; is this the case?

Because there are folders and there are prefixes, but only prefixes allow bypassing the request limit.

See https://aws.amazon.com/premiumsupport/knowledge-center/s3-prefix-nested-folders-difference/

> Note: The folder structure might not indicate any partitioned prefixes that support request rates.

Actually, the patch produces folders, I believe (I'm not 100% sure on this, just "reading" the S3 interface...)

[screenshot: S3 console view of the bucket after applying the patch]

archiecobbs commented 2 years ago

The Delimiter and Prefix parameters are only used for listing queries. I think it should work.
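
For illustration, here is a minimal sketch (assumed URL construction, not s3backer's actual request code) of where these parameters appear: ordinary block reads and writes put the full object key in the URL path, while prefix and delimiter only show up as query parameters on listing requests:

/* Illustrative sketch only -- not s3backer's actual request code. */
#include <stdio.h>

int
main(void)
{
    const char *bucket = "mybucket";            /* hypothetical bucket name */
    const char *key = "3e09fab2/00000001";      /* a block object key */
    char url[256];

    /* GET/PUT of one block: the "/" is simply part of the object key */
    snprintf(url, sizeof(url), "https://%s.s3.amazonaws.com/%s", bucket, key);
    printf("object request:  %s\n", url);

    /* Bucket listing: prefix and delimiter are query parameters */
    snprintf(url, sizeof(url), "https://%s.s3.amazonaws.com/?prefix=%s&delimiter=%s",
      bucket, "3e09fab2/", "/");
    printf("listing request: %s\n", url);
    return 0;
}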

"Folders" are just an illusion. S3 only stores objects in a flat namespace. But the Amazon web console treats the / character in an object name as if it were special and creates "Folders" out of thin air. And S3 now also apparently applies traffic limits based on which /-based subtree the object is in.

solracsf commented 2 years ago

Ok, so, in your opinion, the patch at https://github.com/archiecobbs/s3backer/issues/166#issuecomment-1025231802 just works out of the box? The screenshot above was produced after the patch was applied, in a dedicated bucket.

archiecobbs commented 2 years ago

> Ok, so, in your opinion, the patch at #166 (comment) just works out of the box? The screenshot above was produced after the patch was applied, in a dedicated bucket.

I would say yes it is supposed to just work out of the box, based on my understanding of how Amazon is applying their traffic limits.

But it's still an untested theory at this point...

solracsf commented 2 years ago

When you read this, you can understand why this seems confusing 😆 https://stackoverflow.com/questions/52443839/s3-what-exactly-is-a-prefix-and-what-ratelimits-apply