how to use IO500 against S3 #54

Closed: Frankgad closed this issue 2 years ago

Frankgad commented 2 years ago

Hello. This PR adds the script needed to prepare all the necessary packages to run IO500 against S3-compatible storage. A sample config-s3.ini and a readme-s3.md are also provided for more information.
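
For anyone wanting to try it, the run itself would presumably be launched like any other io500 run, pointing the binary at the S3 config; the process count below is only illustrative:

    # illustrative invocation; config-s3.ini carries the S3 endpoint and credentials
    mpiexec -np 8 ./io500 config-s3.ini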

seattleplus commented 2 years ago

What does it mean to run io500 on S3 when the S3 protocol doesn't support shared writes to a single file? How are the shared-file update benchmarks being implemented?

JulianKunkel commented 2 years ago

It depends on the underlying library. The shared file can be implemented using "multipart upload" (MPU): each process uploads a part, and then S3 recreates a single object by stitching the pieces together (using part IDs). The alternative libs3 implementation is used for benchmarking purposes and implements each write as an independent object in a single bucket.
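
To make this concrete, the multipart flow looks roughly like the sketch below, written against the generic AWS CLI rather than the plugin's actual code (bucket and key names are made up):

    # one process creates the upload and obtains an UploadId
    aws s3api create-multipart-upload --bucket testbucket --key sharedfile

    # the UploadId is distributed to all processes; each uploads its own part
    # (every part except the last must be at least 5 MiB) and records the ETag
    aws s3api upload-part --bucket testbucket --key sharedfile \
        --part-number 1 --upload-id "$UPLOAD_ID" --body part-1.bin

    # finally, one process stitches the parts together into a single object;
    # parts.json lists the PartNumber/ETag pairs collected above
    aws s3api complete-multipart-upload --bucket testbucket --key sharedfile \
        --upload-id "$UPLOAD_ID" --multipart-upload file://parts.json

Note that the parts can come from different clients, as long as they all use the same UploadId and credentials.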

seattleplus commented 2 years ago

In the MPU case, how many objects are viewable at the end of the run? As far as I know, MPU only works from a single client, but it would be interesting if distributed clients could all upload their individual parts and then have a single client combine all the parts into a single object... is that possible?

If the benchmark is to create a single file but multiple objects are created instead, I would assume that is OK as long as the namespace only shows a single object and CRUD and R/W of any size can be applied to the final single object... but it sounds like this may not always be the case with S3?

carlilek commented 2 years ago

I'm going to test this with a Vast system that I have already benchmarked using NFS... as soon as I figure out how to do that. It will be interesting to see how the results vary.

JulianKunkel commented 2 years ago

> In the MPU case, how many objects are viewable at the end of the run? As far as I know, MPU only works from a single client, but it would be interesting if distributed clients could all upload their individual parts and then have a single client combine all the parts into a single object... is that possible?

AFAIK that was the idea behind one of the S3 plugins. Unfortunately, I wasn't able to make it runnable, as it would probably require an early version of the client library. The code is still there. I developed the libs3 plugin.

> If the benchmark is to create a single file but multiple objects are created instead, I would assume that is OK as long as the namespace only shows a single object and CRUD and R/W of any size can be applied to the final single object... but it sounds like this may not always be the case with S3?

Yes, indeed they look like a single object at the mdtest/md-workbench/ior level.

carlilek commented 2 years ago

A couple of notes on this, starting out:

carlilek commented 2 years ago

Another question I have about this is how find is, or would be, implemented. I see that it's disabled in the config-s3.ini file, and I haven't attempted to enable it yet.
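
For reference, the relevant excerpt presumably looks something like the following; this is a sketch assuming the standard io500 ini options, not a verbatim copy of the file:

    # hypothetical excerpt from config-s3.ini; noRun skips a phase entirely
    [find]
    noRun = TRUE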

carlilek commented 2 years ago

> If the benchmark is to create a single file but multiple objects are created instead, I would assume that is OK as long as the namespace only shows a single object and CRUD and R/W of any size can be applied to the final single object... but it sounds like this may not always be the case with S3?

> Yes, indeed they look like a single object at the mdtest/md-workbench/ior level.

I don't think this is the case, based on a listing of my bucket. Example:

...
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-587835040-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-587882048-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-587929056-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-587976064-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-588023072-47008
2022-04-20 15:04        47008  s3://ior201/bu_datafiles_iorbthard_fileb-58807008-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-588070080-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-588117088-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-588164096-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-588211104-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-588258112-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-588305120-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-588352128-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-588399136-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-588446144-47008
2022-04-20 15:05        47008  s3://ior201/bu_datafiles_iorbthard_fileb-588493152-47008
...
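
For what it's worth, the name suffixes appear to encode -<byte offset>-<length>: consecutive offsets differ by exactly 47008 (e.g. 587882048 - 587835040 = 47008), which matches ior-hard's fixed 47008-byte transfer size, so it looks like every individual write becomes its own object under this plugin.
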
carlilek commented 2 years ago

Another minor thing, at the end of the stdout:

    WARNING: S3 S3_final:556 (path:ior201) "ErrorBucketNotEmpty": The bucket you tried to delete is not empty.

>s3cmd -c /root/kc.s3cfg ls s3://ior201
2022-04-20 14:59            0  s3://ior201/datafilesmdtesteasytestdir00
2022-04-20 15:09            0  s3://ior201/datafilesmdtesthardtestdir00
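
As a workaround, a manual cleanup along these lines should do it (standard s3cmd commands, reusing the config file from above):

    # delete the leftover zero-byte objects, then remove the now-empty bucket
    s3cmd -c /root/kc.s3cfg del --recursive --force s3://ior201
    s3cmd -c /root/kc.s3cfg rb s3://ior201
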
seattleplus commented 2 years ago

> Yes, indeed they look like a single object at the mdtest/md-workbench/ior level.
>
> I don't think this is the case, based on a listing of my bucket. Example:

Right, AFAIK it is impossible to create a single object from multiple clients with S3... it would require a distributed virtualization layer, similar to PLFS, to create the virtualized view. I can't find anyone who has succeeded, anyway, but maybe it is possible to do something tricky to make the S3 backend think there is a single client when in fact there are multiple clients.

In this sense, I would argue any S3 integration should simply skip ior-hard-write (although it could create the object and then run ior-hard-read). My concern is that if S3 can implement a single file as multiple files, then other file systems may decide to do the same thing. Or, at the least, we could state that a run that does not use a single file is ineligible for the IO500 list (or simply records a 0 for ior-hard-write).

carlilek commented 2 years ago

@seattleplus I'm not sure what the value of S3 integration in the benchmark, or in the IO500 overall, would be if some of the tests are considered invalid or not useful.

In any case, after testing today with an equivalent node/process count, S3 was much, much slower overall (including ior_hard) than NFS.

Frankgad commented 2 years ago

Thank you all for your replies and the fruitful discussion. Thanks @carlilek for testing this.

seattleplus commented 2 years ago

So this is interesting work, and it is definitely valuable to have S3 support in IOR/MDTest, but I guess I don't understand why we would add this directly to IO500. For example, DAOS has its own KV API that it integrates with IOR/MDTest directly, so I question why we want to add S3 directly to IO500.

Frankgad commented 2 years ago

S3 is a de facto standard storage API, offered by major cloud providers like Google and IBM as well as by various storage vendors. From a client's point of view, it would be interesting to compare the different offerings using a full-fledged benchmark like IO500. On the other hand, integrating this into IO500 will definitely broaden its scope of usage and increase its user base.

seattleplus commented 2 years ago

I'm very well aware of S3 and its widespread use; my question is twofold:

  1. Layering - Why is the io500 repo the right place for supporting S3 with IOR/mdtest, instead of those tools' own repos?
  2. IO500 Eligibility - On the surface, it does not appear to me that S3 would be eligible for the IO500 lists, or it would receive a 0 on certain tests. Given this, maybe io500 isn't the right place for this until that is resolved.

JulianKunkel commented 2 years ago

We discussed this in today's meeting. We concluded that it is useful to include descriptions and samples that help other users run the IO500 benchmark, even though in this case a submission won't be fully compliant. The patch's focus is on IO500, so it fits the repo.

Please change the patch as follows: put everything into the directory contrib/s3/, with a README.txt. There, clearly describe that S3 runs are not compliant because find is not supported yet, but that this is still useful for testing; since the setup is complicated, people might find it helpful. (Hopefully, in the long run, IO500-compliant runs will be possible with S3.)

Additional files in that directory could be s3.ini and prepare-s3.sh.
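
In other words, something like this (file roles as discussed above):

    contrib/s3/
        README.txt      # states that S3 runs are not yet IO500-compliant (find is unsupported)
        s3.ini          # sample configuration for S3 runs
        prepare-s3.sh   # prepares the packages needed to run against S3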

Dean and Jay, since you missed the meeting, please share your thoughts.

Frankgad commented 2 years ago

Thank you for your feedback. The patch was moved to contrib/s3 as requested.

Frankgad commented 2 years ago

Hi, the prepare-s3.sh script was updated as requested, and I also made some slight improvements to the prepare.sh script.