Closed psavva closed 3 months ago
Thank you for the request. The limitations on which POSIX operations are supported are not in the CSI driver but in Mountpoint itself, and they are documented here: https://github.com/awslabs/mountpoint-s3/blob/main/doc/SEMANTICS.md. I am going to close this issue and suggest you open a new issue in the mountpoint-s3 repo: https://github.com/awslabs/mountpoint-s3.
While making rsync just work does sound like a great feature, it seems unlikely that Mountpoint will implement things like permissions, because doing so doesn't align with the tenets in that SEMANTICS doc. There is no close analog in S3 to Unix filesystem permissions, and emulating them would cost performance. It might make sense to make this opt-in, and that is worth a discussion, but the mountpoint-s3 repo is the best place to have it.
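A quick way to see which operations a particular mount rejects is to try them directly. The following is a minimal Python sketch, not part of Mountpoint or the CSI driver; the function name `probe_posix_ops` and the probe file names are made up for illustration. It only reports what the operating system returns, so the results depend on the mount and its options.

```python
import errno
import os
import tempfile


def probe_posix_ops(directory):
    """Try a few POSIX operations in `directory` and report how each one ends."""
    results = {}
    # Hypothetical scratch file names used only by this probe.
    src = os.path.join(directory, "probe-src.tmp")
    dst = os.path.join(directory, "probe-dst.tmp")

    def attempt(name, operation):
        try:
            operation()
            results[name] = "ok"
        except OSError as exc:
            # EPERM reads "Operation not permitted"; ENOSYS reads "Function not implemented".
            code = errno.errorcode.get(exc.errno, str(exc.errno))
            results[name] = f"{code}: {exc.strerror}"

    def create():
        with open(src, "w") as handle:
            handle.write("probe")

    attempt("create", create)
    attempt("chmod", lambda: os.chmod(src, 0o600))
    attempt("rename", lambda: os.rename(src, dst))

    # Best-effort cleanup; deletion may itself be unsupported on some mounts.
    for path in (src, dst):
        try:
            os.remove(path)
        except OSError:
            pass
    return results


if __name__ == "__main__":
    # On an ordinary local filesystem every step is expected to succeed.
    with tempfile.TemporaryDirectory() as scratch:
        print(probe_posix_ops(scratch))
```

Pointing the function at a directory inside the mounted bucket instead of a temporary directory shows which of the three steps fail there, and with which error code.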
Problem Description
When attempting to use rsync for file synchronization between a Kubernetes pod and an AWS S3 bucket mounted via the AWS S3 CSI driver, several challenges arise. The primary issues include rsync performing filesystem operations that are not supported by S3, such as permission setting and atomic renaming. This results in errors like "Operation not permitted" and "Function not implemented," complicating the use of rsync for data synchronization tasks.
Desired Solution
I propose the development of enhanced support for POSIX-like filesystem operations within the S3 service or specifically within the AWS S3 CSI driver to better accommodate file synchronization tools like rsync. The solution could include:
- [...] rsync filesystem operations to be compatible with S3 object storage behaviors.
- [...] rsync's functionality.
- [...] rsync expectations and common operation pitfalls, especially around file renaming and permissions.
Alternatives Considered
To address these challenges, I have explored:
- [...] s3 sync command for synchronization, which lacks some of rsync's advanced features and efficiency.
- [...] rsync command-line options to mitigate errors, which has not fully resolved the underlying compatibility issues.
Additional Context
Seamless integration of rsync with S3 would greatly benefit a wide array of applications, from backup systems to dynamic content management for web services, by simplifying data synchronization processes. Enhancing S3's compatibility with rsync would leverage S3's storage capabilities in distributed systems like Kubernetes, where efficient and reliable data synchronization is a frequent requirement.
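For reference, the kind of rsync option tuning mentioned under Alternatives Considered might look like the sketch below. It is an illustration only: the helper name `build_rsync_command` and the example paths are hypothetical, the flags are standard rsync options chosen to skip permission, ownership, and timestamp updates and to avoid the temporary-file-then-rename step, and, as noted above, this kind of tuning did not fully resolve the compatibility issues.

```python
import shlex


def build_rsync_command(source, destination):
    """Assemble an rsync invocation that avoids metadata updates and temp-file renames."""
    return [
        "rsync",
        "--recursive",
        # Do not try to set permissions, ownership, or modification times.
        "--no-perms", "--no-owner", "--no-group",
        "--no-times", "--omit-dir-times",
        # Write directly to the destination file instead of a temp file plus rename.
        "--inplace",
        # Timestamps are not preserved, so compare files by size only.
        "--size-only",
        source,
        destination,
    ]


if __name__ == "__main__":
    # Hypothetical paths: a local data directory and a directory on the mounted bucket.
    command = build_rsync_command("/data/", "/mnt/s3/backup/")
    print(shlex.join(command))
```

The function only builds the argument list; passing it to `subprocess.run` would execute it. Whether the resulting transfer succeeds still depends on what the mount supports, for example overwriting existing objects.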