GoogleCloudPlatform / gcsfuse

A user-space file system for interacting with Google Cloud Storage
https://cloud.google.com/storage/docs/gcs-fuse
Apache License 2.0

Buckets with retention policy defined #355

Open epishova opened 5 years ago

epishova commented 5 years ago

gcsfuse doesn't work correctly with a bucket that has a retention policy defined on it.

$ mkdir mnt_bucket
$ gcsfuse --dir-mode 755  --file-mode 777 --implicit-dirs --debug_fuse gcs_bucket_with_retention ./mnt_bucket
Using mount point: /home/oracle/mnt_bucket
Opening GCS connection...
Opening bucket...
Mounting file system...
File system has been successfully mounted.
$ ls -l ./mnt_bucket/
total 3430
-rwxrwxrwx. 1 oracle dba       0 Oct 21 22:07 test.txt
$ echo Hello > ./mnt_bucket/test1.txt
$ ls -l ./mnt_bucket/
total 3431
-rwxrwxrwx. 1 oracle dba       0 Oct 21 22:07 test.txt
-rwxrwxrwx. 1 oracle dba       6 Oct 22 15:11 test1.txt
$ cat mnt_bucket/test1.txt 
Hello
cat: mnt_bucket/test1.txt: Input/output error

bjornleffler commented 5 years ago

Thanks for reporting this. I'll have a look.

danking commented 4 years ago

Hi! I think we just encountered this issue today. Do you have any pointers as to where the code might be going wrong? We could perhaps take a look and take a stab at a fix.

avidullu commented 2 years ago

Reproduced this. Because of the retention policy, no object can be deleted or overwritten once it has been created, and that includes temporary objects (e.g. vim swap files, filesystem tmp files). This causes issues with GCSFuse: overwriting a file or replacing its content won't work, because gcsfuse tries to recreate the object on such changes, which the bucket retention policy does not allow.

To demonstrate with the commands from the first comment (the issue report), notice that an "echo Hello > ret_file.txt" goes through multiple steps:
a. Create the file.
b. Flush the file.
c. Write the file.
d. Flush the file.

In a bucket with a retention policy, once step b has happened, step c cannot happen because it would violate the retention policy. Hence this request is extremely hard to incorporate.

fuse_debug: 2022/04/21 11:08:20.194275 Op 0x00000007 connection.go:416] <- LookUpInode (parent 3, name "ret_file.txt", PID 24714)
debug_fs: 2022/04/21 11:08:20.242206 LookUpInode(3, "ret_file.txt"): no such file or directory
fuse_debug: 2022/04/21 11:08:20.242292 Op 0x00000007 connection.go:500] -> Error: "no such file or directory"

a. ======================================================================================
fuse_debug: 2022/04/21 11:08:20.242383 Op 0x00000008 connection.go:416] <- CreateFile (parent 3, name "ret_file.txt", PID 24714)
debug_fs: 2022/04/21 11:08:20.388708 CreateFile(3, "ret_file.txt"):
fuse_debug: 2022/04/21 11:08:20.388784 Op 0x00000008 connection.go:498] -> OK (inode 4)

b. ======================================================================================
fuse_debug: 2022/04/21 11:08:20.388984 Op 0x00000009 connection.go:416] <- FlushFile (inode 4, PID 24714)
debug_fs: 2022/04/21 11:08:20.389122 FlushFile(4):
fuse_debug: 2022/04/21 11:08:20.389156 Op 0x00000009 connection.go:498] -> OK ()
fuse_debug: 2022/04/21 11:08:20.389240 Op 0x0000000a connection.go:416] <- GetXattr (inode 4, name "security.capability", PID 24714, name security.capability)
debug_fs: 2022/04/21 11:08:20.389319 GetXattr(4, security.capability): no data available
fuse_debug: 2022/04/21 11:08:20.389347 Op 0x0000000a connection.go:500] -> Error: "no data available"

c. ======================================================================================
fuse_debug: 2022/04/21 11:08:20.389463 Op 0x0000000b connection.go:416] <- WriteFile (inode 4, PID 0, handle 0, offset 0, 6 bytes)
fuse_debug: 2022/04/21 11:08:20.389734 Op 0x0000000c connection.go:416] <- SetInodeAttributes (inode 4, PID 24714, mtime 2022-04-21 11:08:20.386577573 +0000 UTC)
debug_fs: 2022/04/21 11:08:20.415546 WriteFile(4, 0):
debug_fs: 2022/04/21 11:08:20.415588 SetInodeAttributes(4):
fuse_debug: 2022/04/21 11:08:20.415623 Op 0x0000000b connection.go:498] -> OK ()
fuse_debug: 2022/04/21 11:08:20.415624 Op 0x0000000c connection.go:498] -> OK ()

d. ======================================================================================
fuse_debug: 2022/04/21 11:08:20.416752 Op 0x0000000d connection.go:416] <- FlushFile (inode 4, PID 24714)
debug_fs: 2022/04/21 11:08:20.555382 FlushFile(4):
fuse_debug: 2022/04/21 11:08:20.555505 Op 0x0000000d connection.go:498] -> OK ()

fuse_debug: 2022/04/21 11:08:20.555838 Op 0x0000000e connection.go:416] <- ReleaseFileHandle (PID 0)
debug_fs: 2022/04/21 11:08:20.555897 ReleaseFileHandle(0):
fuse_debug: 2022/04/21 11:08:20.555924 Op 0x0000000e connection.go:498] -> OK ()
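For anyone who wants to reason about the sequence without a real bucket, here is a rough toy model in Python of the a-d steps traced above. It is only a sketch: ToyRetentionBucket, RetentionPolicyError and echo_redirect are made-up names, not gcsfuse or GCS APIs, and the model assumes that the write in step c is buffered locally and that the rejection surfaces when those bytes are uploaded. The log above does not confirm where gcsfuse actually reports the failure.

```python
import time


class RetentionPolicyError(Exception):
    """Raised when an operation would replace an object still under retention."""


class ToyRetentionBucket:
    """Hypothetical in-memory stand-in for a bucket with a retention policy."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._objects = {}  # object name -> (content, creation timestamp)

    def put(self, name, content):
        now = time.monotonic()
        existing = self._objects.get(name)
        # An existing object younger than the retention period cannot be replaced.
        if existing is not None and now - existing[1] < self.retention_seconds:
            raise RetentionPolicyError(f"{name!r} is still under retention")
        self._objects[name] = (content, now)

    def get(self, name):
        return self._objects[name][0]


def echo_redirect(bucket, name, data):
    """Replay steps a-d for `echo ... > name`; return a list of (step, outcome)."""
    trace = []

    local = b""  # a. create: the file starts out empty
    trace.append(("a. create", "ok"))

    bucket.put(name, local)  # b. flush: the empty object now exists in the bucket
    trace.append(("b. flush", "ok"))

    local += data  # c. write: buffered locally in this model (an assumption)
    trace.append(("c. write", "ok"))

    try:
        bucket.put(name, local)  # d. flush: must replace the object from step b
        trace.append(("d. flush", "ok"))
    except RetentionPolicyError as err:
        trace.append(("d. flush", f"rejected: {err}"))
    return trace


if __name__ == "__main__":
    bucket = ToyRetentionBucket(retention_seconds=3600)
    for step, outcome in echo_redirect(bucket, "ret_file.txt", b"Hello\n"):
        print(step, "->", outcome)
```

With a non-zero retention period the object created by the first flush can never be replaced with the written bytes, so the stored object stays empty. With retention_seconds=0 the same sequence completes and the object holds the data.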

As of now, there doesn't seem to be an easy way around this, but we can definitely revisit it at some point.

przsab commented 1 year ago

FYI: Azure Blob storage implemented this with AllowProtectedAppendWrites.