Azure / azure-storage-fuse

A virtual file system adapter for Azure Blob storage

Files upload to SA as size zero if file cache is enabled #1505

Status: Closed (sshipway closed this issue 1 week ago)

sshipway commented 3 weeks ago

Which version of blobfuse was used?

blobfuse2-2.3.0-1.x86_64 (RPM from Microsoft repository)

Which OS distribution and version are you using?

AlmaLinux 8, kernel 4.18.0-513.5.1.el8_9.x86_64

If relevant, please share your mount command.

/usr/bin/blobfuse2 mount /mnt/sjs --config-file=/etc/sysconfig/blobfuse.sjs.yml

What was the issue encountered?

If the filecache is enabled, then any files written to the filesystem are uploaded to the storage account as zero size.

Have you found a mitigation/solution?

Disabling the filecache completely (removing the file_cache component from the configuration file) makes uploads work, but this is not a good solution in cases where we do not want to wait for a streaming upload to complete.

Please share logs if available.

No errors or warnings appear in the logs while this is happening.

Cache settings:

file_cache:
  path: /var/spool/blobfuse
  timeout-sec: 120
  max-size-mb: 1024

vibhansa-msft commented 3 weeks ago

When file-cache is enabled, can you confirm that the file is created in local storage with the correct size and contents? Reading through your description, I suspect the application is not closing the file handle here. File-cache uploads the file only when the application closes the file handle.

sshipway commented 3 weeks ago

The file is created fine; I tested using just cp /etc/motd /mnt/test to copy a file into the mount, so the file handle is definitely closed. I can read the file as expected from the local mount until I unmount and re-mount, at which point it is revealed to be zero size. I can also check in the storage account, and it is zero size there too. I checked the various reports online and saw that LVM can be a problem, so I moved the cache dir to tmpfs, but no luck.
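
For anyone trying to reproduce this, the size comparison described above could be scripted roughly as follows. This is only a sketch: the helper name same_size is hypothetical, and it simply compares the byte sizes reported by the operating system for the source file and the copy on the mount.

```python
import os


def same_size(source_path, mounted_path):
    """Return True when both files report the same size in bytes.

    Hypothetical helper for this thread: it only compares the sizes
    reported by the OS and does not compare file contents.
    """
    return os.path.getsize(source_path) == os.path.getsize(mounted_path)
```

Since the zero size only becomes visible after a remount, a call such as same_size("/etc/motd", "/mnt/test/motd") would need to run after unmounting and mounting the filesystem again; the destination path here is an assumption based on the cp command above.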

vibhansa-msft commented 3 weeks ago

Kindly enable log_debug and share the log files along with the file name where you observe the issue.

vibhansa-msft commented 2 weeks ago

Were you able to collect the required logs?

sshipway commented 2 weeks ago

I've not been able to work on it recently, but I think the issue is when I have both filecache and stream enabled at the same time. When I have just one and not the other it works.

sshipway commented 1 week ago

The test file was sizetest4 and the filesystem mounted on /mnt/sjs

fuse.log.gz

@vibhansa-msft Done. I strongly suspect the issue is that you cannot have file_cache and stream configured at the same time.

vibhansa-msft commented 1 week ago

Yes, only one caching model can be used at a time: it has to be either file-cache, block-cache, or stream. If you configure more than one, the behavior is undefined and may result in various errors or corruption issues.

vibhansa-msft commented 1 week ago

Is that the root cause here? If so, you can correct the config and revalidate your test case.

sshipway commented 1 week ago

Since the system works correctly with streaming alone, or caching alone, then this will be the root cause. Maybe it should be made clearer in the documentation that the two cannot coexist, or (better) the system should refuse to start if both are enabled at the same time?
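
The "refuse to start" check suggested above could look roughly like the sketch below. It is not blobfuse2 code: the function names are hypothetical, and the component names (file_cache, block_cache, stream) are assumptions taken from the names used in this thread. It takes the list of enabled components and rejects any list that contains more than one caching model.

```python
# Assumed names of the mutually exclusive caching components,
# taken from this thread rather than from a verified list.
CACHING_COMPONENTS = {"file_cache", "block_cache", "stream"}


def conflicting_caches(components):
    """Return the enabled caching components if more than one is present.

    Returns an empty list when zero or one caching component is enabled.
    """
    enabled = [name for name in components if name in CACHING_COMPONENTS]
    return enabled if len(enabled) > 1 else []


def validate_components(components):
    """Raise ValueError when several caching components are enabled together."""
    conflicts = conflicting_caches(components)
    if conflicts:
        raise ValueError(
            "only one caching component may be enabled, found: "
            + ", ".join(conflicts)
        )
```

With a check like this, a component list containing both file_cache and stream would fail at startup with a clear message instead of silently uploading zero-size files.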