biggles007 commented 6 days ago

Blobfuse version: 2.3.2 Operating System: RHEL 8.6

The config file is as per below

allow-other: true

# Logger configuration
logging:
  type: syslog #|silent|base <type of logger to be used by the system. silent = no logger, base = file based logger. Default - syslog>
  level: log_warning # log_off|log_crit|log_err|log_warning|log_info|log_trace|log_debug <log level. Default - log_warning> <log_debug will also enable sdk-trace logs>

# Pipeline configuration. Choose components to be engaged. The order below is the priority order that needs to be followed.
components:
  - libfuse
  - block_cache
  - attr_cache
  - azstorage

# Block cache related configuration
block_cache:
  mem-size-mb: 5120
  path: /blobstorage/cache/archive2 # <path to local disk cache where downloaded blocked will be stored>

# Azure storage configuration
azstorage:
  type: adls
  account-name: ${account_name}
  container: ${container}
  endpoint: https://${account_name}.blob.core.windows.net
  mode: msi
  appid: ${msi_client_id}

Server build/app

The server runs a third party Digital Asset Management system that writes to a shared file location, this has been mounted via Blobfuse2. Assets/images are being archived from a live production system (that isn't cloud based) to the archive server in Azure.

Issue

The cache seems to becoming full with the data being copied in. With blobfuse not seemingly copying data into the Storage Account/Data Lake Storage at a reasonable rate. The blobfuse2 process isn't using any CPU, I can see a process of /usr/bin/du -sh /blobstorage/cache/archive2 that appears every now and then with high CPU usage, seemingly calculating the space used.

Is there a way of monitoring Blobfuse and the rate at which it is copying files into the Storage Account?

vibhansa-msft commented 5 days ago

You can disable the local disk caching and validate it helps improving the performance or not. To try this just comment out the "path" field in block-cache config. Also, if you are using very small files may be reducing the block-size to 8mb may also help. What kind of write pattern you have here, as in large number of small files or few very large files ?

biggles007 commented 5 days ago

It will be large number of small files. A hi-resolution image will be uploaded, the system then processes it to a lower resolution preview and a thumbnail. So one file is multiplied out to three. The processing is done async, as the intial written images are then added to a process queue.

I tweaked the block size to 8MB, changed to Premium SSD and also reduced down the number of post processing threads. However, I think I've hit the original issue which I seem to understand better now. Some of the processes are now in an uninterruptable state (dreaded Linux D state), seemingly caused by the following:

blobfuse2[2564]: [/blobstorage/archive2] LOG_ERR [block_cache.go (969)]: BlockCache::download : error creating directory structure for file /blobstorage/cache/archive2/assetFilestore/599/0065/0475/7yspZY2B4118cP4u01club_1.jpg::0 [mkdir /blobstorage/cache/archive2/assetFilestore/599/0065: no space left on device]

Investigations have shown the issue the process are stuck is definitely related to Blobfuse:

task:exiftool        state:D stack:    0 pid:82266 ppid: 45999 flags:0x00004004
Call Trace:
 __schedule+0x2d1/0x860
 ? writeback_single_inode+0x54/0x120
 schedule+0x55/0xf0
 fuse_set_nowrite+0xa4/0xe0 [fuse]
 ? finish_wait+0x80/0x80
 fuse_flush+0xc2/0x1c0 [fuse]
 filp_close+0x31/0x70
 __x64_sys_close+0x1e/0x50
 do_syscall_64+0x5b/0x1b0
 entry_SYSCALL_64_after_hwframe+0x61/0xc6
RIP: 0033:0x7fd302163b15
Code: Unable to access opcode bytes at RIP 0x7fd302163aeb.
RSP: 002b:00007ffea5e48a68 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fd302163b15
RDX: 00007fd302780b80 RSI: 0000000000000000 RDI: 0000000000000005
RBP: 00005629a0c622a0 R08: 0000000000000000 R09: 00005629a0e59e88
R10: 00005629a0dceb20 R11: 0000000000000246 R12: 00005629a35fb1f0
R13: 00005629a0c622a0 R14: 0000000000000000 R15: 0000000000000000

Of which there are many different entries in the logs. So, it's been writing faster, but seems that the cache has been filled up, though I'm not sure why items aren't being purged out the cache? I had tried to reduce the disk-timeout-sec from 120 seconds to 60 but I couldn't remount, it gave a nondescript error, but reverting the config back to 120 fixed the issue.

I could disable the caching, but can I be confident that all current files in the cache location have been uploaded to the Storage Account?

syeleti-msft commented 4 days ago

Hi @biggles007, you can disable the disk caching by commenting the disk path and try to mount the blobfuse again. ""I could disable the caching, but can I be confident that all current files in the cache location have been uploaded to the Storage Account?"" Block cache doesn't require the disk path to work. It is an additional parameter used for keeping the blocks for the longer time. I am currently investigating on why disk path is not working meanwhile you can try the above workaround.

syeleti-msft commented 4 days ago

If possible, please share the logs of blobfuse2 by enabling log_debug level. You can enable it using the below configuration.

logging:
  type: base
  level: log_debug
  file-path: </path/to/save/the/logfile/at>

biggles007 commented 4 days ago

I increased the storage allocated to the cache partition from 10GB to 40GB, but had to hard reboot the server in order to get the processes working. As they'd got stuck in the D state, nothing seemingly could fix that. The logs overnight showed it used almost 80% of the cache, so it was probably undersized for the amount of throughput that was being targeted to it.

Nov 22 05:44:53 XXX blobfuse2[2625]: [/blobstorage/archive2] LOG_INFO [block_cache.go (1590)]: BlockCache::checkDiskUsage : current disk usage : 20480.000000MB 77%
Nov 22 05:44:53 XXX blobfuse2[2625]: [/blobstorage/archive2] LOG_INFO [block_cache.go (1591)]: BlockCache::checkDiskUsage : current cache usage : 0%

Strangely it never seems to go above 0% on the current cache usage, but the current disk usage is variable.

When does Blobfuse2 determine the amount of free space to use? Is this only at start-up or is it constantly looking? Just if we expand the volume the mount point is on if it were to fill up, would it pick that up immediately?

Also wondering whether or not if a cache space issue is resolved, should those processes in the D state become active again?

I won't be able to change the logging level right now, just need to keep the server stable to allow the business to meet some requirements. Hopefully later next week we may be able to look at getting some more logs.

vibhansa-msft commented 4 days ago

Logs that you see in your last message are showing two different states:

Related to disk usage where you have configured temp path in block-cache. This is rouhgly 20GB in your case.
Amount of in memory cache being used. If its significantly low then % might show up as 0.

You can comment out the temp-path from block-cache config if you do not need disk persistence, that shall save all of the disk space. Disk persistence is required only if your application tends to read from same file again and again and this disk caching can save you from redownload of the same file.

biggles007 commented 4 days ago

Technically, if we have a balanced configuration we could benefit from short-term caching, as the file should be in the cache for post processing. Removing the temp-path is an option, but while things are working I need to keep them stable. We can test more once the main data load has been completed and there is less pressure on the project.

This is the first time using Blobfuse2, so learning a lot. I feel that there could be some more documentation on the various options, as right now there isn't anything (unless you can point me to it) that explains each option, saying what it's for and the values that can be set. Just seems to be some sample config files with basic comments.

Having some more details around different scenarios would be helpful, e.g. if no caching required, what settings you need. As all the sample configs have caching enabled, it seems to be a "you should cache" unless you find the docs that state to turn off caching if writing from multiple locations. If that makes sense?

vibhansa-msft commented 1 day ago

There is baseconfig.yaml file shared in our "setup" directory that explains most of the config options that we allow. Also, our README has scenarios mentioned which can help you choose what option or caching model you shall use. I do understand that there is lot of scope in improving our documentation and receiving feedback from customers is important in this regard. To answer question on your workflow, you shall use disk persistence only if your application wishes to read the same file again and again otherwise you are just dumping the data on local disk which is never going to be read again and Blobfuse is also using your CPU and disk for such operations. If scenario is purely read/write once, then better to skip this config. If you can add more details on what exactly your workflow is, I can help you setting up the correct config.

Azure / azure-storage-fuse

Blobfuse not writing to HNS enabled Storage Account fast enough #1573

Server build/app

Issue