yadayada / acd_cli

An unmaintained command line interface and FUSE filesystem for Amazon (Cloud) Drive

Massive memory usage with FUSE mount #551

Open Axadiw opened 7 years ago

Axadiw commented 7 years ago

Hi,

I have a huge problem with RAM usage with acd_cli.

I have an ACD account with a couple thousand 2-3 GB files that together consume a couple of terabytes of storage.

I connect to this account through acd_cli's FUSE mount with this command:

/usr/local/bin/acdcli -nl mount /mnt/acd

Then, one of my scripts constantly accesses and reads these files in a loop (it doesn't download whole files every time; it only reads certain parts of them).

The problem I am experiencing is acd_cli's excessive memory usage. Memory usage grows constantly, and in the end (when it consumes nearly all of the RAM and swap on the system) acd_cli crashes, so I'm no longer able to access the directory mounted by acd_cli.

When I force-unmount the drive and mount it again, acd_cli frees some of the memory, so the script is able to work again, but this solution isn't fine for me, because all other processes on the system suffer from the high RAM usage.

Do you have any clues what could be causing these issues? How can I profile acd_cli's memory usage when using FUSE?

I'm running acd_cli on

bgemmill commented 7 years ago

Read file chunks are kept in memory until the reading file handle is closed. If your reading patterns are sets of [open, read, don't close but move to next file] then you'd definitely run out of memory eventually.
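The behaviour described above can be illustrated with a toy model. This is a minimal sketch, not acd_cli's actual code: the `ChunkBuffer` class and its method names are hypothetical, and it only assumes that read chunks are stored per file handle and dropped when that handle is released.

```python
class ChunkBuffer:
    """Hypothetical per-handle read buffer (illustration only)."""

    def __init__(self):
        # file handle id -> {offset: bytes read at that offset}
        self._chunks = {}

    def store(self, fh, offset, data):
        """Remember a chunk that was read through file handle `fh`."""
        self._chunks.setdefault(fh, {})[offset] = data

    def held_bytes(self):
        """Total number of bytes currently kept in memory."""
        return sum(
            len(data)
            for per_handle in self._chunks.values()
            for data in per_handle.values()
        )

    def release(self, fh):
        """Drop everything buffered for `fh`, as a close would."""
        self._chunks.pop(fh, None)


buf = ChunkBuffer()
for fh in range(3):
    # open, read, don't close, move to the next file
    buf.store(fh, 0, b"x" * 1024)
print(buf.held_bytes())  # 3072

buf.release(0)           # closing one handle frees only its chunks
print(buf.held_bytes())  # 2048
```

In this model memory grows with the number of handles that stay open, which is why the access pattern matters more than the total amount of data read.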

If you're not mutating these files, I'd suggest finding a way to cache the portions you're interested in locally and then closing those file handles, or closing handles on files after a read if you don't expect to read again for some time.
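One way this suggestion might look in a reading script is sketched below. The helper name `read_range` and the cache layout are assumptions made for illustration; the sketch also assumes the files never change, as stated above, so cached slices are never invalidated.

```python
import hashlib
import os


def read_range(path, offset, length, cache_dir):
    """Return `length` bytes at `offset` of `path`, caching the slice locally.

    Hypothetical helper: the source file is opened only on a cache miss
    and is closed again as soon as the slice has been read.
    """
    os.makedirs(cache_dir, exist_ok=True)
    key = hashlib.sha1(
        "{}:{}:{}".format(path, offset, length).encode("utf-8")
    ).hexdigest()
    cached = os.path.join(cache_dir, key)

    if os.path.exists(cached):
        with open(cached, "rb") as f:
            return f.read()

    # The `with` block closes the handle on exit, even if the read fails.
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(length)

    with open(cached, "wb") as f:
        f.write(data)
    return data
```

A call such as `read_range("/mnt/acd/some_file", 0, 4096, "/tmp/acd_cache")` (both paths are placeholders) would touch the mount only the first time that slice is requested.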

Axadiw commented 7 years ago

The script I'm using (creepMiner, referenced above) closes file handles after reading.

I've even checked it with lsof (using lsof | grep <path_to_mounted_acd_files>), and it looks like no file handles are left open.

At the same time (according to htop) acd_cli is still consuming large amounts of RAM.

Can I somehow check what acdcli keeps in memory?
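Since acd_cli is written in Python, one generic way to see which allocations grow is the standard library's `tracemalloc` module. The sketch below is not an acd_cli feature: it assumes the snapshots can be taken from inside the process being inspected (which would mean adding such code to it), and the `retained` list is only a stand-in for whatever is actually being held.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the source lines whose allocations changed most between snapshots."""
    return after.compare_to(before, "lineno")[:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()

# Stand-in for data that stays referenced, e.g. buffered read chunks.
retained = [bytes(64 * 1024) for _ in range(32)]

after = tracemalloc.take_snapshot()
for stat in top_growth(before, after):
    # Each entry shows file, line number, size delta and allocation count delta.
    print(stat)
```

Comparing a snapshot taken shortly after mounting with one taken after memory has grown should point at the lines that allocated the retained objects.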

bgemmill commented 7 years ago

To clarify, when you see acdcli using a lot of memory, you can close creepminer and the usage remains?

I'd also be curious what happens if you creepmine locally for plotfiles and then rsync them to amazon to test the rest of your environment.

Also, is it possible that creepminer does mutate the files? If it's twiddling some small value in a header of a huge file, acdcli may end up reading the whole thing to write it back again on a close.

Axadiw commented 7 years ago

After closing creepminer, memory usage of acd_cli stays at the same level.

And it doesn't mutate plotfiles; it just reads them.

I'm currently generating plotfiles locally (using other software, designed for plotting: https://github.com/Mirkic7/mdcct), and I'm sending them using the acdcli upload command.

After that they are read by creepminer using the FUSE mount.

bgemmill commented 7 years ago

Can you have a go at PR #374 and see if the problem persists? I'd be especially interested if, after switching to that PR, you chown'd the files as root, and chmod'd them a+r and a-w before running creepminer as a non-root user to rule out plotfile writing.
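For reference, the permission change described above (a+r and a-w) corresponds to the following mode arithmetic. This is a minimal sketch with a hypothetical helper name; it leaves out the chown step, which needs root, and whether the change takes effect on the mount depends on the PR mentioned above.

```python
import os
import stat


def make_read_only(path):
    """Apply the equivalent of `chmod a+r,a-w` to `path` and return the new mode."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    mode |= stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH     # a+r
    mode &= ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)  # a-w
    os.chmod(path, mode)
    return mode
```

With every write bit cleared, a non-root process that tries to open a file for writing should be refused, which makes an accidental write visible instead of silent.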

Forgive me for being pedantic on the writing part; we had some Plex issues a while back with view count metadata changes on big movie files.

If you still see the problem, please post the acd_cli log here.

Axadiw commented 7 years ago

Would the --read-only option for mounting do the same trick as this chown and chmod combo?

bgemmill commented 7 years ago

While --read-only might work as intended, in the PR I mentioned we're handling attrs from within acdcli and know how those are handled.

FUSE mount arguments can be a little funny: https://bugs.launchpad.net/ubuntu/+source/fuse/+bug/239792