Do not update inode stats in real time - collect during dump

gluster / glusterfs

Gluster Filesystem : Build your distributed storage in minutes

https://www.gluster.org

GNU General Public License v2.0

4.51k stars 1.07k forks source link

Do not update inode stats in real time - collect during dump #4337

Open mykaul opened 4 weeks ago

mykaul commented 4 weeks ago

Just like other counters, I think it's the wrong approach. Count when you dump. All those 'inode->table->active_size++;' and 'inode->table->active_size--;' are not needed - as we take a lock when dumping anyway, let's just count the number of entries in the purge, lru and active lists.

Originally posted by @mykaul in https://github.com/gluster/glusterfs/issues/4335#issuecomment-2063100461

CC @mohit84

jkroonza commented 4 weeks ago

Plus unless it's locked it's wrong anyway as it could be raced unless it's atomic_t types (which does a memory bus lock at time of read-update-store), and those you don't increment and decrement using ++ and/or -- operators.

mykaul commented 4 weeks ago

Plus unless it's locked it's wrong anyway as it could be raced unless it's atomic_t types (which does a memory bus lock at time of read-update-store), and those you don't increment and decrement using ++ and/or -- operators.

But it done under lock. See https://github.com/gluster/glusterfs/blob/929d71b084e1e7f995d3a2884d16438108f6f621/libglusterfs/src/inode.c#L2431

jkroonza commented 4 weeks ago

I think we do need the LRU inode entries in table kept as a counter because we also have LRU limit, which this counter is checked against to determine if we need to invalidate some entries. So in my opinion rather sack the extra cycle or two on insert/remove rather than re-counting every time we need to know how many entries there are (if we have 2m table size that's 32MB of memory being traversed minimum, trashing page caches on the CPU as we go, and that's without traversing the individual chains).

mykaul commented 4 weeks ago

I was not suggesting to remove all counters. Only those that are relevant for dump time only.