ThinkParQ / beegfs

Public repository for the BeeGFS Parallel File System
https://www.beegfs.io

Fix bug in closeFile handling that may cause a metaserver crash #5

Open hmings888 opened 1 month ago

hmings888 commented 1 month ago

In InodeFileStore.cpp:closeFile, the entryID is used to search the global inode map. When an entry is found, the iterator points to an "inode" that may be different from the "inode" passed to closeFile as its input parameter. In fact, the latter "inode" may be in the parent directory's inode map rather than the global inode map. This mistake makes the inode's reference count wrong in subsequent closeFile requests and eventually causes the metaserver to core dump. This PR therefore adds a condition to make sure the "inode" that was found is the same one that was passed in.
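
The kind of check described above can be sketched roughly as follows. This is only an illustration under assumptions: `Inode`, `InodeStore`, and this `closeFile` signature are hypothetical, simplified stand-ins, not the actual BeeGFS classes or the exact change in this PR. The point is that a lookup by entryID alone is not enough; the object that was found must also be compared with the object that was passed in.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for a referenced file inode.
struct Inode {
    std::string entryID;
    int refCount;
};

// Hypothetical store mapping entryID -> inode (e.g. global or per-directory).
struct InodeStore {
    std::map<std::string, Inode*> inodes;

    // Releases one reference, but only if this store holds exactly the
    // inode object passed in. Returns false if the entryID is absent or
    // maps to a different object (which then belongs to another store).
    bool closeFile(Inode* inode) {
        assert(inode != nullptr);
        auto iter = inodes.find(inode->entryID);
        if (iter == inodes.end() || iter->second != inode)
            return false; // same entryID is not enough; must be the same object

        if (--inode->refCount == 0)
            inodes.erase(iter); // last reference gone, drop it from the map
        return true;
    }
};
```

Without the `iter->second != inode` comparison, closing an inode held in a per-directory store would decrement the reference count of an unrelated object in the global store that merely shares the same entryID.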

iamjoemccormick commented 1 month ago

Hi @hmings888, thanks for the PR!

One initial question: have you observed any meta crashes as a result of this, or did you just notice that a crash is theoretically possible based on code analysis? If you have steps to reproduce the issue, it would make it easier to analyze and test the proposed fix.

hmings888 commented 1 month ago

This caused metaserver segfaults many times, but all of the generated core dump files have been deleted accidentally :( In fact, I don't completely know how to reproduce this problem with a simple test case, but it happens often when running a complicated application on BeeGFS. I dug into the code and applied the fix mentioned above, and it does work :)