weka / tools

GNU General Public License v3.0

frontend_hangingio.py #330

Closed shazad-weka closed 1 month ago

shazad-weka commented 2 months ago

The purpose of this script is to clear client frontend hanging IOs. The default age threshold for treating an IO as hanging is 24 hours; according to engineering, hanging IOs older than 24 hours are safe to clear. We should only filter for state=AcquiringFileLease, as these are the frontend hanging IOs.
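For reviewers, the selection rule described above can be sketched roughly as follows. This is only an illustration and not the code in this PR: the record layout (dicts with `state` and `start_time` keys) and the function name `select_clearable` are assumptions made up for the example.

```python
from datetime import datetime, timedelta, timezone

# State that marks a frontend hanging IO, per the description above.
FRONTEND_STATE = "AcquiringFileLease"
# Default age threshold from the description above.
DEFAULT_AGE = timedelta(hours=24)


def select_clearable(records, now, min_age=DEFAULT_AGE):
    """Return the records in the frontend state that are older than min_age.

    `records` is a hypothetical list of dicts with "state" and "start_time"
    (timezone-aware datetime) keys; the real script's data may look different.
    """
    return [
        r for r in records
        if r["state"] == FRONTEND_STATE and now - r["start_time"] > min_age
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [
        {"state": "AcquiringFileLease", "start_time": now - timedelta(hours=30)},
        {"state": "AcquiringFileLease", "start_time": now - timedelta(hours=2)},
    ]
    # Only the 30-hour-old record passes the default 24-hour threshold.
    print(len(select_clearable(sample, now)))
```

Passing a smaller `min_age` corresponds to lowering the threshold below the 24-hour default.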

Prior to 4.2.11, you could also get hanging IOs with operation=WEKAFS_READDIR or WEKAFS_LOOKUP. In these cases the inode id would be that of the directory containing the file and not of the file itself, so the drop-cache CLI wouldn't work on it. When both clients are on 4.2.11, only WEKAFS_SET_FILE_ACCESS should hang, and in that case the inode id is that of the file itself.
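The operation-to-inode relationship above can be written down as a small lookup. This is a sketch for discussion only: the helper name `inode_refers_to_file` is hypothetical, and it encodes nothing beyond the three operations named in this comment.

```python
# Operations whose inode id is that of the containing directory
# (seen prior to 4.2.11, per the comment above).
DIRECTORY_INODE_OPS = {"WEKAFS_READDIR", "WEKAFS_LOOKUP"}
# Operation whose inode id is that of the file itself.
FILE_INODE_OPS = {"WEKAFS_SET_FILE_ACCESS"}


def inode_refers_to_file(operation):
    """Return True if the inode id is the file's own, False if it is the
    parent directory's, and None for operations not covered above."""
    if operation in FILE_INODE_OPS:
        return True
    if operation in DIRECTORY_INODE_OPS:
        return False
    return None
```

A caller could use a `False` or `None` result to skip the drop-cache step for that entry instead of running it against a directory inode.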

To check and clear hanging IOs younger than the default, use the --age flag.

You can also set up a cron job to run the script automatically so that it clears such conditions. Example crontab entry:

*/30 * * * * /root/hangingio/hangingio.py --age <default 24hrs> --log-file

shazad-weka commented 1 month ago

@vrragosta made the necessary changes.