gluster / glusterd2

[DEPRECATED] Glusterd2 is the distributed management framework to be used for GlusterFS.
GNU General Public License v2.0

Quota Integration #421

Open aravindavk opened 6 years ago

aravindavk commented 6 years ago

Integrate Quota feature with Glusterd2

Existing CLI

$ gluster volume quota help

gluster quota commands

volume inode-quota <VOLNAME> enable - Enable/disable inode-quota for <VOLNAME>
volume quota <VOLNAME> {alert-time|soft-timeout|hard-timeout} {<time>} - Set quota timeout for <VOLNAME>
volume quota <VOLNAME> {enable|disable|list [<path> ...]| list-objects [<path> ...] | remove <path>|     remove-objects <path> | default-soft-limit <percent>} - Enable/disable and configure quota for <VOLNAME>
volume quota <VOLNAME> {limit-objects <path> <number> [<percent>]} - Set the maximum number of entries allowed in <path> for <VOLNAME>
volume quota <VOLNAME> {limit-usage <path> <size> [<percent>]} - Set maximum size for <path> for <VOLNAME>
volume quota help - display help for volume quota commands

Need to come up with the list of steps/transactions involved for each command.

sanoj-unnikrishnan commented 6 years ago

We do not need a separate inode-quota enable/disable command; the quota enable command also enables inode-quota.

There is no separate option for inode-quota in the quota translator (one exists in marker, though). Currently, 1) enabling quota also enables inode-quota, and 2) quota has to be enabled to use inode-quota. Also, there is no option to disable inode-quota alone, so the command serves no useful purpose.

Test logs (also validated against the source):

[root@dhcp35-100 glusterfs]# gluster v inode-quota v1 enable
quota command failed : Quota is disabled, please enable quota

[root@dhcp35-100 glusterfs]# gluster v quota v1 enable
volume quota : success

[root@dhcp35-100 glusterfs]# gluster v inode-quota v1 enable
quota command failed : Inode Quota is already enabled

[root@dhcp35-100 glusterfs]# gluster v inode-quota v1 disable
Invalid quota option : disable

I will describe the behaviour of the other commands in subsequent comments.

sanoj-unnikrishnan commented 6 years ago

Quota uses a quota.conf file to save the gfid of every directory on which a quota limit has been set/configured. This lets the quota list command find all directories with a limit set (without doing a crawl). A flat file is not very efficient for this purpose. We could explore other methods.

cli_quota list_all: This command lists all the directories that have a quota limit set. It reads each gfid from quota.conf and gets the path and limit for that gfid by doing a GF_AGGREGATOR_GETLIMIT RPC to quotad.
cli_cmd_quota_cbk -> cli_cmd_quota_handle_list_all -> read each gfid from quota.conf and do an RPC to quotad: cli_quotad_getlimit (GF_AGGREGATOR_GETLIMIT) -> cli_quotad_getlimit_cbk -> print the list

cli_quota_list: If the path is given, a temporary auxiliary mount is created in the op_stage_quota phase. Subsequently the CLI does a getxattr to obtain the configured limit and displays it accordingly. The aux mount is unmounted after the CLI does the listing.
cli_cmd_quota_cbk -> gf_cli_quota -> glusterd_op_quota -> glusterd_op_stage_quota -> ...glusterd_op_quota.... -> gf_cli_quota_cbk -> gf_cli_quota_list -> print_quota_list_from_mountdir -> getxattr...

quota limit-usage / limit-objects:
-> staging phase:
   create aux mount
   get the gfid for the path by reading the backend brick xattr
-> commit phase:
   fetch the existing soft limit percentage by reading the xattr
   update the hard and soft limits by doing setxattr on the auxiliary mount
   add the gfid entry to quota.conf if not previously present
   remove aux mount

quota remove / remove-objects:
-> staging phase:
   create aux mount
   get the gfid for the path by reading the backend brick xattr
-> commit phase:
   remove the limit xattr on the auxiliary mount
   remove the gfid entry from quota.conf if present
   remove aux mount
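The quota.conf bookkeeping in the commit phases above amounts to maintaining a set of GFIDs: adding is a no-op when the entry already exists, and removing is a no-op when it does not. A minimal sketch of that behaviour, assuming an in-memory set in place of the real file; LimitStore and its methods are hypothetical names, not glusterd code:

```go
package main

import (
	"fmt"
	"sort"
)

// LimitStore is a hypothetical stand-in for quota.conf: the set of GFIDs
// of directories that currently have a quota limit configured.
type LimitStore struct {
	gfids map[string]struct{}
}

// NewLimitStore returns an empty store.
func NewLimitStore() *LimitStore {
	return &LimitStore{gfids: make(map[string]struct{})}
}

// Add records a GFID after a limit is set. It reports whether the entry
// was newly added; false means it was already present and nothing changed.
func (s *LimitStore) Add(gfid string) bool {
	if _, ok := s.gfids[gfid]; ok {
		return false
	}
	s.gfids[gfid] = struct{}{}
	return true
}

// Remove drops a GFID after its limit is removed. It reports whether the
// entry existed; false means there was nothing to remove.
func (s *LimitStore) Remove(gfid string) bool {
	if _, ok := s.gfids[gfid]; !ok {
		return false
	}
	delete(s.gfids, gfid)
	return true
}

// List returns the stored GFIDs in sorted order, which is what a
// "list all" style command would iterate over.
func (s *LimitStore) List() []string {
	out := make([]string, 0, len(s.gfids))
	for g := range s.gfids {
		out = append(out, g)
	}
	sort.Strings(out)
	return out
}

func main() {
	s := NewLimitStore()
	s.Add("gfid-a")
	s.Add("gfid-a") // second add is a no-op
	s.Add("gfid-b")
	s.Remove("gfid-a")
	fmt.Println(s.List())
}
```

Because both operations are idempotent, replaying a commit step on a node that already applied it leaves the store unchanged.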

I will cover the remaining commands in the next comment.

aravindavk commented 6 years ago

Thanks for providing more details.

Quota uses a quota.conf file to save the gfid of every directory on which a quota limit has been set/configured. This lets the quota list command find all directories with a limit set (without doing a crawl). A flat file is not very efficient for this purpose. We could explore other methods.

Is this file stored on all nodes? If yes, how will it be updated on all nodes when an entry is added or removed?

cli_quota list_all:

cli_quota_list
quota limit-usage / limit-objects:
quota remove / remove-objects:

sanoj-unnikrishnan commented 6 years ago

Is this file stored on all nodes? If yes, how will it be updated on all nodes when an entry is added or removed?

Yes, the file is modified individually on each node by its local glusterd (during the commit phase). As discussed, we could keep this information in etcd instead, which would resolve a couple of issues with keeping quota.conf.

Does this require an aux mount? Does it need to run on all volume nodes or on one node?

"quota list all" does not require an aux mount; however, "quota list <path>" requires one. The difference is that when we list all, we need every directory with a limit, which we get from quota.conf. But to get the path from a gfid in that file we need to do an ancestry path lookup. So for "quota list all", we get the size and ancestry info by doing an RPC to quotad for each gfid. For "quota list <path>", the path is supplied on the command line, so doing an aux mount and fetching the size suffices.

If I understand right, this needs to be run on any one node (the node where the CLI was initiated). Is that correct? Does quota.conf need to be synced to all nodes?

Correct.

sanoj-unnikrishnan commented 6 years ago

quota enable

quota disable

Note: The crawler can take time proportional to the file system size, so the command must return once the crawler is started. One crawler is spawned for each brick in the volume.

Quota contri and size xattrs are versioned. The version is incremented on enable operations. This is because a disable crawl can take a long time (so a disable can still be cleaning up older-version xattrs while the volume is already at a newer quota xattr version).

alert-time|soft-timeout|hard-timeout: These are set via volume options after basic input validation.
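The versioning described above can be illustrated with a small sketch: each enable bumps a version number that is appended to the xattr key, and a cleanup crawl only touches keys whose version is older than the current one. The key name example.quota.size and all function names here are placeholders chosen for illustration; they are not claimed to match what the marker translator writes.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sizeKeyBase is a placeholder xattr name used only in this sketch.
const sizeKeyBase = "example.quota.size"

// versionedKey appends the quota xattr version to the base key.
func versionedKey(version int) string {
	return fmt.Sprintf("%s.%d", sizeKeyBase, version)
}

// nextVersion models what an enable operation does in this sketch:
// it moves the volume to a new xattr version.
func nextVersion(current int) int {
	return current + 1
}

// staleKeys returns the keys whose version suffix is older than current.
// Keys that do not follow the "<base>.<version>" form are ignored.
func staleKeys(keys []string, current int) []string {
	var stale []string
	prefix := sizeKeyBase + "."
	for _, k := range keys {
		if !strings.HasPrefix(k, prefix) {
			continue
		}
		v, err := strconv.Atoi(strings.TrimPrefix(k, prefix))
		if err != nil {
			continue
		}
		if v < current {
			stale = append(stale, k)
		}
	}
	return stale
}

func main() {
	version := 1
	oldKey := versionedKey(version)

	// Quota is disabled and enabled again: the version moves on while
	// the old key may still be present on disk.
	version = nextVersion(version)
	curKey := versionedKey(version)

	fmt.Println("cleanup candidates:", staleKeys([]string{oldKey, curKey}, version))
}
```

With this scheme a slow cleanup crawl from an earlier disable never removes data written after a later enable, since the two use different keys.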

aravindavk commented 6 years ago

Note: The crawler can take time proportional to the file system size, so the command must return once the crawler is started. One crawler is spawned for each brick in the volume.

The crawler can be started as a daemon. We can also introduce a crawler status API, since we can get the daemon state and the crawler can maintain a status file.
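One possible shape for such a status file, as a rough sketch only: the crawler periodically writes a small JSON document per brick, and a status API reads it back. The CrawlStatus fields, the state strings, and the file name are assumptions made for this example and are not taken from glusterd2.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// CrawlStatus is a hypothetical per-brick status record.
type CrawlStatus struct {
	Brick   string `json:"brick"`
	State   string `json:"state"`   // assumed values: "running", "completed", "failed"
	Scanned uint64 `json:"scanned"` // directories visited so far
}

// encodeStatus serializes a status record to JSON.
func encodeStatus(s CrawlStatus) ([]byte, error) {
	return json.Marshal(s)
}

// decodeStatus parses a status record from JSON.
func decodeStatus(data []byte) (CrawlStatus, error) {
	var s CrawlStatus
	err := json.Unmarshal(data, &s)
	return s, err
}

// writeStatus writes to a temporary file and renames it into place, so a
// reader does not observe a partially written status file.
func writeStatus(path string, s CrawlStatus) error {
	data, err := encodeStatus(s)
	if err != nil {
		return err
	}
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o600); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// readStatus is what a status API handler could call.
func readStatus(path string) (CrawlStatus, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return CrawlStatus{}, err
	}
	return decodeStatus(data)
}

func main() {
	dir, err := os.MkdirTemp("", "crawl-status")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "brick1.status")
	st := CrawlStatus{Brick: "brick1", State: "running", Scanned: 42}
	if err := writeStatus(path, st); err != nil {
		fmt.Println("error:", err)
		return
	}
	got, err := readStatus(path)
	fmt.Println(got, err)
}
```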

aravindavk commented 6 years ago

Based on all the inputs from Sanoj, the summary of the Gd2 and Quota integration is as below. Please add any steps/validations that are missing.

Quota Enable

POST /quota/:volname

Validations:

Transaction steps

Note: Steps 1 and 2 can be merged into a single transaction step. If step 3 can be executed first, then all the other steps can be combined into a single transaction step.

Quota Disable

DELETE /quota/:volname

Validations:

Transaction steps

Note: Steps 1 and 2 can be merged into a single transaction step. If step 3 can be executed first, then all the other steps can be combined into a single transaction step.

Quota List

GET /quota/:volname

Optional parameters to GET request

path=<PATH>

Transaction steps (If path(s) not specified)

Step 1: (Initiated node) Read all GFIDs from quota.conf, do a GF_AGGREGATOR_GETLIMIT RPC to quotad for each GFID, aggregate the results and return

Transaction steps (If path(s) specified)

Step 1: (Initiated node) Create a temporary aux mount, do getxattr on the given path and return
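The two Quota List variants differ only in where the limits come from, which can be sketched as a single function that branches on whether paths were supplied. The Backend interface below is hypothetical: LimitByGFID stands in for the GF_AGGREGATOR_GETLIMIT RPC to quotad and LimitByPath stands in for getxattr on the aux mount; neither is a real glusterd2 API.

```go
package main

import "fmt"

// Limit is a hypothetical result row for one directory.
type Limit struct {
	Path      string
	HardLimit uint64
}

// Backend abstracts the two lookup mechanisms described in the steps.
type Backend interface {
	ConfGFIDs() ([]string, error)           // GFIDs read from quota.conf
	LimitByGFID(gfid string) (Limit, error) // stand-in for the RPC to quotad
	LimitByPath(path string) (Limit, error) // stand-in for getxattr on the aux mount
}

// ListQuota returns limits for the given paths, or for every GFID in
// quota.conf when no path is given.
func ListQuota(b Backend, paths []string) ([]Limit, error) {
	var out []Limit
	if len(paths) == 0 {
		gfids, err := b.ConfGFIDs()
		if err != nil {
			return nil, err
		}
		for _, g := range gfids {
			l, err := b.LimitByGFID(g)
			if err != nil {
				return nil, err
			}
			out = append(out, l)
		}
		return out, nil
	}
	for _, p := range paths {
		l, err := b.LimitByPath(p)
		if err != nil {
			return nil, err
		}
		out = append(out, l)
	}
	return out, nil
}

// fakeBackend is an in-memory Backend used only to exercise the sketch.
type fakeBackend struct {
	conf   []string
	byGFID map[string]Limit
	byPath map[string]Limit
}

func (f fakeBackend) ConfGFIDs() ([]string, error) { return f.conf, nil }

func (f fakeBackend) LimitByGFID(g string) (Limit, error) {
	l, ok := f.byGFID[g]
	if !ok {
		return Limit{}, fmt.Errorf("no limit for gfid %s", g)
	}
	return l, nil
}

func (f fakeBackend) LimitByPath(p string) (Limit, error) {
	l, ok := f.byPath[p]
	if !ok {
		return Limit{}, fmt.Errorf("no limit for path %s", p)
	}
	return l, nil
}

func main() {
	d1 := Limit{Path: "/d1", HardLimit: 1 << 30}
	b := fakeBackend{
		conf:   []string{"gfid-1"},
		byGFID: map[string]Limit{"gfid-1": d1},
		byPath: map[string]Limit{"/d1": d1},
	}
	all, err := ListQuota(b, nil)
	fmt.Println(all, err)
	one, err := ListQuota(b, []string{"/d1"})
	fmt.Println(one, err)
}
```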

Limit Usage

POST /quota/:volname/paths

Transaction steps

Step 1: (Initiated node)

Step 2: (All Volume nodes) add the gfid entry in the quota.conf if not previously present

Note: If the GFID is stored in etcd, then Step 2 can be limited to the initiated node only

Limit Objects

POST /quota/:volname/objects

Step 1: (Initiated node)

Step 2: (All Volume nodes) add the gfid entry in the quota.conf if not previously present

Note: If the GFID is stored in etcd, then Step 2 can be limited to the initiated node only

Remove Quota

DELETE /quota/:volname/paths
DELETE /quota/:volname/objects

Transaction steps

Step 1: (Initiated node)

Step 2: (All volume nodes) remove the gfid entry from quota.conf if present

Options Set

POST /quota/:volname/options

Transaction steps

Wrapper around Volume Set

Options Get

GET /quota/:volname/options

Transaction steps

Wrapper around Volume Get

Options Reset

DELETE /quota/:volname/options

Transaction steps

Wrapper around Volume Reset
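For reference, the endpoints proposed in this summary can be collected into one route table. The sketch below uses only the Go standard library and a hand-written matcher for the :volname segment; the route names are labels derived from the headings above, and none of this is glusterd2's actual routing code.

```go
package main

import (
	"fmt"
	"strings"
)

// Route pairs an HTTP method and URL pattern with a descriptive name.
type Route struct {
	Name    string
	Method  string
	Pattern string
}

// quotaRoutes lists the endpoints proposed in the summary.
var quotaRoutes = []Route{
	{"QuotaEnable", "POST", "/quota/:volname"},
	{"QuotaDisable", "DELETE", "/quota/:volname"},
	{"QuotaList", "GET", "/quota/:volname"},
	{"LimitUsage", "POST", "/quota/:volname/paths"},
	{"LimitObjects", "POST", "/quota/:volname/objects"},
	{"RemoveUsage", "DELETE", "/quota/:volname/paths"},
	{"RemoveObjects", "DELETE", "/quota/:volname/objects"},
	{"OptionsSet", "POST", "/quota/:volname/options"},
	{"OptionsGet", "GET", "/quota/:volname/options"},
	{"OptionsReset", "DELETE", "/quota/:volname/options"},
}

// match finds the route for a method and path and extracts ":name"
// segments into params. Empty path segments never match.
func match(method, path string) (name string, params map[string]string, ok bool) {
	segs := strings.Split(strings.Trim(path, "/"), "/")
	for _, r := range quotaRoutes {
		if r.Method != method {
			continue
		}
		pat := strings.Split(strings.Trim(r.Pattern, "/"), "/")
		if len(pat) != len(segs) {
			continue
		}
		p := make(map[string]string)
		matched := true
		for i := range pat {
			switch {
			case segs[i] == "":
				matched = false
			case strings.HasPrefix(pat[i], ":"):
				p[pat[i][1:]] = segs[i]
			case pat[i] != segs[i]:
				matched = false
			}
			if !matched {
				break
			}
		}
		if matched {
			return r.Name, p, true
		}
	}
	return "", nil, false
}

func main() {
	name, params, ok := match("POST", "/quota/vol1/paths")
	fmt.Println(name, params, ok)
}
```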

atinmu commented 6 years ago

Checklist

atinmu commented 6 years ago

@sanoj-unnikrishnan Looks like we haven't taken the inode-quota into the consideration.

harigowtham commented 6 years ago

@atinmu inode quota and quota go side by side; if we turn one on, the other gets turned on.

harigowtham commented 6 years ago

When quota is enabled for a second volume, the quota process started for the first volume has to be stopped, and then the quota process is started again for both volumes. This part of the patch still has to be worked on. It will be done while working on disabling quota.

aravindavk commented 6 years ago

This is the issue that mentions the race during quotad restarts: https://review.gluster.org/#/c/19398/

I think in Gd2 we can take a lock on "quotad" instead of a lock on the volume name for the transaction, to avoid this race.

// Existing
lock, unlock, err := transaction.CreateLockSteps(volName)

// Proposed: all quota commands hold a lock on "quotad", so parallel enable/disable is avoided.
lock, unlock, err := transaction.CreateLockSteps("quotad")

But this introduces a new issue: there may be no guarantee against a volume state change during a quota operation.

aravindavk commented 6 years ago

There is a similar issue during brick operations (https://github.com/gluster/glusterd2/issues/314). A lock held on the volume name gives no protection against a race on the brick resource.

As with quota, this also applies to all cluster daemons such as bitrot and glustershd.

atinmu commented 6 years ago

On Fri, 16 Feb 2018 at 10:49, Aravinda VK notifications@github.com wrote:

This is the issue that mentions the race during quotad restarts: https://review.gluster.org/#/c/19398/

I think in Gd2 we can take a lock on "quotad" instead of a lock on the volume name for the transaction, to avoid this race.

I think we’d need both, as not having a lock on volname would mean two transactions can start operating on the same volume, which we wouldn’t want?

// Existing

lock, unlock, err := transaction.CreateLockSteps(volName)

// Proposed: all quota commands hold a lock on "quotad", so parallel enable/disable is avoided.
lock, unlock, err := transaction.CreateLockSteps("quotad")

But this introduces a new issue: there may be no guarantee against a volume state change during a quota operation.

aravindavk commented 6 years ago

I think we’d need both, as not having a lock on volname would mean two transactions can start operating on the same volume, which we wouldn’t want?

Not two quota operations, but a quota enable and a volume stop could run at the same time.
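One way to hold both locks, sketched under stated assumptions: a quota transaction takes the volume-name lock and the "quotad" lock together, always in a fixed (sorted) order so that two transactions requesting overlapping names cannot deadlock. Locker, LockAll and lockOrder are hypothetical helpers built on sync.Mutex; they do not model transaction.CreateLockSteps or glusterd2's cluster-wide locking.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Locker hands out one mutex per resource name (process-local only).
type Locker struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

// NewLocker returns an empty Locker.
func NewLocker() *Locker {
	return &Locker{locks: make(map[string]*sync.Mutex)}
}

// get returns the mutex for a name, creating it on first use.
func (l *Locker) get(name string) *sync.Mutex {
	l.mu.Lock()
	defer l.mu.Unlock()
	m, ok := l.locks[name]
	if !ok {
		m = &sync.Mutex{}
		l.locks[name] = m
	}
	return m
}

// lockOrder returns the names sorted and de-duplicated; acquiring in this
// order gives every caller the same global lock ordering.
func lockOrder(names ...string) []string {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)
	out := make([]string, 0, len(sorted))
	for i, n := range sorted {
		if i == 0 || n != sorted[i-1] {
			out = append(out, n)
		}
	}
	return out
}

// LockAll acquires every named lock and returns a function that releases
// them in reverse order.
func (l *Locker) LockAll(names ...string) (unlock func()) {
	order := lockOrder(names...)
	held := make([]*sync.Mutex, 0, len(order))
	for _, n := range order {
		m := l.get(n)
		m.Lock()
		held = append(held, m)
	}
	return func() {
		for i := len(held) - 1; i >= 0; i-- {
			held[i].Unlock()
		}
	}
}

func main() {
	l := NewLocker()
	var wg sync.WaitGroup
	counter := 0 // protected by the "vol1" lock in both goroutines below

	for i := 0; i < 100; i++ {
		wg.Add(2)
		go func() { // stands in for "quota enable" on vol1
			defer wg.Done()
			unlock := l.LockAll("vol1", "quotad")
			counter++
			unlock()
		}()
		go func() { // stands in for "volume stop" on vol1
			defer wg.Done()
			unlock := l.LockAll("vol1")
			counter++
			unlock()
		}()
	}
	wg.Wait()
	fmt.Println(counter)
}
```

In this sketch the quota path and the volume path both serialize on "vol1", while quota operations on different volumes additionally serialize on "quotad", which is the combination discussed here.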

atinmu commented 6 years ago

On Fri, 16 Feb 2018 at 11:58, Aravinda VK notifications@github.com wrote:

I think we’d need both, as not having a lock on volname would mean two transactions can start operating on the same volume, which we wouldn’t want?

Not two quota operations, but a quota enable and a volume stop could run at the same time.

Correct. I thought you meant this for volume commands as well. So we are both thinking along the same lines :-)

harigowtham commented 6 years ago

List of things to be done for quota

If you find that I have missed anything, please add it to the list. The patches for quota disable, enable/disable on different volumes, and quota limit set are under review.

atinmu commented 6 years ago

@harigowtham Please provide details on which parts of the quota APIs we are targeting to complete by GCS-Sprint1.

harigowtham commented 6 years ago

https://github.com/gluster/glusterd2/issues/966
https://github.com/gluster/glusterd2/issues/965
https://github.com/gluster/glusterd2/issues/964
https://github.com/gluster/glusterd2/issues/963

are the broken-down quota integration issues.

atinmu commented 6 years ago

@harigowtham I believe this issue needs to be moved back to GCS Sprint2 given you have targeted https://github.com/gluster/glusterd2/issues/963, please confirm.

harigowtham commented 6 years ago

@atinmu Yes. This is an issue to track the overall quota integration. I have filed smaller issues that track each quota issue known so far, and will mark those that are feasible for Sprint 2.