gluster / gluster-block

A framework for gluster block storage
GNU General Public License v2.0
74 stars 32 forks source link

blksize: add hw block size setting support #238

Closed lxbsz closed 5 years ago

lxbsz commented 5 years ago

What does this PR achieve? Why do we need it?

Add hw block size setting support and this still will depend on the taregetcli's saveconfig support.

Does this PR fix issues?

BZ#1710833

pkalever commented 5 years ago

@lxbsz You turned to be quicker than me, I was also working on this patch and I was half way last night :-) I will review this now. Thanks!

lxbsz commented 5 years ago

@lxbsz You turned to be quicker than me, I was also working on this patch and I was half way last night :-) I will review this now. Thanks!

Ah okay, I didn't know you were also working on this :-)

This is the rtslib's fix: https://github.com/open-iscsi/rtslib-fb/pull/150

Thanks.

lxbsz commented 5 years ago

@pkalever Done, please review. Thanks.

lxbsz commented 5 years ago

@pkalever Done with switching create3 to create2->xdata, please review.

Case 1: if any of the remote nodes doesn't have GB_CREATE_BLOCK_SIZE_CAP, and cblk->blk_size > 0 then throw error and fail the create operation that cluster doesn't support and IPXYZ needs update

Case 2: if few remote nodes doesn't have GB_CREATE_BLOCK_SIZE_CAP, and cblk->blk_size == 0 then always send volServer string in the xdata

Case 3: if all remote nodes have GB_CREATE_BLOCK_SIZE_CAP, and cblk->blk_size > 0 then always send gb_xdata in the xdata, but if cblk->blk_size == 0, then always send volServer string in the xdata.

Case 4: If local nodes doesn't have GB_CREATE_BLOCK_SIZE_CAP, the blk_size is not support locally, so it will send volServer and remote will fallback and decode it as old create2

Thanks.

lxbsz commented 5 years ago

@pkalever Updated it, please review. Thanks.

pkalever commented 5 years ago

@lxbsz Hey!

This PR is introducing a regression:

[root@localhost ~]# gluster-block create sample/block1 ha 3 192.168.124.59,192.168.124.78,192.168.124.44 1MiB         
failed to configure on 192.168.124.59 configure failed     
failed to configure on 192.168.124.78 configure failed     
failed to configure on 192.168.124.44 configure failed     
RESULT:FAIL                  
[root@localhost ~]# 

[root@localhost ~]# gluster-block create sample/block1 ha 3 block_size 512 192.168.124.59,192.168.124.78,192.168.124.44 1MiB                       
failed to configure on 192.168.124.59 configure failed     
failed to configure on 192.168.124.78 configure failed     
failed to configure on 192.168.124.44 configure failed     
RESULT:FAIL 

I tried debugging the issue:

[2019-06-13 08:06:00.560954] DEBUG: command, targetcli <<EOF
/backstores/user:glfs create name=block1 size=1048576 cfgstring=sample@t/block-store/4d32b3f2-d500-4ffc-a639-0fff68c88bc5 wwn=4d32b3f2-d500-4ffc-a639-0fff68c88bc5
/backstores/user:glfs/block1 set attribute cmd_time_out=130
/backstores/user:glfs/block1/alua create name=glfs_tg_pt_gp_ao tag=1
/backstores/user:glfs/block1/alua create name=glfs_tg_pt_gp_ano tag=2
/backstores/user:glfs/block1/alua/glfs_tg_pt_gp_ao set alua alua_access_type=1
/backstores/user:glfs/block1/alua/glfs_tg_pt_gp_ano set alua alua_access_type=1
/backstores/user:glfs/block1/alua/glfs_tg_pt_gp_ao set alua alua_access_state=0
/backstores/user:glfs/block1/alua/glfs_tg_pt_gp_ano set alua alua_access_state=1
/iscsi create iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5
/iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5 create tpg2
 /iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5 create tpg3
 /iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5/tpg1/luns create /backstores/user:glfs/block1
/iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5/tpg1/luns/lun0 set alua alua_tg_pt_gp_name=glfs_tg_pt_gp_ano
/iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5/tpg1/portals create 192.168.124.59
/iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5/tpg1 enable
/iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5/tpg1 set attribute  generate_node_acls=1 demo_mode_write_protect=0
/iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5/tpg2/luns create /backstores/user:glfs/block1
/iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5/tpg2/luns/lun0 set alua alua_tg_pt_gp_name=glfs_tg_pt_gp_ao
/iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5/tpg2/portals create 192.168.124.78
/iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5/tpg2 set attribute tpg_enabled_sendtargets=0  generate_node_acls=1 demo_mode_write_protect=0

/iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5/tpg3/luns create /backstores/user:glfs/block1
/iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5/tpg3/luns/lun0 set alua alua_tg_pt_gp_name=glfs_tg_pt_gp_ano
/iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5/tpg3/portals create 192.168.124.44
/iscsi/iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5/tpg3 set attribute tpg_enabled_sendtargets=0  generate_node_acls=1 demo_mode_write_protect=0

/backstores/user:glfs/block1 saveconfig
EOF [at block_svc_routines.c+4537 :<block_create_common>]
[2019-06-13 08:06:02.175056] ERROR: backend creation failed for: sample/block1 [at block_svc_routines.c+4117 :<blockValidateCommandOutput>]
[2019-06-13 08:06:02.175115] DEBUG: raw output, targetcli shell version 2.1.fb49
Copyright 2011-2013 by Datera, Inc and others.
For help on commands, type 'help'.

/> /> /> /> /> /> /> /> /> Created target iqn.2016-12.org.gluster-block:4d32b3f2-d500-4ffc-a639-0fff68c88bc5.
Created TPG 1.
/> Created TPG 2.
/> Created TPG 3.

Note config string is getting weird data, like '@t' in cfgstring=sample@t/block-store/4d32b3f2-d500-4ffc-a639-0fff68c88bc5

Which is why the storageObject creation is failing. This must be coming from volServer check in the server side command formatting ?

I have rtslib with your patch, and withouth PR#238 everything works as expected.

Please fix this!

lxbsz commented 5 years ago

@pkalever Fixed it. Thanks.

pkalever commented 5 years ago

@lxbsz While now the create command succeeds, but your blk_size is not getting reflecting:

[root@localhost ~]# gluster-block create sample/block3 ha 3 block_size 4096 192.168.124.59,192.168.124.78,192.168.124.
44 1MiB                                                                                                               
IQN: iqn.2016-12.org.gluster-block:b0b33a92-b3a2-4c2f-ae7d-846cb55f92da                                               
PORTAL(S):  192.168.124.59:3260 192.168.124.78:3260 192.168.124.44:3260                                               
RESULT: SUCCESS

[root@localhost ~]# cat /mnt/block-meta/block3             
VOLUME: sample               
GBID: b0b33a92-b3a2-4c2f-ae7d-846cb55f92da                 
HA: 3                        
ENTRYCREATE: INPROGRESS      
PRIOPATH: 192.168.124.59     
SIZE: 1048576                
RINGBUFFER: 0                
BLKSIZE: 0                   
ENTRYCREATE: SUCCESS         
192.168.124.59: CONFIGINPROGRESS                           
192.168.124.44: CONFIGINPROGRESS                           
192.168.124.78: CONFIGINPROGRESS                           
192.168.124.44: CONFIGSUCCESS                              
192.168.124.59: CONFIGSUCCESS                              
192.168.124.78: CONFIGSUCCESS                              

# cat /etc/target/saveconfig.json 
      "config": "glfs/sample@localhost/block-store/b0b33a92-b3a2-4c2f-ae7d-846cb55f92da",
      "control": "max_data_area_mb=1024,hw_block_size=512",
      "hw_max_sectors": 128,
      "name": "block3",
      "plugin": "user",
      "size": 1048576,
      "wwn": "b0b33a92-b3a2-4c2f-ae7d-846cb55f92da"
    }

You should fix this.

pkalever commented 5 years ago

@lxbsz May be I have used block_size instead of block-size ? Let me recheck.

pkalever commented 5 years ago

@lxbsz May be I have used block_size instead of block-size ? Let me recheck.

Yep, it works with block-size option for me. Let me test it with your other defense patch too.

lxbsz commented 5 years ago

@lxbsz May be I have used block_size instead of block-size ? Let me recheck.

Yep, it works with block-size option for me. Let me test it with your other defense patch too.

Yeah, we should defense the unknown option in create.

Thanks.

pkalever commented 5 years ago

Tested all the cases. Works as expected.

@lxbsz,

  1. Can we correct the grammar here: Version check failed between block servers. (the capability 'create_block_size' isn't support on host 192.168.124.44 yet, please upgrade.) to Version check failed between block servers. (the capability 'create_block_size' is not supported on host 192.168.124.44 yet, please upgrade.)

  2. What I don't like is, if we don't have the block-size capability in the local node the command still succeeds :-( which is later fixed by your other deference for non existing options patch. But I don't think we have a way to get the desired behavior, I mean fail when block-size option is not supported locally.

Ack from me for this PR. Reviewed-by: Prasanna Kumar Kalever \prasanna.kalever@redhat.com\

lxbsz commented 5 years ago

Tested all the cases. Works as expected.

Cool :-)

@lxbsz,

  1. Can we correct the grammar here: Version check failed between block servers. (the capability 'create_block_size' isn't support on host 192.168.124.44 yet, please upgrade.) to Version check failed between block servers. (the capability 'create_block_size' is not supported on host 192.168.124.44 yet, please upgrade.)

Yeah, fixed.

  1. What I don't like is, if we don't have the block-size capability in the local node the command still succeeds :-( which is later fixed by your other deference for non existing options patch. But I don't think we have a way to get the desired behavior, I mean fail when block-size option is not supported locally.

I may not get this part.

If we want to set the 'block-size' for the target when creating, we must make sure that all the HA nodes have the block-size capability, so:

a), if the local node doesn't support it, it will fail directly with unknown prompt messages. b), if local node supports it, but one of the remote nodes doesn't, it will fail with the capability errors.

But without the defense fixing for the old versions, the create will just skip the unknown options, this should be one problem if user just have a typo mistake of some options, it will just skip it too, but we should just fail it with the prompt messages.

Thanks. BRs

pkalever commented 5 years ago

@lxbsz I think we are on the same page xiubo.

Before I merge this PR, we should implement something like: if hw_block_size is supported, then support this, else define a dynamic capability saying we don't support it.

Which mean, if we targetcli/rtslib has https://github.com/open-iscsi/rtslib-fb/pull/150/ then define this capability, and support block-size option else don't.

@lxbsz does this makesense ?

Thanks!

lxbsz commented 5 years ago

@lxbsz I think we are on the same page xiubo.

Before I merge this PR, we should implement something like: if hw_block_size is supported, then support this, else define a dynamic capability saying we don't support it.

Yeah, will try to implement this.

Which mean, if we targetcli/rtslib has open-iscsi/rtslib-fb#150 then define this capability, and support block-size option else don't.

BTW, for this we need to check the rtslib version, right ?

Thanks.

lxbsz commented 5 years ago

@lxbsz I think we are on the same page xiubo. Before I merge this PR, we should implement something like: if hw_block_size is supported, then support this, else define a dynamic capability saying we don't support it.

Yeah, will try to implement this.

For this since we have already supported the cap check, no need to do other extra works IMHO, is this what you want ?

As we discussed before: 1, if current node support it, but if any of the others not, it will fall back to none support. 2, if current node do not support it, no matter the others support it or not will always fall back to none support. 3, only in case of all the nodes support it, the hw_block_size will be set.

I think this should be enough. Thanks, BRs

pkalever commented 5 years ago

@lxbsz I think we are on the same page xiubo. Before I merge this PR, we should implement something like: if hw_block_size is supported, then support this, else define a dynamic capability saying we don't support it.

Yeah, will try to implement this.

For this since we have already supported the cap check, no need to do other extra works IMHO, is this what you want ?

As we discussed before: 1, if current node support it, but if any of the others not, it will fall back to none support. 2, if current node do not support it, no matter the others support it or not will always fall back to none support. 3, only in case of all the nodes support it, the hw_block_size will be set.

I think this should be enough.

This CAP check is to check if gluster-blockd supports accepting hw_block_size option or not. What I was asking now is to see if rtslib/targetcli, supports it or not ?

What should we do is, If gluster-blockd supports GB_CREATE_BLOCK_SIZE_CAP, then check if rtslib and targetcli supports hw_block_size only then consider Block size capability is supported locally.

HTH, -- Thanks!

lxbsz commented 5 years ago

@lxbsz I think we are on the same page xiubo. Before I merge this PR, we should implement something like: if hw_block_size is supported, then support this, else define a dynamic capability saying we don't support it.

Yeah, will try to implement this.

For this since we have already supported the cap check, no need to do other extra works IMHO, is this what you want ? As we discussed before: 1, if current node support it, but if any of the others not, it will fall back to none support. 2, if current node do not support it, no matter the others support it or not will always fall back to none support. 3, only in case of all the nodes support it, the hw_block_size will be set. I think this should be enough.

This CAP check is to check if gluster-blockd supports accepting hw_block_size option or not. What I was asking now is to see if rtslib/targetcli, supports it or not ?

What should we do is, If gluster-blockd supports GB_CREATE_BLOCK_SIZE_CAP, then check if rtslib and targetcli supports hw_block_size only then consider Block size capability is supported locally.

Okay, this make sense.

I am thinking should we consider the upstream and downstream versions at the same time ? For the upstream version it at least >= v2.1.fb70, but not sure what's the downstream version then.

Thanks.

pkalever commented 5 years ago

I am thinking should we consider the upstream and downstream versions at the same time ? For the upstream version it at least >= v2.1.fb70, but not sure what's the downstream version then.

Hmm. I wish targetcli gives list of supported options with some easy commands, like '#targetcli is_support hw_block_size', but as of now, there is nothing like that :-(

Downstream builds just follow the upstream versioning as of now IMO. So lets not worry about it now. just stick to upstream only.

Thanks!

lxbsz commented 5 years ago

I am thinking should we consider the upstream and downstream versions at the same time ? For the upstream version it at least >= v2.1.fb70, but not sure what's the downstream version then.

Hmm. I wish targetcli gives list of supported options with some easy commands, like '#targetcli is_support hw_block_size', but as of now, there is nothing like that :-(

Downstream builds just follow the upstream versioning as of now IMO. So lets not worry about it now. just stick to upstream only.

Yeah, okay.

What we need to check is only the rtstlib >= fb70. I will update it later. Thanks.

lxbsz commented 5 years ago

I am thinking should we consider the upstream and downstream versions at the same time ? For the upstream version it at least >= v2.1.fb70, but not sure what's the downstream version then.

Hmm. I wish targetcli gives list of supported options with some easy commands, like '#targetcli is_support hw_block_size', but as of now, there is nothing like that :-( Downstream builds just follow the upstream versioning as of now IMO. So lets not worry about it now. just stick to upstream only.

Yeah, okay.

What we need to check is only the rtstlib >= fb70. I will update it later.

@pkalever

Done with this.

Without your https://github.com/open-iscsi/rtslib-fb/pull/151 fixing, the rtslib version check will return "GIT_VERSION", which means the rtslib-fb version will be <= 2.1.69. With it the version should be >= 2.1.69. And only when version >= 2.1.70 which will be the next release we will support the "block-size" option.

When creating the BV with the 'block-size ' option:

In case all the ha nodes have supported the rtslib-fb#150: 1, if local node support the block-size, but if there is any of the remote nodes is not, it will fail with the cap not match error. 2, if local node does not support the block-size, no matter whether the remote nodes support it or not, it will always ignore the block-size option due to the exist bug.

In case if there is any of the ha nodes does not support rtslib-fb#150: 3, if local node support the block-size, it will always fails with the cap not match error. 4, if local mode does not support the block-size, no matter whether the remote nodes support it or not, it will ignore the block-size option due to the exist bug.

Thanks. BRs

lxbsz commented 5 years ago

@pkalever Fixed them all. Thanks BRs

pkalever commented 5 years ago

@lxbsz please fix these warnings: https://paste.ubuntu.com/p/76PrkmHG2j/

pkalever commented 5 years ago

@lxbsz I have tested it and it works as expected.

Here are few improvements that we can do,

  1. fix these warnings: https://paste.ubuntu.com/p/76PrkmHG2j/

  2. When rtslib version is not enough, lets give a clue: currently we get: "errMsg":"Version check failed between block servers. (the capability 'create_block_size' is not supported on host 192.168.124.44 yet, please upgrade.)"

rather we should say 'please upgrade rtslib >= 2.1.fb70'

  1. As part of boot-up lets log rtslib version too:
    
    [2019-07-04 12:03:39.868802] INFO: Distro ID=fedora. Current kernel version: '5.1.8-300.fc30.x86_64'. [at gluster-blockd.c+408 :<gbMinKernelVersionCheck>]
    [2019-07-04 12:03:39.873201] INFO: starting with tcmu-runner version - 1.4.1 [at gluster-blockd.c+440 :<gbDependenciesVersionCheck>]
    [2019-07-04 12:03:40.105107] INFO: starting with targetcli version - 2.1.fb49 [at gluster-blockd.c+450 :<gbDependenciesVersionCheck>]
    [2019-07-04 12:03:40.702824] INFO: capabilities fetched successfully [at gluster-blockd.c+579 :<main>]
    [2019-07-04 12:03:40.713971] INFO: cli process pid: (1509) [at gluster-blockd.c+609 :<main>]
    [2019-07-04 12:03:40.714085] INFO: server process pid: (1556) [at gluster-blockd.c+599 :<main>]

3. in version.h lets put GB_MIN_RTSLIB_BLKSIZE_VERSION and GB_MIN_RTSLIB_BLKSIZE_VERSION_CODE as 2.1.69, as any lower versions to my recent patch to rtslib will return GIT_VERSION, so its fine. This enables current users who is using rtslib master to enable and use block-size option.

4. Please add a test case for block-size option in tests/basic.t

Thanks!
lxbsz commented 5 years ago

@lxbsz I have tested it and it works as expected.

Here are few improvements that we can do,

  1. fix these warnings: https://paste.ubuntu.com/p/76PrkmHG2j/
  2. When rtslib version is not enough, lets give a clue: currently we get: "errMsg":"Version check failed between block servers. (the capability 'create_block_size' is not supported on host 192.168.124.44 yet, please upgrade.)"

rather we should say 'please upgrade rtslib >= 2.1.fb70'

  1. As part of boot-up lets log rtslib version too:
[2019-07-04 12:03:39.868802] INFO: Distro ID=fedora. Current kernel version: '5.1.8-300.fc30.x86_64'. [at gluster-blockd.c+408 :<gbMinKernelVersionCheck>]
[2019-07-04 12:03:39.873201] INFO: starting with tcmu-runner version - 1.4.1 [at gluster-blockd.c+440 :<gbDependenciesVersionCheck>]
[2019-07-04 12:03:40.105107] INFO: starting with targetcli version - 2.1.fb49 [at gluster-blockd.c+450 :<gbDependenciesVersionCheck>]
[2019-07-04 12:03:40.702824] INFO: capabilities fetched successfully [at gluster-blockd.c+579 :<main>]
[2019-07-04 12:03:40.713971] INFO: cli process pid: (1509) [at gluster-blockd.c+609 :<main>]
[2019-07-04 12:03:40.714085] INFO: server process pid: (1556) [at gluster-blockd.c+599 :<main>]

Yeah, will fix the above all.

  1. in version.h lets put GB_MIN_RTSLIB_BLKSIZE_VERSION and GB_MIN_RTSLIB_BLKSIZE_VERSION_CODE as 2.1.69, as any lower versions to my recent patch to rtslib will return GIT_VERSION, so its fine. This enables current users who is using rtslib master to enable and use block-size option.

For this, in case someone is using the rtslib-2.1.69 install from rpm package not from the upstream, won't it be a problem ? Won't it will confuse him/her ?

  1. Please add a test case for block-size option in tests/basic.t

Sure. Thanks.

pkalever commented 5 years ago

For this, in case someone is using the rtslib-2.1.69 install from rpm package not from the upstream, won't it be a problem ? Won't it will confuse him/her ?

Even for package installed they will get GIT_VERSION IMHO, please check.

Thanks!

lxbsz commented 5 years ago

For this, in case someone is using the rtslib-2.1.69 install from rpm package not from the upstream, won't it be a problem ? Won't it will confuse him/her ?

Even for package installed they will get GIT_VERSION IMHO, please check.

Yeah, that's true.

Okay, I get you now.

Thanks.

pkalever commented 5 years ago

@lxbsz Merged now. Thanks!