open-iscsi / tcmu-runner

A daemon that handles the userspace side of the LIO TCM-User backstore.
Apache License 2.0
189 stars 148 forks source link

Fix tcmulib get next command on aarch64 #693

Closed geordieintheshellcode closed 1 year ago

geordieintheshellcode commented 1 year ago

Please see See https://github.com/open-iscsi/tcmu-runner/issues/688 for the problem, repro and rationale behind the fix.

geordieintheshellcode commented 1 year ago

@lxbsz thanks. I don't quite follow what you mean when you say:

Could you add the comments in the commit and add the singed-off-by at the same time ? Thanks

lxbsz commented 1 year ago

@lxbsz thanks. I don't quite follow what you mean when you say:

Could you add the comments in the commit and add the singed-off-by at the same time ? Thanks

Please see the following commit's format:

commit 245914c1446dddf07d5b58b0a7b2060b50fde4d7
Author: Xiubo Li <xiubli@redhat.com>
Date:   Tue Sep 14 16:25:23 2021 +0800

    rbd: switch strtok to strtok_r

    The strtok is not thread safe, there could be several threads will
    call the tcmu_rbd_open() at the same time.

    And sometimes we can randomly hit some errors like:

    [ERROR] tcmu_rbd_open:877 datapool/block0: Could not get image name

    But when try it again it will succeed.

    Signed-off-by: Xiubo Li <xiubli@redhat.com>
geordieintheshellcode commented 1 year ago

I've done that now @lxbsz. Thanks

lxbsz commented 1 year ago

Use the appropriate atomic load instruction to ensure we synchronize the read of the command ring head index with any previous stores. Not doing this causes command ring corruption on aarch64:

[Tue Oct 4 15:45:50 2022] cmd_id 0 not found, ring is broken [Tue Oct 4 15:45:50 2022] ring broken, not handling completions

See issue open-iscsi#688. PR open-iscsi#693.

Signed-off-by: Xiubo Li xiubli@redhat.com

NO, this is incorrect.

It should be something likes:

libtcmu: fix tcmulib to get next command on aarch64

Use the appropriate atomic load instruction to ensure we synchronize
the read of the command ring head index with any previous stores. Not
doing this causes command ring corruption on aarch64:

[Tue Oct  4 15:45:50 2022] cmd_id 0 not found, ring is broken
[Tue Oct  4 15:45:50 2022] ring broken, not handling completions

Signed-off-by: ${Your Full Name} <${Your Mail}>
geordieintheshellcode commented 1 year ago

Done! Turns git commit -s is your friend 😄