cifsd-team / ksmbd

ksmbd kernel server (SMB/CIFS server)

ksmbd stuck on write event after attempt to shutdown #560

Open consp opened 2 years ago

consp commented 2 years ago

After I tried shutting down ksmbd, it gave these errors and could not be stopped:

[82661.298655] ksmbd: sock_read failed: -108
[82661.559100] ksmbd: smb_direct: Unexpected RDMA CM event. cm_id=00000000ea775640, event=timewait exit (15)
[82667.180135] ksmbd: Unable to close RPC pipe 2
[82667.180351] ksmbd: Unable to close RPC pipe 1
[82667.180572] ksmbd: Unable to close RPC pipe 0

Console output of the stuck part on reboot (I do not have the logs): [image]

I have not been able to reproduce it, but it may have been related to a change in the filesystem, since a disk was remounted.

Hopefully the information is of use to you!

namjaejeon commented 2 years ago

> ksmbd: smb_direct: Unexpected RDMA CM event. cm_id=00000000ea775640, event=timewait exit (15)

@hclee any opinion?

namjaejeon commented 2 years ago

@consp Let me know your NIC model name and kernel version.

hclee commented 2 years ago

> > ksmbd: smb_direct: Unexpected RDMA CM event. cm_id=00000000ea775640, event=timewait exit (15)
>
> @hclee any opinion?

We have to terminate the loop for handling connection if RDMA_CM_EVENT_TIMEWAIT_EXIT is received like RDMA_CM_EVENT_DISCONNECTED.
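To make the idea concrete, here is a minimal user-space sketch of that handling, not the actual ksmbd patch. The enum, struct, and function names are hypothetical stand-ins for illustration; only the two event names and the value 15 for "timewait exit" come from this thread, and the other numeric values are illustrative.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the RDMA CM event codes discussed above.
 * 15 matches the "timewait exit (15)" log line; the other values are
 * illustrative only. */
enum cm_event {
    CM_EVENT_ESTABLISHED   = 9,
    CM_EVENT_DISCONNECTED  = 10,
    CM_EVENT_TIMEWAIT_EXIT = 15,
};

enum conn_status { CONN_CONNECTED, CONN_DISCONNECTED };

/* Hypothetical, heavily simplified connection state. */
struct conn {
    enum conn_status status;
};

/* Returns true if the event should end the connection handler loop. */
static bool cm_event_ends_connection(enum cm_event ev)
{
    switch (ev) {
    case CM_EVENT_DISCONNECTED:
    case CM_EVENT_TIMEWAIT_EXIT: /* treated the same as a disconnect */
        return true;
    default:
        return false;
    }
}

/* Processes events in order and stops at the first terminating event.
 * Returns the number of events consumed. */
static size_t handle_events(struct conn *c, const enum cm_event *evs, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (evs[i] == CM_EVENT_ESTABLISHED)
            c->status = CONN_CONNECTED;
        if (cm_event_ends_connection(evs[i])) {
            c->status = CONN_DISCONNECTED;
            return i + 1; /* leave the loop instead of waiting forever */
        }
    }
    return n;
}
```

With this shape, a handler that only ever sees the timewait-exit event still marks the connection as disconnected and returns, rather than staying in the loop during shutdown.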

consp commented 2 years ago

> @consp Let me know your NIC model name and kernel version.

Kernel: 5.17 xanmod (I have now also had it with the mainline version), with a change to allow the ConnectX 4 card to be detected. Card: ConnectX 4 MCX455A

namjaejeon commented 2 years ago

> We have to terminate the loop for handling connection if RDMA_CM_EVENT_TIMEWAIT_EXIT is received like RDMA_CM_EVENT_DISCONNECTED.

@hclee So, what are you going to do?

hclee commented 2 years ago

> > We have to terminate the loop for handling connection if RDMA_CM_EVENT_TIMEWAIT_EXIT is received like RDMA_CM_EVENT_DISCONNECTED.
>
> @hclee So, what are you going to do?

I will send a patch for this issue.

namjaejeon commented 2 years ago

@hclee Thanks! @consp I have applied the patch (https://github.com/cifsd-team/ksmbd/commit/bb974ae5e9dccb246d883f305485f4acff6a3316) for this issue. Can you check it?

consp commented 2 years ago

> @hclee Thanks! @consp I have applied the patch (bb974ae) for this issue. Can you check it?

I will install it this week. It might take a while to tell whether the problem still shows up, as it doesn't happen very often.

consp commented 2 years ago

So far so good: the things that sometimes made it break no longer cause any problems, and so far I have had no issues with ksmbd.control -s. Stability seems OK, but I will have to leave it running for longer to be sure.

namjaejeon commented 2 years ago

@consp I guess this issue may be fixed with this patch (https://github.com/cifsd-team/ksmbd/commit/5039a45f8afc519394e930fbf36eec006d107590).

consp commented 2 years ago

> @consp I guess this issue may be fixed with this patch (5039a45)

Looks like that might be the case; "racy" sounds like what I experienced. I'll have a look at it when I get back from holiday.

consp commented 2 years ago

Fixed is correct; I do not get any issues anymore. @namjaejeon thanks for the fix! ksmbd has restarted several times, as that is part of the server hibernation procedure (power is expensive these days), and I've seen no messages so far.

namjaejeon commented 2 years ago

@consp Thanks for checking it :) Let me know if you have any issues while using ksmbd.