Closed ghost closed 5 years ago
When host receives the link force down message from peer host, the ntb_hw_switchtec driver would reinitialize the shared memory window, and NTB client(e.g. ntb_transport) would set up the translation memory window. << In case of ntb_hw_switchtec and NTB client (e.g. ntb_transport) setup memory window concurrently,
Because Switchtec uses same registers to set up memory window, the access to these registers would fail << the latter one would failed the MW setup, for the reason of configuring memory windows/requester ID mapping table required to be serialized.
This makes no sense to me. Especially this:
Because Switchtec uses same registers to set up memory window
How does the link status flag in memory relate to the switchtec registers?
Logan
Hi Logan,
Sorry for the unclear description. The issue is Step1: Load ntb, switchtec, ntb_hw_switchtec, ntb_transport and ntb_netdev modules on host. Step2: Load ntb, switchtec, ntb_hw_switchtec, ntb_transport and ntb_netdev modules on peer host. Step3: Host receives the link force down message from peer host. ntb_hw_switchteck module calls config_rsvd_lut_win() to reconfigure the shared memory window. Step4: At same time, host receives the check link message from peer host. ntb_hw_switchtec module calls switchtec_ntb_check_link() and set flag link_is_up to 1. Step5: ntb_transport module call switchtec_ntb_mw_set_trans() to configures the transport MW. Step6: Because of the function switchtec_ntb_part_op(LOCK) fails, the switchtec_ntb_mw_set_trans() would not set up the transport MW.
The reason is When function config_rsvd_lut_win() calls switchtec_ntb_part_op() to reconfigure the shared memory window, the function switchtec_ntb_mw_set_trans() calls switchtec_ntb_part_op() to configure the transport memory window. But Switchtec uses same registers to set up LUT window and transport window.
To fix it, we needs the link_is_up flag be 0 before the shared memory window is reconfigured.
Thanks Joey
Ah, I see. More of that needs to be in the commit message.
In this case you haven't fixed the problem. You've just made the race less likely. If you have two threads running nothing says the second thread doesn't do the check just before the first thread sets the flag. We either need to use a lock or make sure all the link status checking stuff happens in the same work queue (I think the second choice is better).
If we make check_link_status the work item and remove the work item for force link down it should probably fix this problem.
Hi Logan,
Thanks for the review and comments. I updated the code and commit message. Please help to review it again. In this update, we put all the link status checking stuff in the same work queue. I tested it and It can work.
Regards Joey
System work queue has no promise on the latter work item be executed after the former is done. So there is a case, when the reconfigure lut in progress by first work item, the second one had started to check link state and finally into link up state, sending link up notify to peer. Followed Peer will write the spad, which is part of the self shared lut now in configuring, data lost happen in this case. The result of data lost is to read peer wrote data, 0 will return.
Hmmm, it might work if instead of storing a message in sndev, you store the requested link state and a flag to indicate whether a force has occurred.
Hi Logan,
Thanks for the review and suggestion. I add a new message type (MSG_RETRY) to fix this issue. Please help to review it.
Regards Joey
Huh? I don't know how you got that adding a message type was the solution.... I mean to store state required for the work queue, and not the message itself. We're trying to avoid the situation caused by dropped messages.
Oh, I see. Thank you very much. I updated it, but there is another issue on ntb_transport. The ntb_transport_link_work would be canceled by ntb_transport_cleanup_work. The ntb_transport_probe() called the ntb_link_enable() and ntb_link_event().
Regards Joey
Is there a contradiction? read peers shared memory while the lut is in configuring: The force link down trigged rsvd LUT 0 configuration take long time(~= .1s by print) but in the procedure of the rsvd LUT 0 config for peer before finish, the peer have started to run enable link, check link, use the LUT 0 to read peer's shared memory(for shared link_sta)
when test with this version, found 2 issues:
@wesleywesley The reason of issue 1 is the ntb_transport_link_work canceled by ntb_transport_cleanup_work. For the issue 2, the link up work is not be executed.
Thanks Joey
@JoeyZhang-Microsemi Let's say host A: receiver and host B: sender of FORCE LINK DOWN message
For issue 1, A is keeping stuck in waiting B's peer spad write when the value of "remote version" stay 0 the questions need to get clear are: 1a.1: what kind of corner case(probably introduce in this patch when convey link up/down status info to upper lever ntb client: from combination/cooperate of ntb callbacks ops "enable link", "link is up", and ntbhw's link state state machine by the work items), lead to transport driver cancel ntb_transport_link_work? 1a.2: even the ntb_transport_link_work is canceled, but transport should schedule the work back again when ntbhw call ntb_link_event, but here not.
For issue 2, 3 work item scheduled, by only 2 executed, what's the reason?
regards, Wesley
@wesleywesley For issue 1, I have fixed it. The NTB API ntb_link_enable() should be set the flag ntb_is_up when it returns. For issue 2, I did not know the reason. But we call schedule_work() twice in very short time. It may be the reason.
Please help to review the change.
@lsgunth Thanks for the comments. I updated the code. Because the NTB API ntb_link_enable() needs update the flag link_is_up before it returns. In the change, the switchtec_ntb_link_enable/disable() directly call the link state checking stuff.
@JoeyZhang-Microsemi had tested on both xlink and non-xlink NTB setup w/o issue, merged into devel.
When host receives the link force down message from peer host, the ntb_hw_switchtec driver would reinitialize the shared memory window, and NTB client(e.g. ntb_transport) would set up the translation memory window. Because Switchtec uses same registers to set up memory window, the access to these registers would fail and the translation memory window would not be set up.
To fix it, we should disable link before reinitializing the shared memory window, and enable link after it.
Signed-off-by: Joey Zhang joey.zhang@microchip.com