OSLL / ntb-cxl

NTB/CXL Bridge
MIT License

Implement switch event notification #99

Closed by mxkrsv 1 year ago

mxkrsv commented 1 year ago

Should close #94.

Also:

NTRDMA behavior changes somewhat:

VM1:

[  451.098441] ntrdma 0000:00:03.0: _ntc_link_enable: 1241: link enabled by ntrdma_probe
[  517.092290] ntrdma 0000:00:03.0: ntc_ntb_enabled: 783: checking if link is up
[  517.258613] ntrdma 0000:00:03.0: ntc_ntb_enabled: 783: checking if link is up
[  517.276864] ntrdma 0000:00:03.0: ntc_ntb_enabled: 783: checking if link is up
[  517.277240] ntrdma 0000:00:03.0: ntc_ntb_enabled: 787: link is up!
[  517.278051] ntc_ntb:set_memory_windows_on_device: logical MW0 @0x5d00000 len=0x100000
[  517.278153] ntc_ntb:set_memory_windows_on_device: ntb_mw_set_trans actual MW0 @0x5d00000 len=0x100000

VM2:

[  514.026858] ntrdma 0000:00:03.0: _ntc_link_enable: 1241: link enabled by ntrdma_probe
[ 1276.684079] ntrdma 0000:00:03.0: ntc_ntb_enabled: 783: checking if link is up
[ 1276.684326] ntrdma 0000:00:03.0: ntc_ntb_enabled: 787: link is up!
[ 1276.684906] ntc_ntb:set_memory_windows_on_device: logical MW0 @0x5400000 len=0x100000
[ 1276.684948] ntc_ntb:set_memory_windows_on_device: ntb_mw_set_trans actual MW0 @0x5400000 len=0x100000

During the 700-plus seconds that pass on VM2 between "link enabled" and "link is up!", it just keeps writing something to the inactive memory windows at a constant rate. If either VM unloads the ntrdma module, the writes stop.
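
For reference, the delay can be read straight off the dmesg timestamps. The snippet below is only a rough sketch for doing that; the function name `link_up_delay` and the choice to match on the substrings "link enabled" and "link is up!" are assumptions made here for illustration, not anything provided by NTRDMA.

```python
import re

# Matches the "[  514.026858] " timestamp prefix that dmesg puts on each line.
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def link_up_delay(dmesg_text):
    """Seconds between the first 'link enabled' line and the next 'link is up!' line.

    Returns None if either message is missing. The substrings are taken from
    the log excerpts in this thread (assumption: they are stable enough to match on).
    """
    enabled_at = None
    for line in dmesg_text.splitlines():
        m = TS_RE.match(line.strip())
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        if enabled_at is None and "link enabled" in msg:
            enabled_at = ts
        elif enabled_at is not None and "link is up!" in msg:
            return ts - enabled_at
    return None


# VM2 excerpt copied from the log above.
VM2_LOG = """\
[  514.026858] ntrdma 0000:00:03.0: _ntc_link_enable: 1241: link enabled by ntrdma_probe
[ 1276.684079] ntrdma 0000:00:03.0: ntc_ntb_enabled: 783: checking if link is up
[ 1276.684326] ntrdma 0000:00:03.0: ntc_ntb_enabled: 787: link is up!
"""

if __name__ == "__main__":
    print(f"{link_up_delay(VM2_LOG):.1f} s")
```

For the VM2 excerpt this comes out at roughly 762.7 seconds.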

Despite those messages in the kernel log, however, /sys/class/infiniband/iwp0s3/link still returns 0. No idea why yet.

mxkrsv commented 1 year ago

> Despite those messages in the kernel log, however, /sys/class/infiniband/iwp0s3/link still returns 0. No idea why yet.

Got it. The ntc_ntb_enabled() function in NTRDMA (responsible for the "checking if link is up" and "link is up!" messages in the dmesg output above) calls ntc_ntb_link_set_state(), which sets the link_state field and starts some kind of ping-pong test, without setting the link_is_up field.

The ping-pong test fails (moreover, I think it is exactly the "write something to inactive memory windows with constant speed" behavior):

root@qemux86-64:~# cat /sys/kernel/debug/ntc_ntb/0000\:00\:03.0/info 
<...>
ping_run 1
ping_miss 10
ping_flags 0x2
ping_seq 0x41b
ping_msg 0x3
poll_val 0x0
poll_msg 0x0
link_is_up 0
link_state 3
<...>
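
To watch for this state without reading the dump by eye, something like the following could be used. It is only a sketch: `parse_info` and `link_stuck` are hypothetical helper names, the "one key, one value per line" format is inferred from the excerpt above (the elided parts of the file may look different), and the "stuck" condition is a guess based on the fields shown, not documented NTRDMA semantics.

```python
def parse_info(text):
    """Parse 'key value' lines from the debugfs info dump into a dict of ints.

    Lines that do not consist of exactly two whitespace-separated tokens
    (blank lines, the elided "<...>" markers) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        key, raw = parts
        try:
            fields[key] = int(raw, 0)  # base 0 accepts both "10" and "0x41b"
        except ValueError:
            continue  # non-numeric value: ignore in this sketch
    return fields


def link_stuck(fields):
    """Heuristic (assumption): ping test runs and misses while link_is_up stays 0."""
    return (
        fields.get("ping_run") == 1
        and fields.get("ping_miss", 0) > 0
        and fields.get("link_is_up") == 0
    )


# Excerpt copied from the debugfs output above.
SAMPLE = """\
<...>
ping_run 1
ping_miss 10
ping_flags 0x2
ping_seq 0x41b
ping_msg 0x3
poll_val 0x0
poll_msg 0x0
link_is_up 0
link_state 3
<...>
"""

if __name__ == "__main__":
    info = parse_info(SAMPLE)
    print("link_state:", info["link_state"], "stuck:", link_stuck(info))
```

On a live system the text would come from /sys/kernel/debug/ntc_ntb/0000:00:03.0/info instead of the embedded sample.
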
mxkrsv commented 1 year ago

Also added a few of the missing registers that are listed in #98.