Microsemi / switchtec-kernel

A kernel module for the Microsemi PCIe switch
GNU General Public License v2.0

Failing to register device as NTB 5.14.0 #109

Closed Prashankalikotayt closed 2 years ago

Prashankalikotayt commented 2 years ago

I have installed the latest switchtec-kernel on Debian 9.13. The hardware being used is the PCI4-AD-x16HE-MG4 MS X16 EXT Host Adapter Card (this has a Microchip PM40036 PCIe switch).

I am not able to figure out the issue.

dmesg output:

[ 13.947251] iTCO_wdt iTCO_wdt.1.auto: unable to reset NO_REBOOT flag, device disabled by hardware/BIOS
[ 13.978934] switchtec switchtec0: Error setting up reserved lut window: 00030002 / 00000000
[ 13.978990] switchtec switchtec0: failed to register ntb device: -5

PFA complete dmesg output: dmesg.txt

lsgunth commented 2 years ago

Chances are the problem is with the configuration of the switch. It used to be that the example NTB configuration provided by Microsemi was close (save the note on the README).

However, this error is indicating "LUT Window Exceeds BAR Size" which suggests there is not enough space configured for the LUTs in BAR1 in the configuration.
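For anyone searching logs for the same failure, here is a minimal sketch that pulls the two hex values out of the message shown in the dmesg excerpt. The function name `parse_rsvd_lut_error` is hypothetical, and treating the first value as a BAR-related error code and the second as a LUT-related one is an assumption that this thread does not spell out; the sketch only extracts the numbers and does not decode them.

```python
import re

# Matches the driver message exactly as it appears in the dmesg excerpt.
RSVD_LUT_ERR = re.compile(
    r"Error setting up reserved lut window: ([0-9a-fA-F]{8}) / ([0-9a-fA-F]{8})"
)


def parse_rsvd_lut_error(line):
    """Return the two hex fields of the message as integers, or None if absent."""
    match = RSVD_LUT_ERR.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


line = ("[ 13.978934] switchtec switchtec0: "
        "Error setting up reserved lut window: 00030002 / 00000000")
# Assumption: first field relates to the BAR setup, second to the LUT setup.
first, second = parse_rsvd_lut_error(line)
print(f"first field: {first:#010x}, second field: {second:#010x}")
```

A nonzero first field with a zero second field, as in this report, is consistent with the BAR-size explanation above.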

Prashankalikotayt commented 2 years ago

I have tried what you suggested and am still getting the same error. However, I didn't understand what you mean by "It used to be that the example NTB configuration provided by Microsemi was close (save the note on the README)."

lsgunth commented 2 years ago

I meant that Microsemi used to provide an example NTB configuration that didn't quite work with the switchtec_ntb driver, but the README has a note on what needs to be configured.

What did you try exactly? How big are the BARs configured to be? What are the direct and LUT window configurations?
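One way to answer the BAR-size question from the host side is to read the device's sysfs `resource` file, where each line holds a start address, end address, and flags in hex, and the first six lines correspond to BAR 0 through BAR 5. The sketch below is only an illustration: the helper name `bar_sizes` and the sample addresses are made up, and the device path in the comment is a placeholder.

```python
def bar_sizes(resource_text):
    """Map BAR index -> size in bytes from the text of a sysfs `resource` file."""
    sizes = {}
    # Only the first six entries are the standard BARs.
    for index, line in enumerate(resource_text.splitlines()[:6]):
        start, end, _flags = (int(field, 16) for field in line.split())
        if start == 0 and end == 0:
            continue  # unimplemented BAR, or the upper half of a 64-bit BAR
        sizes[index] = end - start + 1
    return sizes


# Hypothetical sample contents; real values come from a file such as
# /sys/bus/pci/devices/<domain:bus:dev.fn>/resource for the NT endpoint.
sample = """\
0x00000000f0000000 0x00000000f03fffff 0x0000000000140204
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000e0000000 0x00000000e00fffff 0x0000000000140204
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 0x0000000000000000
"""

for bar, size in bar_sizes(sample).items():
    print(f"BAR{bar}: {size // 1024} KiB")
```

The same sizes also appear in the `Region` lines of `lspci -v`.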

Prashankalikotayt commented 2 years ago

PFA the NTB configuration file provided by Microchip. I have tried making some changes, but that did not work. Since I am a novice, I am not sure about the changes I have made. Attachments: cfgFile_40036_for_git.csv, Screenshot 2021-12-14 102916, Screenshot 2021-12-14 102943, Screenshot 2021-12-14 103104, Screenshot 2021-12-14 103159

Ilya-Novikov commented 2 years ago

You should probably check Management settings -> Partitions -> GAS access vector. Set the GAS access vector to allow access from both partitions to each partition and to the MRPC and Global Event sections.

lsgunth commented 2 years ago

It also looks like BAR2 and BAR4 need to be enabled in NT Endpoint Settings.

Prashankalikotayt commented 2 years ago

Thanks @Ilya-Novikov, I have done that in the configuration, however it doesn't work. Thanks lsgunth, I tried that, still no luck. It looks like the communication is getting delayed due to time zones. Let me put it together: What I want to achieve:

Queries :

Thanks

lsgunth commented 2 years ago

Does it fail in the same way? Do the BARs show up in the management endpoint in lspci?

Device sharing is a complicated topic that's not supported by the upstream driver. You will have to do a fair bit of work to get it working. I suggest you contact your Microchip rep about this.

Prashankalikotayt commented 2 years ago

It fails in the exact same way. The BARs do show up in the management endpoint.

lsgunth commented 2 years ago

If the BARs are not showing up, there's still a problem with the config, or the new config is not getting enabled correctly.

jborz27 commented 2 years ago

I noticed that neither translation mode (direct window or LUT) is enabled in the config for the NT BARs.

Prashankalikotayt commented 2 years ago

PFA the lspci output of the system (lspci.txt). @jborz27 Actually, I have tried enabling each of them as well, but still no luck. Maybe I am not configuring the direct window or LUT correctly. Any pointers on how exactly to configure them?

lsgunth commented 2 years ago

"I noticed that neither translation mode (direct window or LUT) are enabled in the config for the NT BARs."

They should not be enabled in the configuration. The driver creates these.

The problem is there is no BAR 1.

Prashankalikotayt commented 2 years ago

For BAR 1 to be visible, BAR 0 should be set to 32-bit addressing, is that so? I have tried that, making BAR 1 visible by setting BAR 0 to 32-bit addressing, but I still see the same error.

lsgunth commented 2 years ago

No, sorry I meant the problem is BAR2.

Prashankalikotayt commented 2 years ago

Can you tell me what else could be tried?

On Tue, 21 Dec 2021, 23:18 Logan Gunthorpe, @.***> wrote:

No, sorry I meant the problem is BAR2.

Prashankalikotayt commented 2 years ago

We do see BAR2 in the last lspci output that I shared.

lsgunth commented 2 years ago

Oh, sorry, I misread some of that. What does the top of the NT endpoint settings page look like? "Maximum Number of NT Lut Entries" specifically?

Prashankalikotayt commented 2 years ago

Finally, the device got registered. As you pointed out, "Maximum Number of NT Lut Entries" was set to 0 and I hadn't looked into that. Thanks a lot.

lsgunth commented 2 years ago

Great! Good to hear.