LINBIT / drbd

LINBIT DRBD kernel module
https://docs.linbit.com/docs/users-guide-9.0/
GNU General Public License v2.0

Fixes for RDMA #66

Closed: mtisza closed this 1 year ago

mtisza commented 1 year ago

I've spent some time debugging issues with RDMA. Without these patches, RDMA did not work at all (crashes, hangs, random ping timeouts, ...). With these patches, it works quite well.

This resolves https://github.com/LINBIT/drbd/issues/58, as well as other issues.

This might seem like a duplicate of https://github.com/LINBIT/drbd/pull/65, but there was an issue I failed to detect on that one prior to submitting it (build issue due to rebasing onto latest master).

LinbitPRBot commented 1 year ago

Hi @mtisza!

Thanks for your contribution to the LINBIT software!

Development for this project happens on mailing lists, rather than on GitHub - this GitHub repository is a read-only mirror that isn't used for accepting contributions. So that your change can become part of our software, please email it to us as a patch.

Here's what to do:

How do I format my contribution?

Firstly, all contributions need to be formatted as patches. A patch is a plain text document showing the change you want to make to the code, and documenting why it is a good idea.

You can create patches with git format-patch.

Secondly, patches need 'commit messages', which are the human-friendly documentation explaining what the change is and why it's necessary.
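As a rough illustration of the two points above, the following sketch creates a commit with a subject and an explanatory body, then writes it out as a patch file with git format-patch. The scratch repository, the file name notes.txt, the author identity, and the commit text are all made up for the example; in practice you would run only the last command inside your own DRBD checkout.

```shell
set -e

# Scratch repository so the sketch is self-contained (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.org"   # placeholder address

echo "first version" > notes.txt
git add notes.txt
git commit -q -m "notes: add initial file"

echo "second version" > notes.txt
# The first -m is the subject (what changes); the second is the body (why).
git commit -q -a -m "notes: update wording" \
    -m "The old text was misleading. Explain the reason for the change here."

# Write the most recent commit as a patch file into ./outgoing
git format-patch -1 -o outgoing HEAD
```

The resulting file in outgoing/ is plain text: mail-style headers, the commit message, and the diff, which is the form reviewers on a mailing list expect.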

Who do I send my contribution to?

There are two mailing lists:

If you're interested in DRBD development, subscribing to these mailing lists is a good idea.

How do I send my contribution?

Use git send-email, which will ensure that your patches are formatted in the standard manner. In order to use git send-email, you'll need to configure git to use your SMTP email server.

For more information about using git send-email, look at the Git documentation or type git help send-email. There are a number of useful guides and tutorials about git send-email that can be found on the internet.
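A minimal sketch of the SMTP setup might look like the following. Every value here (server name, port, user, and the list address in the commented command) is a placeholder, not a real LINBIT setting; substitute what your mail provider and the target mailing list require. The scratch repository only keeps the example from touching an existing configuration.

```shell
set -e

# Scratch repository; in practice, run the git config lines in your checkout
# (or add --global to apply them to all repositories).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Placeholder values: replace with your provider's settings.
git config sendemail.smtpServer smtp.example.org
git config sendemail.smtpServerPort 587
git config sendemail.smtpEncryption tls
git config sendemail.smtpUser author@example.org

# Show what was stored.
git config --get-regexp '^sendemail\.'

# With patches in ./outgoing, a trial run would then be (address is a placeholder):
#   git send-email --dry-run --to=list@example.org outgoing/*.patch
# Drop --dry-run once the output looks right.
```

Running with --dry-run first is a cheap way to check recipients and subjects before anything is actually sent.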

How do I get help if I'm stuck?

Firstly, don't get discouraged, we are here to help! If you are lost in the process, and really tried, you will usually find contact information in header/implementation files, or see who touched the code with git blame. If it was an @linbit.com person, write to them. We are more interested in good patches than strictly following the rules (but you should try first!).
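For instance, finding out who touched a file could be sketched as below. The repository, the file example.c, and the author identity are invented for the illustration; in a real checkout you would run just the last two commands against the file you are changing.

```shell
set -e

# Scratch repository with one commit (hypothetical content).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Author"        # placeholder identity
git config user.email "author@example.org"   # placeholder address
printf 'line one\nline two\n' > example.c
git add example.c
git commit -q -m "example: add file"

# Per-line view: which commit and which email address last changed each line.
git blame -e example.c

# Everyone who has committed changes to the file, de-duplicated.
git log --format='%an <%ae>' -- example.c | sort -u
```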

I sent my patch - now what?

You wait.

You can check that your email has been received by checking the mailing list archives for the mailing list you sent your patch to. Messages may not be received instantly, so be patient. Developers are generally very busy people, so it may take a few days, even weeks before your patch is looked at.

Then, you keep waiting. It is fine to kick us again if you did not receive an answer within 2 weeks, but usually we are a lot faster.

Further information

Happy hacking!

This message was posted by a bot - if you have any questions or suggestions, please talk to my owner, @rck

wzrdtales commented 10 months ago

@mtisza did you ever send this out by mail to them?

mtisza commented 10 months ago

@wzrdtales yes, but not via the mailing lists. My changes have all been accepted, with slight modifications, and are now in the master branch. Linbit published a blog post yesterday about the collaboration: https://linbit.com/blog/how-yellowbrick-data-contributed-to-improving-linbit-open-source-software/

wzrdtales commented 10 months ago

Ah, OK. Do you know if those fixes also restored compatibility with the actual OFED drivers, or just the INBOX ones?

wzrdtales commented 10 months ago

Really glad to see someone found time to work on this. I never found the time, unfortunately, so thank you, great work :)

mtisza commented 10 months ago

I believe anything that was already working will continue to work. The drivers that use the hard interrupt context are what was primarily broken. The soft interrupt ones should continue to work, and will benefit from the fixes too. That said, I personally only tested it in one setup, which uses the mlx4 driver on ConnectX-3 NICs.

wzrdtales commented 10 months ago

"mlx4 driver" doesn't say whether it is the INBOX or the OFED one; they are essentially named the same. Did you explicitly install the OFED driver, or did you just plug your card in and go? If the latter, then it is the INBOX driver, which is less performant in some cases (including some RDMA ones) and lacks some features.

mtisza commented 10 months ago

Got it. I installed the MLX-OFED drivers built from v4.9-6.0.6.0, since that was the most recent version at the time that still supports the older CX3 NICs.

wzrdtales commented 10 months ago

I see, OK, good to know. I guess I would still need to retest, since we last tested on at least cx6. Will see when that fits into the schedule.