redhat-performance / tuned

Tuning Profile Delivery Mechanism for Linux
GNU General Public License v2.0

Proposal: integration with netutils-linux #78

Open strizhechenko opened 7 years ago

strizhechenko commented 7 years ago

Hello!

I have a pet project, netutils-linux. Its most useful part simplifies the tuning of parameters such as RSS, RPS and XPS.

Recently someone on Twitter suggested that I reach out to you and propose an integration between our projects. I would be really happy if it became part of the network-latency and maybe network-throughput profiles of tuned. I know there's a lot of work to do in netutils: the code, structure, dependency list and naming aren't perfect, but that can be fixed.

So, the question: what do you think of this idea? Also, maybe tuned is already modular enough to let third parties write their own plugins, in which case I can do all the work on my own without bothering you with pull requests.

yarda commented 7 years ago

Hi,

thanks. Do you mean merging the code into Tuned or calling your utils from Tuned? For the latter case we would need to package the project for Fedora and handle cases where the package is not available on the target system (e.g. on RHEL).

For the former case it shouldn't be a problem. I.e. if you want, we could probably merge the tuning utils functionality into the network Tuned plugin, or maybe something could also go into the CPU plugin (e.g. maximize-cpu-freq) or the scheduler plugin. I think it makes sense: it's functionality we do not have at the moment. The monitoring utils could also be hosted in the tuned-utils subpackage. If you could do pull requests it would be great :) Otherwise we can look into it ourselves, but it will take some time.

Tuned supports 3rd party plugins, so writing your own plugin and hosting it in your project could also work, but I think the tuning functionality would fit nicely into the existing Tuned plugins.

strizhechenko commented 7 years ago

Do you mean merging the code into the Tuned or calling your utils from the Tuned? if you want, we could merge the tuning utils functionality probably into the network Tuned plugin

I don't know what's easier for both of us at this point. Let's look at both cases.

Merging the code:

calling utils from the Tuned

rss-ladder has some heuristics and good enough default behaviour, and so does autorps. We can detect whether a network device has a single queue or multiple queues and run autorps or rss-ladder accordingly. But the drivers of some network devices have bugs with RSS that can lead to a link freeze/shutdown. My current solution is to leave the choice of what to run entirely up to users. We can keep it the same in tuned but add some unification: a file such as /etc/sysconfig/tuned-network-latency with options, for example:

[netutils_options]
rss_for_all_multiqueue_devices='yes'
rps_for_all_singlequeue_devices='yes'
xps_for_all_singlequeue_devices='no'
rss_for_devices='eth0 eth1 eth2'
autotune_all_devices='no'
rps_for_all_vlan_devices='no'

It's just an example. I also thought that device-specific options could be kept in the /etc/sysconfig/network-scripts/ifcfg-$device files, as ETHTOOL_OPTS is, but I think that's up to NetworkManager, not Tuned.
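To make the idea concrete, here is a rough Python sketch of the selection logic described above: count a device's receive queues in sysfs, then pick rss-ladder or autorps according to the options file. It is only an illustration under assumptions. The function names and the `SAMPLE` text are hypothetical, the option names are the ones from the example above, and nothing here is existing netutils-linux or Tuned code. It only decides which tool to run; it does not run anything.

```python
import configparser
from pathlib import Path

# Hypothetical sample of the options file sketched above.
SAMPLE = """
[netutils_options]
rss_for_all_multiqueue_devices='yes'
rps_for_all_singlequeue_devices='yes'
rss_for_devices=''
"""


def load_options(text):
    """Parse the [netutils_options] section and strip the shell-style quotes."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {key: value.strip("'\"")
            for key, value in parser["netutils_options"].items()}


def rx_queue_count(device, sysfs_root="/sys/class/net"):
    """Count the rx-* queue directories the kernel exposes for a device."""
    queues = Path(sysfs_root) / device / "queues"
    return sum(1 for _ in queues.glob("rx-*"))


def choose_tool(device, options, sysfs_root="/sys/class/net"):
    """Return 'rss-ladder', 'autorps' or None (leave the device alone)."""
    # An explicit per-device choice by the user wins over the heuristics.
    if device in options.get("rss_for_devices", "").split():
        return "rss-ladder"
    count = rx_queue_count(device, sysfs_root)
    if count == 0:
        return None  # unknown device or no queue information
    if count > 1 and options.get("rss_for_all_multiqueue_devices") == "yes":
        return "rss-ladder"
    if count == 1 and options.get("rps_for_all_singlequeue_devices") == "yes":
        return "autorps"
    return None
```

Calling `choose_tool("eth0", load_options(SAMPLE))` on a real system would read `/sys/class/net/eth0/queues`. The `sysfs_root` parameter exists only so the logic can be tried against a fake directory tree. Returning None by default keeps the "choice is up to the user" behaviour for drivers with buggy RSS.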

Tuned supports 3rd party plugins

Oh, that's very nice. I'll look more closely at the docs/code to better understand how to fit my utils into the existing plugins.

yarda commented 7 years ago

I'm afraid of breaking tuned with a lot of changes.

Code review and internal tests may help with it.

It passes all of netutils-linux's dependencies on to Tuned.

That may be a problem: we need to support systems with low resources, such as Project Atomic. Every new dependency is usually discussed a lot.

If the (current) target system is only recent Fedora, can we rely on python3 being installed, so that I don't need to think about things like CentOS 6 support?

Tuned doesn't support Python 3 at the moment, but it will soon: we are porting the dependencies to Python 3 (recently I ported python-perf from the kernel). At the moment we cannot go Python 3 only, because of backward compatibility with RHEL and others.

Doesn't merging the code mean rewriting it from Python to C?

Definitely not.

calling utils from the Tuned

I think it will require: a) optionally (not a blocker) getting netutils-linux packaged in Fedora (you could do it yourself: https://fedoraproject.org/wiki/Join_the_package_collection_maintainers) and b) making the dependency/functionality optional in Tuned upstream for systems which do not have the package at the moment, e.g. RHEL (it should be doable).
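Point b) could look roughly like the sketch below: check whether the external tool is installed and quietly skip the tuning step when it is not. This is plain Python using only the standard library, not Tuned's plugin API. The helper name `run_if_available` and its `which` parameter are made up for illustration.

```python
import shutil
import subprocess


def run_if_available(tool, args, which=shutil.which):
    """Run an optional external tool.

    Returns False without doing anything when the tool is not installed,
    so the caller can continue on systems that lack the package.
    """
    path = which(tool)
    if path is None:
        return False
    # check=True raises CalledProcessError if the tool itself fails.
    subprocess.run([path, *args], check=True)
    return True
```

A caller would then do something like `run_if_available("rss-ladder", ["eth0"])` and log a message when it returns False, instead of treating the missing package as an error.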

Tuned supports 3rd party plugins

Well, you could create a 3rd party plugin called e.g. netutils and add all the options/functionality there, but I think it's not a clean approach. It is probably better to add support to the existing upstream plugins.

strizhechenko commented 7 years ago

Okay, I'll start experimenting with fpm again. I think it will take 1-2 weekends, and then I'll come back for more details/discussion. Thank you for the advice!

jeremyeder commented 7 years ago

We try to be data-driven with new profiles and features, too. If you could also supply some data along with the system information in an A:B test scenario using this tool, that would be very helpful.

strizhechenko commented 7 years ago

Being data-driven is a good thing, yes. I'm trying to collect A:B tests/usage results here, but people rarely come back with feedback, and so far there are only the unusual cases I added myself. However, rss-ladder/autorps don't do anything beyond automating settings: RSS/RPS are well-known Linux kernel technologies and I'm sure there are a lot of A:B tests for them. But if I help our engineers avoid dropped/missed packets again, I'll try to run more A:B tests. :)

jeremyeder commented 7 years ago

I almost want to say that this tool should be a feature in irqbalance, honestly (particularly its "oneshot" mode). Have you considered that option at all? Just a thought...

strizhechenko commented 7 years ago

Which tool? irqtop or rss-ladder? If rss-ladder, then yeah: while I was writing the previous comment I found this article in the Red Hat docs and thought exactly the same.

jeremyeder commented 7 years ago

First let me say that I am reading more of netutils and I know we lived a very similar life :-) I want to congratulate you on building something that we never had the time to do. The most we did was prototype an irq cgroup and write some shell scripts around rfs/rps and accelerated rfs for performance testing.

We have always had challenges as to which tool should "own" a portion of the system. From a distro vendor POV this separation of concerns is top-of-mind from a usability standpoint.

tuned has not yet fiddled with irqs because that is the domain of irqbalance, or custom/hand tuning. We have found this slightly less urgent because many of the users who were doing hand-tuning of IRQs found that making irqbalance NUMA-aware was "good enough", or have moved to kernel bypass networking.

Essentially I think that since we in the RHEL/CentOS/Fedora ecosystem deliver irqbalance as a default service, and irqbalance is available on a huge number of other distros, perhaps the best way forward is to integrate the autorps and netutils_options within irqbalance?

I think this might be a smoother/faster path into distributions for that particular feature. You might want to ping on the irqbalance github about this possibility. https://github.com/Irqbalance/irqbalance

The monitoring tools irqtop, softirq-top, softnet-stat-top, link-rate, snmptop...those are very cool. I would encourage you to go the Fedora route for those.

@yarda do you think there is any possible integration with irqbalance that we might do? For example we could generate a /etc/sysconfig/irqbalance file based on a tuned profile, or similar...
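As a rough sketch of what "generate a /etc/sysconfig/irqbalance file" could mean, the Python below renders the file contents from two settings. IRQBALANCE_ONESHOT and IRQBALANCE_BANNED_CPUS are variables that the irqbalance sysconfig file documents; everything else (the function name, its parameters, the header comment) is a hypothetical illustration, not existing Tuned code. The plain hex mask assumes a small CPU count; larger systems may need a different mask layout.

```python
def render_irqbalance_sysconfig(banned_cpus=(), oneshot=False):
    """Return the text of a sysconfig file for irqbalance.

    banned_cpus: iterable of CPU numbers irqbalance should not use.
    oneshot: balance once at startup and then exit.
    """
    lines = ["# Generated from a tuning profile (hypothetical generator)"]
    if oneshot:
        lines.append("IRQBALANCE_ONESHOT=yes")
    if banned_cpus:
        mask = 0
        for cpu in banned_cpus:
            mask |= 1 << cpu  # set one bit per banned CPU
        lines.append("IRQBALANCE_BANNED_CPUS={:x}".format(mask))
    return "\n".join(lines) + "\n"
```

The function only builds a string, so a profile could preview or diff the result before anything is written to disk and the service restarted.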

strizhechenko commented 7 years ago

write some shell scripts around rfs/rps and accelerated rfs

I have an open issue for that, and no time for it either :D

About irqbalance: it's an interesting idea. I'll try to ping them after checking that modern irqbalance isn't already doing working RSS tuning. :)

jeremyeder commented 7 years ago

Here is a sad one that I found in an old git tree... https://paste.fedoraproject.org/paste/0qOIJGjFMIBOZ5i3Zc5zZQ