SUSE-Enceladus / azure-li-services

Azure Large Instance Services
GNU General Public License v3.0

Basic optimization for HANA instances #17

Closed jeffaco closed 6 years ago

jeffaco commented 6 years ago

I've been looking into verification tests that we run since this project is rapidly wrapping up for LI. After investigation, there are some issues.

In brief conversations with @rjschwei earlier, he's hesitant to optimize the system in any way since "perfect" optimizations are often load dependent and thus should be performed by the customer. While that sounds good in theory, we seem to be missing some optimizations that I consider pretty basic.

For example:

# cat /etc/init.d/boot.local
cpupower frequency-set -g performance
cpupower set -b 0
echo 0 > /sys/kernel/mm/ksm/ru

It seems reasonable enough to have the CPU set for performance, for example, rather than for power efficiency.

BOOT_IMAGE=/vmlinuz-4.4.120-94.17-default root=UUID=8c094969-faeb-4bd0-af46-ee2f78bdf0c9 resume=/dev/mapper/3600a0980383044456a2b4b596f346d70-part3 splash=silent quiet showopts numa_balancing=disable transparent_hugepage=never intel_idle.max_cstate=1 processor.max_cstate=1
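
As an illustration of how an image could be verified against the kernel command line quoted above, here is a minimal sketch. The function name `missing_params` and the `EXPECTED_PARAMS` variable are made up for this example; the parameter list is simply the one from the command line in this comment.

```shell
#!/bin/sh
# Parameters taken from the kernel command line quoted in this thread.
EXPECTED_PARAMS="numa_balancing=disable transparent_hugepage=never intel_idle.max_cstate=1 processor.max_cstate=1"

# missing_params CMDLINE -> prints each expected parameter that is absent,
# one per line (hypothetical helper, not part of azure-li-services)
missing_params() {
    cmdline=" $1 "
    for param in $EXPECTED_PARAMS; do
        case "$cmdline" in
            *" $param "*) ;;                 # present as a whole word
            *) printf '%s\n' "$param" ;;     # absent, report it
        esac
    done
}

# Example on a running system:
#   missing_params "$(cat /proc/cmdline)"
```

An empty result would mean every expected parameter is present.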

I'm still picking up what I need to know about image verification, so there may be other items.

Despite what @rjschwei had to say, this sort of stuff seems pretty basic. I would very much like to see images come this way (rather than writing a script, executed through YAML, that makes the appropriate changes).

What say ye?

rjschwei commented 6 years ago

I have no problem with applying optimizations where we know there is no adverse effect. We can put a service in place that runs saptune. For which solution would you like the tune settings applied?

saptune solution list

All solutions (* denotes enabled solution): BOBJ HANA MAXDB NETWEAVER S4HANA-APPSERVER S4HANA-DBSERVER SAP-ASE

Please understand that there is most likely an adverse performance effect if we choose S4HANA-APPSERVER but the customer sets the system up as an S4HANA-DBSERVER. Of course, we can also add this to the YAML and let the customer decide ahead of time when they request the provisioning.
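
If the solution name were to come from the customer's YAML, it would probably be worth validating it before handing it to saptune. A minimal sketch, assuming the name has already been extracted into a shell variable; `is_known_solution` and `apply_solution` are hypothetical helper names, and the list is copied from the `saptune solution list` output above.

```shell
#!/bin/sh
# Solutions copied from the `saptune solution list` output quoted above.
KNOWN_SOLUTIONS="BOBJ HANA MAXDB NETWEAVER S4HANA-APPSERVER S4HANA-DBSERVER SAP-ASE"

# is_known_solution NAME -> exit status 0 if NAME is in KNOWN_SOLUTIONS
is_known_solution() {
    for s in $KNOWN_SOLUTIONS; do
        [ "$s" = "$1" ] && return 0
    done
    return 1
}

# apply_solution NAME -> validate first, then call saptune (requires root)
apply_solution() {
    if ! is_known_solution "$1"; then
        echo "unknown saptune solution: $1" >&2
        return 1
    fi
    saptune solution apply "$1"
}

# Example (as root):
#   apply_solution HANA
```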

The stuff in /etc/init.d/boot.local is questionable as far as reliability of execution is concerned. /etc/init.d/boot.local is a carryover from SystemV init, and systemd, while still supposed to run the script, has not always been reliable in doing so. We should either make the settings permanent in other ways or create a native systemd service.
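
For reference, a native systemd service replacing boot.local could look roughly like the unit generated below. This is only a sketch: the unit name `li-tune-example.service` and the script path are hypothetical placeholders, not names used by this project.

```shell
#!/bin/sh
# write_tune_unit DIR SCRIPT -> writes a oneshot unit into DIR that runs
# SCRIPT once at boot. Unit name and script path are placeholders.
write_tune_unit() {
    unit_dir=$1
    script=$2
    cat > "$unit_dir/li-tune-example.service" <<EOF
[Unit]
Description=Example boot-time performance settings (sketch)

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=$script

[Install]
WantedBy=multi-user.target
EOF
}

# Example (as root):
#   write_tune_unit /etc/systemd/system /usr/local/sbin/li-tune-example.sh
#   systemctl daemon-reload
#   systemctl enable li-tune-example.service
```

`Type=oneshot` with `RemainAfterExit=yes` is the usual shape for a script that applies settings and exits, so systemd reports the unit as active after a successful run.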

jeffaco commented 6 years ago

I'm meeting with our folks that do the actual optimization early next week. At that point I should be able to get concise information of EXACTLY what we do. I'll relay back once I have that, thanks.

schaefi commented 6 years ago

Sounds good

schaefi commented 6 years ago

@jeffaco do you have any further information on this topic?

Some tuning of kernel module parameters has already been done by Robert, so maybe this issue is already covered as far as the generic settings we can apply are concerned?

rjschwei commented 6 years ago

@schaefi I think we will want an azure-li-performance-tune service in any event. At the very least it would run:

cpupower frequency-set -g performance
cpupower set -b 0
echo 0 > /sys/kernel/mm/ksm/ru

We can then add the handling of the tuning to this service if needed or we can run saptune in the image build if a specific profile is desired.
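
A script run by such a service might be sketched as follows. It assumes the KSM control file is `/sys/kernel/mm/ksm/run`, the name used in the kernel's KSM documentation; the function names and the `SYSFS_ROOT` override are inventions of this example, added so the script can be dry-run against a scratch directory.

```shell
#!/bin/sh
# SYSFS_ROOT can be overridden for dry runs; it defaults to the real /sys.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# disable_ksm -> write 0 to the KSM control file if it is writable
# (assumes the file is .../kernel/mm/ksm/run, per the kernel KSM docs)
disable_ksm() {
    ksm_run="$SYSFS_ROOT/kernel/mm/ksm/run"
    if [ -w "$ksm_run" ]; then
        echo 0 > "$ksm_run"
    else
        echo "KSM control file not writable: $ksm_run" >&2
        return 1
    fi
}

# set_cpu_performance -> the two cpupower calls quoted in this thread
set_cpu_performance() {
    if ! command -v cpupower >/dev/null 2>&1; then
        echo "cpupower not found" >&2
        return 1
    fi
    cpupower frequency-set -g performance && cpupower set -b 0
}

# Example (as root):
#   set_cpu_performance; disable_ksm
```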

jeffaco commented 6 years ago

I met with operations to discuss optimizations, finally. A number of things came out. Note: the PMs may not be giving optimal ways to make changes, this is just what they do.

root # saptune solution apply HANA
root # zypper install sapconf
root # tuned-adm profile sap-hana
root # systemctl start tuned
root # systemctl enable tuned
  1. echo never > /sys/kernel/mm/transparent_hugepage/enabled
  2. Append transparent_hugepage=never to line GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub.
  3. Similar change to YaST2 bootloader.
  1. Set "Maximum Performance" in BIOS (I know you can't do that, it's here to be complete),
  2. Set "low latency/maximum performance" in YaST power management,
  3. Add cpupower set -b 0 to /etc/init.d/boot.local.
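
The transparent hugepage steps listed above (a runtime switch plus a change to GRUB_CMDLINE_LINUX_DEFAULT) could be scripted roughly like this. It is a sketch, not what operations actually run: `add_grub_param` is a hypothetical helper, and it assumes the variable is defined on a single double-quoted line, as in a stock /etc/default/grub.

```shell
#!/bin/sh
# add_grub_param FILE PARAM -> append PARAM inside the quoted value of
# GRUB_CMDLINE_LINUX_DEFAULT in FILE, unless it is already there.
add_grub_param() {
    file=$1
    param=$2
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*[\" ]$param[\" ]" "$file"; then
        return 0    # already present, nothing to do
    fi
    # Insert the parameter just before the closing double quote.
    sed "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\".*\)\"|\1 $param\"|" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Example (as root):
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled   # runtime
#   add_grub_param /etc/default/grub transparent_hugepage=never
#   grub2-mkconfig -o /boot/grub2/grub.cfg                      # persist
```

The presence check keeps the helper idempotent, so running it on every image build does not pile up duplicate parameters.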

I can't think of a good way to relay this. I don't want to open an issue for each item, but maybe that's the right thing? Suggestions appreciated. Note that some of this may already be done.

schaefi commented 6 years ago

> Can I give you this list for the image build step?

Sure, but I think we already have nearly all of it included: patterns-sap-hana, saptune, tuned, sapconf. The others mentioned here, tuned-adm and sap-hana, I do not have in the SUSE repos.

> We do a number of saptune operations

Added #52

> We disable autoNUMA
> We disable transparent hugepages
> Configure C-States for lower latency in Linux

Already done in the image via kernel cmdline options.

> Energy Performance Bias

Added #53

> Kernel samepage merging

Added #54

> Disable EDAC, both LI and VLI do that in hardware

@jeffaco How did you do that? With a specific module setup in modprobe.d?

jeffaco commented 6 years ago

As for disabling EDAC, here's the information I have from our PM (also went via E-Mail):

Open the file /etc/modprobe.d/blacklist.conf
# vi /etc/modprobe.d/blacklist.conf
#install edac_core /bin/false 
#install sb_edac /bin/false
Once you remove those lines, save the file and exit.
Execute the command below:
# dracut -f
Then reboot the server and check whether the edac modules are loaded by executing the commands below:
# reboot
# lsmod | grep edac
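
For an image build, the same modprobe.d setup might be generated rather than edited by hand. The sketch below assumes the intent is to keep the EDAC modules from loading, using active `install <module> /bin/false` rules; the file name `edac-example.conf` and both function names are hypothetical.

```shell
#!/bin/sh
# write_edac_block DIR -> create DIR/edac-example.conf with "install" rules
# that make modprobe run /bin/false instead of loading the module.
write_edac_block() {
    conf="$1/edac-example.conf"
    for mod in edac_core sb_edac; do
        printf 'install %s /bin/false\n' "$mod"
    done > "$conf"
}

# edac_loaded LSMOD_OUTPUT -> exit status 0 if any edac module is listed
edac_loaded() {
    printf '%s\n' "$1" | grep -q 'edac'
}

# Example (as root):
#   write_edac_block /etc/modprobe.d
#   dracut -f
#   # after the reboot:
#   edac_loaded "$(lsmod)" && echo "edac modules still loaded"
```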

Let me know if you need further information, thanks.

schaefi commented 6 years ago

Thanks, that's easy. I'll catch this setup as part of the image description itself

schaefi commented 6 years ago

OK, both image build projects, for LI and VLI, have been adapted to perform the required blacklist.conf changes.

This completes the issue. Thanks much for the details