TechEmpower / FrameworkBenchmarks

Source for the TechEmpower Framework Benchmarks project
https://www.techempower.com/benchmarks/

New Server Set up #8736

Closed NateBrady23 closed 1 month ago

NateBrady23 commented 4 months ago

Good morning, friends!

We are working through some issues with the new servers. Nothing serious, but it's required ordering some extra parts/cables, so the delay will be a bit longer. I appreciate everyone's patience while we work through this. We're still getting the 40-gigabit fiber setup working, we've hit some power issues, and the SFP connectors don't fit in our current enclosure.

itrofimow commented 4 months ago

Hi!

Could you @NateBrady23 please share the specs of the new servers? My framework requires some manual tuning of its configuration for the best performance, and I'd like to do that upfront, if possible.

joanhey commented 4 months ago

Hi, the good thing would be to show which frameworks work better without any changes !! And that would be an enhancement for any framework !!

@NateBrady23 please run the first run with the new servers, with the last full run commit: [0ec8ed488ec87718eaee9ed05c0ffd51ca48113b](https://github.com/TechEmpower/FrameworkBenchmarks/tree/0ec8ed488ec87718eaee9ed05c0ffd51ca48113b)

And later we should show the last run IDs from both servers.

joanhey commented 4 months ago

:confused:
Please, we need more info.

We understand that you are busy, but please send news !!

itrofimow commented 4 months ago

And that would be an enhancement for any framework !!

In general I agree, but I prefer to tune things for extreme use-cases, and benchmarking is definitely one such case. Users of my framework (myself included) are fine with tuning it for their specific production workloads, and if what you maintain hits its best numbers for any possible workload without even slight manual tuning -- that's something to be really proud of, I think.

please run the first run with the new servers, with the last full run commit

This I second

sebastienros commented 4 months ago

All machines are identical with these specs

Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
56 logical cores, 1 socket, 1 NUMA node
64 GB RAM
40 Gbit/s network
960 GB SSD

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 57 bits virtual
  Byte Order:            Little Endian
CPU(s):                  56
  On-line CPU(s) list:   0-55
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
    CPU family:          6
    Model:               106
    Thread(s) per core:  2
    Core(s) per socket:  28
    Socket(s):           1
    Stepping:            6
    CPU max MHz:         3100.0000
    CPU min MHz:         800.0000
    BogoMIPS:            4000.00
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fx
                         sr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts re
                         p_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx
                         est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_t
                         imer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 invpcid_single
                         ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase ts
                         c_adjust bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma
                          clflushopt clwb intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_
                         llc cqm_occup_llc cqm_mbm_total cqm_mbm_local split_lock_detect wbnoinvd dtherm ida arat pln pt
                         s hwp hwp_act_window hwp_pkg_req avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq av
                         x512_vnni avx512_bitalg tme avx512_vpopcntdq la57 rdpid fsrm md_clear pconfig flush_l1d arch_ca
                         pabilities
Virtualization features:
  Virtualization:        VT-x
Caches (sum of all):
  L1d:                   1.3 MiB (28 instances)
  L1i:                   896 KiB (28 instances)
  L2:                    35 MiB (28 instances)
  L3:                    42 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-55
Vulnerabilities:
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Mitigation; Clear CPU buffers; SMT vulnerable
  Retbleed:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                 Not affected
  Tsx async abort:       Not affected

SSD - 960GB

Network

       description: Ethernet interface
       product: MT28908 Family [ConnectX-6]
       vendor: Mellanox Technologies
       physical id: 0
       bus info: pci@0000:10:00.0
       logical name: ens1f0np0
       version: 00
       capacity: 40Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress vpd msix pm bus_master cap_list rom ethernet physical fibre 1000bt-fd 10000bt-fd 25000bt-fd 40000bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=mlx5_core driverversion=5.15.0-73-generic duplex=full firmware=20.33.1048 (MT_0000000594) ip=10.0.0.121 latency=0 link=yes multicast=yes port=fibre
       resources: irq:18 memory:b0000000-b1ffffff memory:b2000000-b20fffff
franz1981 commented 4 months ago

Mellanox!? Juicy!

volyrique commented 4 months ago

Sounds great! While the faster network won't help with the majority of the tests (only the cached queries and plaintext tests should see an improvement, and maybe the fortunes one, since it was doing around 5 Gb/s of network traffic, if I am not mistaken), the doubling of the cores and the jump from the Skylake to the Ice Lake microarchitecture should help (the latter should not require Spectre mitigations that are as harsh, I believe).

56 physical cores

It is actually 28 cores and 56 threads, visible from the lscpu output.

sebastienros commented 4 months ago

It is actually 28 cores and 56 threads, visible from the lscpu output.

Right, my comment is wrong.

synopse commented 4 months ago

Even for a corporation, it is a pretty huge and unusual setup, especially the network part.

Only the SSD is a weird choice: a SATA drive for the database server? In 2024? Really?

NateBrady23 commented 4 months ago

Thanks for providing the update @sebastienros! Sorry this setup is taking so long. It's been a matter of getting parts ordered and having people in the office at the right time to work on it. @msmith-techempower is doing some work with this today and I'm in on Thursday.

msmith-techempower commented 3 months ago

Just as a general update - I am really trying to get these up and working, but the going is slow given that I am not an IT professional by trade 😅. I know everyone, myself included, is anxious to get the continuous runs back up as soon as possible, and I don't want anyone thinking we are sitting on our hands.

msmith-techempower commented 3 months ago

Another update - we have gotten the machines mostly spun up and verified (using iperf as a baseline) the 40Gbps connections over fiber. We are still trying to get each machine able to connect to the internet (which has been a slog, but I think the hardware for it should be arriving today), but once that is done we will start in on the software side of setup.
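
A baseline check of that sort looks roughly like the following with iperf3 (a sketch only: the exact tool invocation, stream count, and addresses used on the new machines weren't shared; 10.0.0.121 is simply borrowed from the lshw output above):

    # On the receiving machine, start an iperf3 server
    iperf3 -s

    # On the sending machine, target the receiver's direct-link address with
    # several parallel streams, since one TCP stream rarely fills a 40 Gbit/s link
    iperf3 -c 10.0.0.121 -P 8 -t 30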

Thank you to everyone for being so patient, but I am seeing light at the end of this tunnel and hope to have runs started back up soon.

Kaliumhexacyanoferrat commented 3 months ago

@NateBrady23 please run the first run with the new servers, with the last full run commit

I second this as I updated my benchmarks in the meantime and would love to see the impact independent from the hardware changes.

Looking forward to the new environment, keep up the good work!

mkvalor commented 3 months ago

I get that you guys are just about across the finish line. But I recommend updating the announcement banner at the top of https://tfb-status.techempower.com/ anyway. It's a one-liner in your website's HTML (aside from publishing the change). This will encourage thousands of your site's followers and, regardless, "better late than never".

NateBrady23 commented 3 months ago

@joanhey @Kaliumhexacyanoferrat Yes, the first real run from the new servers will be with the last full run's commit. Great idea.

Pinging @msmith-techempower ^

We got the "final" parts in on Friday evening at the office. Mike, give us hope for Monday or Tuesday! 🙏

msmith-techempower commented 3 months ago

Hardware install complete and "flash point" tested. Everything appears to be working correctly, and one of our major concerns appears to be okay (issue with power draw). Tomorrow, I'll be getting the software environments up and running and HOPEFULLY (not promising anything - yes, you Nate) get the parity commit run started. I am sure there will be more to fix/hone/etc. in the coming week or two, but we are slowly getting the new environment on its feet.

Again, thank you all for your continued patience!

sebastienros commented 3 months ago

What version of Ubuntu are you using? 24.04 is almost there...

February 29, 2024 – Feature Freeze
March 21, 2024 – User Interface Freeze
April 4, 2024 – Ubuntu 24.04 Beta
April 11, 2024 – Kernel Freeze
April 25, 2024 – Ubuntu 24.04 LTS Released

msmith-techempower commented 3 months ago

We have 22 atm, but it may end up prudent to move to 24 when it's released since it's LTS.

volyrique commented 3 months ago

Are you using the regular kernel or the Hardware Enablement (HWE) one, as I suggested here? Using the HWE kernel essentially eliminates the need to move to Ubuntu 24.04 (when it is out) until possibly early 2025 because it would be updated to the same release as the one that 24.04 is based on, and IMHO the differences due to other software components amount to a rounding error. The switch to the HWE is done with a simple command and a reboot.
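
For reference, the switch looks roughly like this on Ubuntu 22.04 (a sketch of the standard procedure; the package name matches the one msmith-techempower uses later in this thread):

    # Install the Hardware Enablement (HWE) kernel and reboot into it
    sudo apt install linux-generic-hwe-22.04
    sudo reboot

    # Afterwards, confirm the newer kernel is running
    uname -r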

msmith-techempower commented 3 months ago

HWE

msmith-techempower commented 3 months ago

HOWDY! Okay, I believe that we have a run started. So far, nothing seems out of the ordinary, so we will see how it plays out over the next few days.

In the meantime, please be aware that this is a first attempt, and there are sure to be issues that creep up. Please report those issues here, and we will trudge on!

Again, thank you for your continued patience!

joanhey commented 3 months ago

Same run with commit https://github.com/TechEmpower/FrameworkBenchmarks/tree/625684fcc442767af013de2dfd1fc90dd73f1744 That is the code and data in Round 22.

Old servers https://tfb-status.techempower.com/results/66d86090-b6d0-46b3-9752-5aa4913b2e33

New servers ~https://tfb-status.techempower.com/results/1aefa081-5641-4e7a-a712-e85c4bf3a4e1~ https://tfb-status.techempower.com/results/cdec9eaf-19ea-48d2-bfa4-df15afbe3236

joanhey commented 3 months ago

About the kernels: the latest Ubuntu 22.04.4 (February 2024) changed to kernel 6.5 (from 5.15) https://ubuntu.com/about/release-cycle#ubuntu-kernel-release-cycle We didn't notice this change !!

The new Ubuntu 24.04 comes with kernel 6.8. And the next Ubuntu 22.04.5 will also come with 6.8 (after 24.04).

Network-related: Linux 6.8 includes networking improvements that provide better cache efficiency. This is said to improve "TCP performances with many concurrent connections up to 40%" – a sizeable uplift, though to what degree most users will benefit is unclear.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3e7aeb78ab01

We want it, but we will check it !!

joanhey commented 3 months ago

The current run is stuck !!

synopse commented 3 months ago

Yes, the page has not been refreshed since yesterday: last updated 2024-03-27 at 4:02 PM https://tfb-status.techempower.com/

msmith-techempower commented 3 months ago

Confirmed - I am looking into it now. Appears to have been a thermal issue on the primary machine. About 4 hours (I think) into the run the machine shut itself down.

NateBrady23 commented 3 months ago

Ok things are back up and running and we're still monitoring.

Just so you guys know, all of us at TechEmpower get an email when the citrine environment stops getting updates. You don't have to add to the thread or open issues when it crashes; it may happen a few more times. But we appreciate everyone's enthusiasm!

msmith-techempower commented 3 months ago

OKAY.

Little update. TechEmpower is located in a small office and we do not have a dedicated server rack any longer - we bought a small rack that has insulation (it's very loud), but that resulted in the switch being too close to the app server... and it produces a TON of heat which, in turn, tripped the heat sensor on the intake of the machine, which fired off a safety shutdown.

I fiddled with a bunch of setups, but what seems to be working at the moment is having the switch powered down and plugging in the fiber directly. So, App is connected to Database on 10.0.0.x, and App is connected to Client on 10.0.1.x. I tested this setup with iperf as I did with the switch and saw no appreciable difference in throughput, so I am hoping this is a fair way to test. VERY OPEN TO COMMENT HERE!
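
For context, the addressing described above could be set up with something like the following; only the 10.0.0.x and 10.0.1.x subnets were stated, so the interface names and host addresses here are assumptions for illustration, not the actual configuration:

    # On the app server, assuming a dual-port ConnectX-6 (second port name hypothetical)
    sudo ip addr add 10.0.0.120/24 dev ens1f0np0   # direct link to the database server
    sudo ip addr add 10.0.1.120/24 dev ens1f1np1   # direct link to the client/load generator
    sudo ip link set ens1f0np0 up
    sudo ip link set ens1f1np1 up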

Anyway, the current run has benchmarked a couple, I am monitoring temperature (among other stats) while it is running, and hopefully we will be okay moving forward.

NateBrady23 commented 3 months ago

Have no fear, the continuous run is still going on and everything looks healthy! Just an issue with tfb-status receiving updates. Should be fixed shortly.

FYI: The parity run we're doing is with Round 22 https://tfb-status.techempower.com/results/66d86090-b6d0-46b3-9752-5aa4913b2e33

I'll be out early next week; when this run completes, it will automatically start a new run from the current state of the repo.

joanhey commented 3 months ago

Impressive numbers !! We'll need some time to analyze them.

I think it would be good to create a Round 22N, so regular visitors can see the difference. It will also make it easier to compare with Round 23.

volyrique commented 3 months ago

Yes, the numbers are very, very, very nice. libh2o is between 2 and 3 times faster in the tests that did not suffer from a network bandwidth bottleneck (i.e. everything except cached queries), which is more or less the expected number - we have 2 times the number of cores running at the same or slightly lower frequency and the rest of the difference could be explained by the microarchitectural improvements, the larger CPU caches, the faster and more numerous memory channels (I am assuming an 8 x 8 GB configuration), and last, but not least, the newer kernel release. The plaintext numbers seem to imply that the effect of the speculative execution vulnerability mitigations is not as bad as before because the gap between libh2o and faf decreased significantly, but that might be purely due to the kernel.

It seems that we still have the network bandwidth bottleneck for the cached queries and the plaintext tests, though in the former case only one implementation, fiber prefork, so far scaled perfectly, so it is probably not much of a problem.

I fiddled with a bunch of setups, but what seems to be working at the moment is having the switch powered down and plugging in the fiber directly. So, App is connected to Database on 10.0.0.x, and App is connected to Client on 10.0.1.x. I tested this setup with iperf as I did with the switch and saw no appreciable difference in throughput, so I am hoping this is a fair way to test. VERY OPEN TO COMMENT HERE!

I am assuming that the network adapter on the application server is dual-ported, in which case wouldn't this be a superior configuration? If the machine is connected to a switch via a single port, then the traffic both from the load generator and the database would pass through the same link, so there might be some interference, while in the current configuration everything would be nicely isolated.

mkvalor commented 3 months ago

@sebastienros Thanks for clarifying the number of physical cores later in the thread. Would you be willing to re-edit the 6th comment here, with the specs, so the top text does not continue to say, "56 physical cores, 1 socket, 1 NUMA, 64 GB RAM"? I fear some who read this will view that 'headline' and perhaps miss the later clarification.

synopse commented 3 months ago

The run did fail, and is aborting:

791/791 frameworks tested (last was zysocket-v)
398 frameworks started and stopped cleanly
393 frameworks had problems starting or stopping

Some details:

    "martian": "20240331220730",
    "martini": "error during test: [Errno 28] No space left on device",
    "may-minihttp": "ERROR: Problem starting may-minihttp",
    "microdot": "ERROR: Problem starting microdot",
    "microdot-async": "ERROR: Problem starting microdot-async",
    "microdot-async-raw": "ERROR: Problem starting microdot-async-raw",
    "microdot-raw": "ERROR: Problem starting microdot-raw",
    "microhttp": "ERROR: Problem starting microhttp",
    "microhttp-mysql": "ERROR: Problem starting microhttp-mysql",
    "micronaut": "ERROR: Problem starting micronaut",
    "micronaut-data-jdbc": "ERROR: Problem starting micronaut-data-jdbc",
    "micronaut-data-jdbc-graalvm": "ERROR: Problem starting micronaut-data-jdbc-graalvm",
    "micronaut-data-mongodb": "ERROR: Problem starting micronaut-data-mongodb",
... and all following frameworks were abandoned.

Too much Martini, perhaps, or shaken when it should not have been, according to agent 007. "martini": "error during test: [Errno 28] No space left on device" I guess something like a wrong partition (e.g. a small /root) was used for the log storage.
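
A quick, generic way to confirm that theory would be to check which mount point actually filled up and what is consuming it (a sketch, not the commands actually run on the servers; the results path is an assumption):

    # Show free space per mount point to see which partition hit 100%
    df -h

    # Find the largest directories under the toolset's results/log location
    sudo du -h --max-depth=2 ~/FrameworkBenchmarks/results | sort -h | tail -n 20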

About the hardware and storage (it has nothing to do with the problem): I wonder why these servers have huge CPU, RAM, and network capacity, but a slow SATA drive. At least for the DB, the number of I/Os does make a difference.

joanhey commented 3 months ago

@synopse the database data in this benchmark is very small and will always fit in memory. And that's correct for a framework benchmark: we don't want to benchmark the disk of the database server.

I already have new database configs ready for this big server, but I'll submit them after the next run in which all the databases update their versions, to isolate the numbers for the new versions from those for the new configs.

@volyrique the vulnerability mitigations are still a big performance problem, and the kernel helps less than the new CPU. The new CPU is not affected by Meltdown, Retbleed, etc., so it doesn't need those vulnerability mitigations.

Vulnerabilities:
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Mitigation; Clear CPU buffers; SMT vulnerable
  Retbleed:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                 Not affected
  Tsx async abort:       Not affected

IMO the only solution is to replace the CPUs that have vulnerabilities, to get good performance again.

itrofimow commented 3 months ago

@joanhey I'm pretty sure that updates generate a significant load on the disk, even with a minimal WAL level.

Should we just create the World table as UNLOGGED?
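
For reference, the change itself would be a one-liner in PostgreSQL; a sketch only, with the connection details and identifiers below assumed rather than taken from the actual toolset:

    # UNLOGGED tables skip WAL writes, trading crash safety for less disk I/O
    psql -h tfb-database -U benchmarkdbuser -d hello_world -c "ALTER TABLE world SET UNLOGGED;"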

joanhey commented 3 months ago

I think that the database discussion is for another Issue. But first we need to wait for the next database versions and configs.

franz1981 commented 3 months ago

The run seems stuck... I would like to check the failures for Netty/Vert.x and Quarkus (of which I am a developer), because in our CI tests we didn't see anything similar...

Related to this being a NUMA CPU: I have to double-check, but I think it is a kind of NUMA arch, or rather, there is no partitioning of memory, but (last-level) cache accesses have heterogeneous costs. On my local machine (Ryzen 7950X) I had to enable it... more info on it at https://www.reddit.com/r/Amd/comments/ce6pj9/ccd_equivalent_to_numa_in_functionality/
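
For anyone who wants to check, the topology the kernel sees can be inspected with generic commands like these (nothing specific to the TFB setup; the numactl package may need installing first):

    # NUMA nodes and per-node memory as seen by the kernel
    numactl --hardware

    # Which core, socket, and NUMA node each logical CPU belongs to
    lscpu -e=CPU,CORE,SOCKET,NODE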

joanhey commented 3 months ago

@franz1981 after the Martini framework ("martini": "error during test: [Errno 28] No space left on device"), all the following frameworks failed, so there is no need to check those failures.

synopse commented 3 months ago

@synopse the database data in this benchmark is very small and will always fit in memory. And that's correct for a framework benchmark: we don't want to benchmark the disk of the database server.

@joanhey In production (and we would like to reproduce the production state, right?) we should enable fsync on PostgreSQL (https://postgresqlco.nf/doc/en/param/fsync/), so writes wait for the data to actually be stored on disk, not just changed in memory. Even if the data is small enough to fit in memory, it is still written to disk, and we would need to wait for fsync. Here a fast NVMe SSD makes a difference compared to the SATA SSD in this setup. We could expect better updates performance with new hardware: updates are somewhat slow compared to the other tests in the current run.
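
For what it's worth, the current settings can be checked directly on the database server; a sketch with assumed connection details:

    # fsync controls whether commits wait for data to reach stable storage
    psql -h tfb-database -U benchmarkdbuser -d hello_world -c "SHOW fsync;" -c "SHOW synchronous_commit;"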

Anyway, we have to make it pass and run all tests, before trying to maximize the hardware.

volyrique commented 3 months ago

We could expect better updates performance with new hardware: updates are somewhat slow compared to the other tests in the current run.

Are they really? The speedup in the database updates test is in line with the one in the multiple queries test (i.e. 2-3 times faster) - just check axum [postgresql], h2o, and just-js (I am looking at the fastest results because they are the least likely to have another scalability bottleneck on the software side in the framework implementation). A fast NVMe SSD exposes more parallelism than a SATA one, but I don't think that the tests have a level of concurrency that is affected enough by this potential bottleneck; neither do I think that we are bandwidth-limited.

Obviously, we can't expect the database updates test to have the same performance as the multiple queries one - it must be slower.

joanhey commented 3 months ago

I say again:

I think that the database discussion is for another Issue. But first we need to wait for the next database versions and configs.

PS: open new issues to discuss it !! The benchmark has a life of its own, and it will never, never be good for everyone. But all frameworks play by the same rules (servers, configs, ...)

NateBrady23 commented 3 months ago

Sorry folks. This was a partitioning mistake. It's been fixed and we've restarted the Round 22 parity run.

msmith-techempower commented 2 months ago

Howdy!

The latest run completed successfully (and didn't run out of disk space this time >_<) and can be inspected here. This round was run against the same commit as Round 22, but with Ubuntu 22, the HWE kernel, and the direct fiber networking.

It looks like everything is operating smoothly. Please feel free to report if you notice anything out of the ordinary or have questions. The next continuous run uses the latest pull from GitHub, so it will include everything merged in as of this morning.

I THINK we are about ready to close this ticket, but I will leave it open for a bit longer while this next run is going.

Thanks again for the ongoing support and patience.

volyrique commented 2 months ago

@msmith-techempower I have just one comment - h2o reported the kernel version as Linux 5.15.0-101-generic #111-Ubuntu SMP Tue Mar 5 20:16:58 UTC 2024, so it looks like the HWE kernel was not running for some reason. Of course, that makes the comparison with the previous hardware even more precise 😄.

The only weirdness I have noticed in the results is the fiber-prefork result in the cached queries test, but it doesn't seem to imply any kind of issue with the benchmarking environment setup, so I wouldn't comment on it any further.

msmith-techempower commented 2 months ago

@volyrique I believe I may have jumped the gun on this one. I thought that I had installed HWE initially, but then wanted to double-check, so I stopped the current run and installed it as recommended via sudo apt install linux-generic-hwe-22.04 on all the machines. Now, uname -a says Linux tfb-server 6.5.0-27-generic #28~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 15 10:51:06 UTC 2 x86_64 x86_64 x86_64 GNU/Linux which I think is right. I'm kicking off another run now, but let me know if there was something else we were expecting.

p8 commented 2 months ago

Hi. It seems the dashboard is currently stuck. It hasn't updated in almost a day.

msmith-techempower commented 2 months ago

@p8 Yeah, I'm troubleshooting this... we're experiencing thermal issues again and the server decided to power itself down late yesterday. Honestly, we have these in a small rack that has airflow problems, and it seems like these new machines have lower heat tolerances than the previous ones. Weighing our options, but it's hard to say when we will get consistent continuous runs in the short term.

p8 commented 2 months ago

Thanks @msmith-techempower !

joanhey commented 2 months ago

In the meantime: where are the cloud benchmarks ??

We don't need a continuous run, just one every half year or quarter !!! Many frameworks only optimize for these big enterprise servers, but the majority of users use more moderate servers, like in the cloud benchmarks !!

NateBrady23 commented 2 months ago

We do not have credits/funding for cloud benchmarks, nor do we have the infrastructure set up.

If someone wants to support that, including the time to maintain, we'd absolutely be open to having that discussion.