Closed tissatussa closed 2 years ago
Sorry it's taken me so long to get to this.
As you've seen, your CPU is modern enough to run any arch that I support with make release. All versions of Halogen behave the same, but they use different CPU instructions, so some may run faster on certain hardware. You can run each one and give Halogen the command bench to compare the speed (nps). Then you can use whichever is fastest. Simply running make will likely produce the fastest compile.
That being said, the speed difference is likely to only be +/- 10% and is not significant enough to affect playing strength much so I wouldn't worry too much.
I hope you continue to enjoy using Halogen.
thanks for this explanation; it's clear, but not fully ..
> That being said, the speed difference is likely to only be +/- 10% and is not significant enough to affect playing strength much so I wouldn't worry too much.
when different binary versions exist, the fastest is the best, so i wanted to know how to distinguish between them.
> Simply running make will likely produce the fastest compile.
Likely? Could your make script compare speeds with bench and so determine which binary type is optimal for the user's CPU?
The way to tell which exe is the fastest is to run each one and compare the 'nodes per second' (nps) values. One way to do this is to give the engine the command bench and observe the final nps value. Please let me know if you have any issues in doing this. It's a quick process and I don't think a script is necessary.
i compiled your latest v10.20.5 (source is git clone) and ran bench on all of them:
Halogen-x64-pext-avx2
27738055 nodes 1351000 nps
Halogen-x64-popcnt-avx2
27738055 nodes 1339000 nps
Halogen-x64-pext
27738055 nodes 1290000 nps
Halogen-x64-popcnt
27738055 nodes 1273000 nps
Halogen-x64-nopopcnt
27738055 nodes 1271000 nps
in the makefile i adjusted the output file names to distinguish them: '-default.exe' and '-pgo.exe' are added:
default:
	$(CC) $(CFLAGS) $(SRC) $(LIBS) $(POPCNTFLAGS) -o $(EXE)-default.exe
pgo:
	rm -f *.gcda
	$(CC) -fprofile-generate $(PGOFLAGS) $(SRC) $(LIBS) $(POPCNTFLAGS) -o $(EXE)-pgo.exe
	./$(EXE)-pgo.exe bench 12
	$(CC) -fprofile-use $(PGOFLAGS) $(SRC) $(LIBS) $(POPCNTFLAGS) -o $(EXE)-pgo.exe
	rm -f *.gcda
then bench gives this result:
Halogen-default.exe
27738055 nodes 2200000 nps
Halogen-pgo.exe
27738055 nodes 2336000 nps
so, the 5 release versions are considerably slower than the other 2 !?
the pgo binary is the fastest on my CPU.
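To put numbers on "considerably slower", the short sketch below expresses each build's nps as a percentage of the fastest one. The nps values are the ones posted above; the function name relative_speed is only illustrative.

```python
# nps values as posted in this thread for the i5-7200U laptop.
results = {
    "Halogen-x64-pext-avx2": 1351000,
    "Halogen-x64-popcnt-avx2": 1339000,
    "Halogen-x64-pext": 1290000,
    "Halogen-x64-popcnt": 1273000,
    "Halogen-x64-nopopcnt": 1271000,
    "Halogen-default.exe": 2200000,
    "Halogen-pgo.exe": 2336000,
}


def relative_speed(results):
    """Map each build to its speed as a percentage of the fastest build."""
    best = max(results.values())
    return {name: 100.0 * nps / best for name, nps in results.items()}


# Print the builds from fastest to slowest with their relative speed.
ranking = sorted(relative_speed(results).items(), key=lambda kv: kv[1], reverse=True)
for name, pct in ranking:
    print(f"{name:26s} {results[name]:>8d} nps  {pct:5.1f}%")
```

On these figures the five release builds sit within roughly 6% of each other, while the best of them reaches only about 58% of the pgo build's speed.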
I'm surprised the release versions are much slower than the other two, usually the difference is not that great. Will investigate further. In the meantime enjoy using Halogen.
tnx .. glad to help .. i'm also surprised by the nps results .. my new laptop can do avx2 and bmi2 (i don't know what this really means), and i read those archs should be fastest ..
maybe the real speed of avx2 / bmi2 also depends on the type of C functions you use to get optimal performance? I'm not into that.
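As background, avx2, bmi2 and popcnt are names of optional x86 instruction-set extensions, and on Linux the kernel lists the ones a CPU supports on the flags line of /proc/cpuinfo. The sketch below reads that line and suggests a matching release arch. The mapping from flags to arch names is an assumption based only on the binary names in this thread (PEXT is a BMI2 instruction), not on Halogen's Makefile, and read_cpu_flags and suggest_arch are hypothetical helpers.

```python
def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the set of CPU feature flags from the first 'flags' line (Linux only)."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()


def suggest_arch(flags):
    """Pick the most specific arch whose assumed requirements are all present."""
    # Assumed requirements per arch name; ordered from most to least specific.
    candidates = [
        ("x64-pext-avx2", {"bmi2", "avx2"}),
        ("x64-popcnt-avx2", {"popcnt", "avx2"}),
        ("x64-pext", {"bmi2"}),
        ("x64-popcnt", {"popcnt"}),
    ]
    for arch, required in candidates:
        if required <= set(flags):
            return arch
    return "x64-nopopcnt"
```

For example, suggest_arch(read_cpu_flags()) names the most feature-rich arch the CPU can run. That is only a starting point: supporting an instruction does not guarantee the build using it is the fastest, so the bench comparison remains the deciding test.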
Results on my computer:
Halogen-default.exe
27738055 nodes 3468000 nps
Halogen-pgo.exe
27738055 nodes 3679000 nps
Halogen-x64-pext-avx2
27738055 nodes 2760000 nps
Halogen-x64-popcnt-avx2
27738055 nodes 3307000 nps
Halogen-x64-pext
27738055 nodes 2544000 nps
Halogen-x64-popcnt
27738055 nodes 2883000 nps
Halogen-x64-nopopcnt
27738055 nodes 2940000 nps
I'm not surprised the default and pgo compiles are the fastest. Those use a compiler flag that picks the best instructions for your CPU, so they are likely using avx2 and bmi as well as others. What version of gcc are you using?
$ gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Ok, that's relatively modern. I don't know why the static (make release) compiles are so much slower for you than the native compiles.
this log may help you:
$ sudo lshw
description: Tablet
product: HP Elite x2 1012 G2 (1LV39EA#ABH)
vendor: HP
serial: 5CG9194XS2
width: 64 bits
capabilities: smbios-3.0.0 dmi-3.0.0 smp vsyscall32
configuration: administrator_password=disabled boot=normal chassis=tablet family=103C_5336AN HP Elite x2 frontpanel_password=disabled keyboard_password=disabled power-on_password=disabled sku=1LV39EA#ABH uuid=4ED672E1-7A16-BB2A-746F-7EABCF85304C
*-core
description: Motherboard
product: 82CA
vendor: HP
physical id: 0
version: KBC Version 50.69
serial: PGHRA00WBC9054
*-memory
description: System Memory
physical id: 0
slot: System board or motherboard
size: 8GiB
*-bank:0
description: Row of chips LPDDR3 Synchronous Unbuffered (Unregistered) 1867 MHz (0,5 ns)
product: 825632-382
vendor: Hynix/Hyundai
physical id: 0
serial: 00000000
slot: Bottom-OnBoard 1
size: 4GiB
width: 64 bits
clock: 1867MHz (0.5ns)
*-bank:1
description: Row of chips LPDDR3 Synchronous Unbuffered (Unregistered) 1867 MHz (0,5 ns)
product: 825632-382
vendor: Hynix/Hyundai
physical id: 1
serial: 00000000
slot: Bottom-OnBoard 2
size: 4GiB
width: 64 bits
clock: 1867MHz (0.5ns)
*-firmware
description: BIOS
vendor: HP
physical id: 4
version: P87 Ver. 01.25
date: 01/06/2019
size: 64KiB
capabilities: pci pcmcia upgrade shadowing cdboot bootselect edd int5printscreen int9keyboard int14serial int17printer acpi usb smartbattery biosbootspecification netboot uefi
*-cache:0
description: L1 cache
physical id: a
slot: L1 Cache
size: 128KiB
capacity: 128KiB
capabilities: synchronous internal write-back unified
configuration: level=1
*-cache:1
description: L2 cache
physical id: b
slot: L2 Cache
size: 512KiB
capacity: 512KiB
capabilities: synchronous internal write-back unified
configuration: level=2
*-cache:2
description: L3 cache
physical id: c
slot: L3 Cache
size: 3MiB
capacity: 3MiB
capabilities: synchronous internal write-back unified
configuration: level=3
*-cpu
description: CPU
product: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
vendor: Intel Corp.
physical id: d
bus info: cpu@0
version: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
serial: To Be Filled By O.E.M.
slot: U3E1
size: 2988MHz
capacity: 3100MHz
width: 64 bits
clock: 100MHz
capabilities: lm fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp x86-64 constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d cpufreq
configuration: cores=2 enabledcores=2 threads=4
*-pci
description: Host bridge
product: Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
vendor: Intel Corporation
physical id: 100
bus info: pci@0000:00:00.0
version: 02
width: 32 bits
clock: 33MHz
configuration: driver=skl_uncore
resources: irq:0
*-display
description: VGA compatible controller
product: HD Graphics 620
vendor: Intel Corporation
physical id: 2
bus info: pci@0000:00:02.0
logical name: /dev/fb0
version: 02
width: 64 bits
clock: 33MHz
capabilities: pciexpress msi pm vga_controller bus_master cap_list rom fb
configuration: depth=32 driver=i915 latency=0 mode=2736x1824 visual=truecolor xres=2736 yres=1824
resources: iomemory:1f0-1ef iomemory:1f0-1ef irq:150 memory:1ff2000000-1ff2ffffff memory:1fc0000000-1fcfffffff ioport:3000(size=64) memory:c0000-dffff
*-generic:0
description: Signal processing controller
product: Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem
vendor: Intel Corporation
physical id: 4
bus info: pci@0000:00:04.0
version: 02
width: 64 bits
clock: 33MHz
capabilities: msi pm cap_list
configuration: driver=proc_thermal latency=0
resources: iomemory:1f0-1ef irq:16 memory:1ff3420000-1ff3427fff
*-multimedia:0 UNCLAIMED
description: Multimedia controller
product: Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Imaging Unit
vendor: Intel Corporation
physical id: 5
bus info: pci@0000:00:05.0
version: 01
width: 64 bits
clock: 33MHz
capabilities: msi pm cap_list
configuration: latency=0
resources: iomemory:1f0-1ef memory:1ff3000000-1ff33fffff
*-generic:1
description: Non-VGA unclassified device
product: Sunrise Point-LP Integrated Sensor Hub
vendor: Intel Corporation
physical id: 13
bus info: pci@0000:00:13.0
version: 21
width: 64 bits
clock: 33MHz
capabilities: pm bus_master cap_list
configuration: driver=intel_ish_ipc latency=0
resources: iomemory:1f0-1ef irq:20 memory:1ff3432000-1ff3432fff
*-usb
description: USB controller
product: Sunrise Point-LP USB 3.0 xHCI Controller
vendor: Intel Corporation
physical id: 14
bus info: pci@0000:00:14.0
version: 21
width: 64 bits
clock: 33MHz
capabilities: pm msi xhci bus_master cap_list
configuration: driver=xhci_hcd latency=0
resources: irq:124 memory:d6400000-d640ffff
*-usbhost:0
product: xHCI Host Controller
vendor: Linux 5.4.0-89-generic xhci-hcd
physical id: 0
bus info: usb@1
logical name: usb1
version: 5.04
capabilities: usb-2.00
configuration: driver=hub slots=12 speed=480Mbit/s
*-usb:0
description: USB hub
product: USB 2.0 Hub
vendor: Terminus Technology Inc.
physical id: 1
bus info: usb@1:1
version: 1.11
capabilities: usb-2.00
configuration: driver=hub maxpower=100mA slots=4 speed=480Mbit/s
*-usb
description: Mouse
product: USB Optical Mouse
vendor: PixArt
physical id: 3
bus info: usb@1:1.3
version: 1.00
capabilities: usb-1.10
configuration: driver=usbhid maxpower=100mA speed=1Mbit/s
*-usb:1
description: USB hub
vendor: Microchip Technology, Inc. (formerly SMSC)
physical id: 3
bus info: usb@1:3
version: 0.00
capabilities: usb-2.00
configuration: driver=hub maxpower=2mA slots=1 speed=480Mbit/s
*-usb
description: Mouse
product: Alps Touchpad
vendor: Alps
physical id: 1
bus info: usb@1:3.1
version: 6.10
capabilities: usb-2.00
configuration: driver=usbhid maxpower=30mA speed=12Mbit/s
*-usb:2
description: Communication device
product: HP lt4132 LTE/HSPA+ 4G Module
vendor: HP Inc.
physical id: 5
bus info: usb@1:5
version: 1.02
serial: 0123456789ABCDEF
capabilities: usb-2.00 ethernet
configuration: driver=option maxpower=2mA speed=480Mbit/s
*-usb:3 UNCLAIMED
description: Generic USB device
vendor: Validity Sensors, Inc.
physical id: 6
bus info: usb@1:6
version: 1.64
serial: bc2ab486464c
capabilities: usb-2.00
configuration: maxpower=100mA speed=12Mbit/s
*-usb:4
description: Bluetooth wireless interface
vendor: Intel Corp.
physical id: 9
bus info: usb@1:9
version: 0.10
capabilities: bluetooth usb-2.00
configuration: driver=btusb maxpower=100mA speed=12Mbit/s
*-usbhost:1
product: xHCI Host Controller
vendor: Linux 5.4.0-89-generic xhci-hcd
physical id: 1
bus info: usb@2
logical name: usb2
version: 5.04
capabilities: usb-3.00
configuration: driver=hub slots=6 speed=5000Mbit/s
*-generic:2
description: Signal processing controller
product: Sunrise Point-LP Thermal subsystem
vendor: Intel Corporation
physical id: 14.2
bus info: pci@0000:00:14.2
version: 21
width: 64 bits
clock: 33MHz
capabilities: pm msi cap_list
configuration: driver=intel_pch_thermal latency=0
resources: iomemory:1f0-1ef irq:18 memory:1ff3431000-1ff3431fff
*-multimedia:1
description: Multimedia controller
product: Intel Corporation
vendor: Intel Corporation
physical id: 14.3
bus info: pci@0000:00:14.3
version: 01
width: 64 bits
clock: 33MHz
capabilities: msi pm bus_master cap_list
configuration: driver=ipu3-cio2 latency=64
resources: iomemory:1f0-1ef irq:146 memory:1ff3410000-1ff341ffff
*-generic:3
description: Signal processing controller
product: Sunrise Point-LP Serial IO I2C Controller #0
vendor: Intel Corporation
physical id: 15
bus info: pci@0000:00:15.0
version: 21
width: 64 bits
clock: 33MHz
capabilities: pm bus_master cap_list
configuration: driver=intel-lpss latency=0
resources: iomemory:1f0-1ef irq:16 memory:1ff3430000-1ff3430fff
*-generic:4
description: Signal processing controller
product: Sunrise Point-LP Serial IO I2C Controller #2
vendor: Intel Corporation
physical id: 15.2
bus info: pci@0000:00:15.2
version: 21
width: 64 bits
clock: 33MHz
capabilities: pm bus_master cap_list
configuration: driver=intel-lpss latency=0
resources: iomemory:1f0-1ef irq:18 memory:1ff342f000-1ff342ffff
*-generic:5
description: Signal processing controller
product: Sunrise Point-LP Serial IO I2C Controller #3
vendor: Intel Corporation
physical id: 15.3
bus info: pci@0000:00:15.3
version: 21
width: 64 bits
clock: 33MHz
capabilities: pm bus_master cap_list
configuration: driver=intel-lpss latency=0
resources: iomemory:1f0-1ef irq:19 memory:1ff342e000-1ff342efff
*-communication
description: Communication controller
product: Sunrise Point-LP CSME HECI #1
vendor: Intel Corporation
physical id: 16
bus info: pci@0000:00:16.0
version: 21
width: 64 bits
clock: 33MHz
capabilities: pm msi bus_master cap_list
configuration: driver=mei_me latency=0
resources: iomemory:1f0-1ef irq:149 memory:1ff342d000-1ff342dfff
*-pci:0
description: PCI bridge
product: Sunrise Point-LP PCI Express Root Port #1
vendor: Intel Corporation
physical id: 1c
bus info: pci@0000:00:1c.0
version: f1
width: 32 bits
clock: 33MHz
capabilities: pci pciexpress msi pm normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:120 ioport:4000(size=4096) memory:c0000000-d60fffff ioport:1fd0000000(size=570425344)
*-pci:1
description: PCI bridge
product: Sunrise Point-LP PCI Express Root Port #5
vendor: Intel Corporation
physical id: 1c.4
bus info: pci@0000:00:1c.4
version: f1
width: 32 bits
clock: 33MHz
capabilities: pci pciexpress msi pm normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:121 memory:d6300000-d63fffff
*-network
description: Wireless interface
product: Wireless 8265 / 8275
vendor: Intel Corporation
physical id: 0
bus info: pci@0000:3a:00.0
logical name: wlp58s0
version: 78
serial: 04:ea:56:eb:ef:33
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress bus_master cap_list ethernet physical wireless
configuration: broadcast=yes driver=iwlwifi driverversion=5.4.0-89-generic firmware=36.77d01142.0 ip=192.168.2.3 latency=0 link=yes multicast=yes wireless=IEEE 802.11
resources: irq:151 memory:d6300000-d6301fff
*-pci:2
description: PCI bridge
product: Sunrise Point-LP PCI Express Root Port #6
vendor: Intel Corporation
physical id: 1c.5
bus info: pci@0000:00:1c.5
version: f1
width: 32 bits
clock: 33MHz
capabilities: pci pciexpress msi pm normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:122 ioport:5000(size=4096) memory:d6200000-d62fffff ioport:1c00000000(size=2097152)
*-generic
description: Unassigned class
product: RTS522A PCI Express Card Reader
vendor: Realtek Semiconductor Co., Ltd.
physical id: 0
bus info: pci@0000:3b:00.0
version: 01
width: 32 bits
clock: 33MHz
capabilities: pm msi pciexpress bus_master cap_list
configuration: driver=rtsx_pci latency=0
resources: irq:125 memory:d6200000-d6200fff
*-pci:3
description: PCI bridge
product: Sunrise Point-LP PCI Express Root Port #9
vendor: Intel Corporation
physical id: 1d
bus info: pci@0000:00:1d.0
version: f1
width: 32 bits
clock: 33MHz
capabilities: pci pciexpress msi pm normal_decode bus_master cap_list
configuration: driver=pcieport
resources: irq:123 memory:d6100000-d61fffff
*-storage
description: Non-Volatile memory controller
product: NVMe SSD Controller SM981/PM981/PM983
vendor: Samsung Electronics Co Ltd
physical id: 0
bus info: pci@0000:3c:00.0
version: 00
width: 64 bits
clock: 33MHz
capabilities: storage pm msi pciexpress msix nvm_express bus_master cap_list
configuration: driver=nvme latency=0
resources: irq:16 memory:d6100000-d6103fff
*-nvme0
description: NVMe device
product: SAMSUNG MZVLB256HAHQ-000H1
physical id: 0
logical name: /dev/nvme0
version: EXD70H1Q
serial: S425NE0M448669
configuration: nqn=nqn.2014.08.org.nvmexpress:144d144dS425NE0M448669 SAMSUNG MZVLB256HAHQ-000H1 state=live
*-namespace
description: NVMe namespace
physical id: 1
logical name: /dev/nvme0n1
size: 238GiB (256GB)
capabilities: gpt-1.00 partitioned partitioned:gpt
configuration: guid=9fe285cd-2cf0-4f6a-befc-697a1b97ce04 logicalsectorsize=512 sectorsize=512
*-volume:0
description: Windows FAT volume
vendor: mkfs.fat
physical id: 1
logical name: /dev/nvme0n1p1
logical name: /boot/efi
version: FAT32
serial: 794b-bf8a
size: 510MiB
capacity: 511MiB
capabilities: boot fat initialized
configuration: FATs=2 filesystem=fat mount.fstype=vfat mount.options=rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro name=EFI System Partition state=mounted
*-volume:1
description: EXT4 volume
vendor: Linux
physical id: 2
logical name: /dev/nvme0n1p2
logical name: /
version: 1.0
serial: 5eadddb1-8a98-4ce9-8ec3-7379ad33ffd6
size: 237GiB
capabilities: journaled extended_attributes large_files huge_files dir_nlink recover 64bit extents ext4 ext2 initialized
configuration: created=2021-09-01 23:58:44 filesystem=ext4 lastmountpoint=/ modified=2021-10-31 02:57:46 mount.fstype=ext4 mount.options=rw,relatime,errors=remount-ro mounted=2021-10-31 02:57:48 state=mounted
*-isa
description: ISA bridge
product: Sunrise Point LPC Controller/eSPI Controller
vendor: Intel Corporation
physical id: 1f
bus info: pci@0000:00:1f.0
version: 21
width: 32 bits
clock: 33MHz
capabilities: isa bus_master
configuration: latency=0
*-memory UNCLAIMED
description: Memory controller
product: Sunrise Point-LP PMC
vendor: Intel Corporation
physical id: 1f.2
bus info: pci@0000:00:1f.2
version: 21
width: 32 bits
clock: 33MHz (30.3ns)
configuration: latency=0
resources: memory:d6410000-d6413fff
*-multimedia:2
description: Audio device
product: Sunrise Point-LP HD Audio
vendor: Intel Corporation
physical id: 1f.3
bus info: pci@0000:00:1f.3
version: 21
width: 64 bits
clock: 33MHz
capabilities: pm msi bus_master cap_list
configuration: driver=snd_hda_intel latency=64
resources: iomemory:1f0-1ef iomemory:1f0-1ef irq:152 memory:1ff3428000-1ff342bfff memory:1ff3400000-1ff340ffff
*-serial
description: SMBus
product: Sunrise Point-LP SMBus
vendor: Intel Corporation
physical id: 1f.4
bus info: pci@0000:00:1f.4
version: 21
width: 64 bits
clock: 33MHz
configuration: driver=i801_smbus latency=0
resources: iomemory:1f0-1ef irq:16 memory:1ff342c000-1ff342c0ff ioport:efa0(size=32)
*-pnp00:00
product: PnP device PNP0c02
physical id: 1
capabilities: pnp
configuration: driver=system
*-pnp00:01
product: PnP device PNP0c02
physical id: 2
capabilities: pnp
configuration: driver=system
*-pnp00:02
product: PnP device PNP0c02
physical id: 3
capabilities: pnp
configuration: driver=system
*-pnp00:03
product: PnP device PNP0b00
physical id: 5
capabilities: pnp
configuration: driver=rtc_cmos
*-pnp00:04
product: PnP device INT3f0d
physical id: 6
capabilities: pnp
configuration: driver=system
*-pnp00:05
product: PnP device HPQ8002
physical id: 7
capabilities: pnp
configuration: driver=i8042 kbd
*-pnp00:06
product: PnP device ALP0110
physical id: 8
capabilities: pnp
configuration: driver=i8042 aux
*-pnp00:07
product: PnP device PNP0c02
physical id: 9
capabilities: pnp
configuration: driver=system
*-pnp00:08
product: PnP device PNP0c02
physical id: e
capabilities: pnp
configuration: driver=system
*-pnp00:09
product: PnP device PNP0c02
physical id: f
capabilities: pnp
configuration: driver=system
*-pnp00:0a
product: PnP device PNP0c02
physical id: 10
capabilities: pnp
configuration: driver=system
*-battery
product: JI04047XL
vendor: 333-2C-0E-A
physical id: 1
slot: Primary
capacity: 47040mWh
configuration: voltage=7,7V
*-network
description: Ethernet interface
physical id: 2
bus info: usb@1:5
logical name: enx021e101f0000
serial: 02:1e:10:1f:00:00
capabilities: ethernet physical
configuration: broadcast=yes driver=cdc_ether driverversion=22-Aug-2005 firmware=CDC Ethernet Device link=no multicast=yes
one extra piece of info: the file sizes of these binaries also differ a lot:
Closing this for now as we determined that make pgo was the best option.
i just compiled your newest Halogen v10.18.1, but the make process is not clear to me. your readme does not say anything about the make process, so i guess just 'make' should do. indeed, it gives me a working binary of 1.7 MB. however, it shows 'POPCNT', while my notebook can do BMI2 and AVX2, which i prefer. the Makefile shows bmi and avx are supported when doing 'make release': it gives 5 binaries for 5 different ARCHs, which all work, but they have the extension '.exe', which points to a Windows executable, yet these binaries DO run on Linux !? I tried 'make help' to see your compilation options / arguments, but none are shown. why is your Makefile not documented?