templexxx / tsc

Get unix time (nanoseconds) in 8ns, 10x faster than stdlib
MIT License
136 stars, 5 forks

Why is tsc slower on Linux? #6

Closed rfyiamcool closed 3 years ago

rfyiamcool commented 3 years ago

uname

$ uname -a
Linux shark-master-hw 3.10.0-957.5.1.el7.x86_64 #1 SMP Fri Feb 1 14:54:57 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

cpu

$ cat /proc/cpuinfo |grep name
model name  : Intel(R) Xeon(R) Gold 6151 CPU @ 3.00GHz
model name  : Intel(R) Xeon(R) Gold 6151 CPU @ 3.00GHz
model name  : Intel(R) Xeon(R) Gold 6151 CPU @ 3.00GHz
model name  : Intel(R) Xeon(R) Gold 6151 CPU @ 3.00GHz
model name  : Intel(R) Xeon(R) Gold 6151 CPU @ 3.00GHz
model name  : Intel(R) Xeon(R) Gold 6151 CPU @ 3.00GHz
model name  : Intel(R) Xeon(R) Gold 6151 CPU @ 3.00GHz
model name  : Intel(R) Xeon(R) Gold 6151 CPU @ 3.00GHz

bench

goos: linux
goarch: amd64
pkg: k
BenchmarkGenStdTimeStamp-8      19682988            60.7 ns/op
BenchmarkGenTsc-8           18210938            63.6 ns/op
PASS
ok      k   3.155s

test file

package k

import (
        "testing"
        "time"

        "github.com/templexxx/tsc"
)

// BenchmarkGenTimeStamp measures the standard library path.
func BenchmarkGenTimeStamp(b *testing.B) {
        for i := 0; i < b.N; i++ {
                time.Now().UnixNano()
        }
}

// BenchmarkGenTsc measures tsc.UnixNano for comparison.
func BenchmarkGenTsc(b *testing.B) {
        for i := 0; i < b.N; i++ {
                tsc.UnixNano()
        }
}

templexxx commented 3 years ago

Could you print tsc.Enabled?

If the package cannot get the TSC frequency, it just wraps time.Now().UnixNano().
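
To illustrate the fallback described above, here is a minimal sketch of the pattern, not the library's actual code. The names `pickClock` and `stdUnixNano` are hypothetical, and a frequency of 0 stands in for "frequency could not be determined".

```go
package main

import (
	"fmt"
	"time"
)

// stdUnixNano is the slow but always-available path.
func stdUnixNano() int64 { return time.Now().UnixNano() }

// pickClock is a hypothetical helper. It returns the clock function to use
// and whether the fast path is enabled. freqHz == 0 means the TSC frequency
// could not be determined, so the standard library clock is used instead.
func pickClock(freqHz uint64, fast func() int64) (func() int64, bool) {
	if freqHz == 0 || fast == nil {
		return stdUnixNano, false
	}
	return fast, true
}

func main() {
	// No frequency available: the returned clock simply wraps time.Now().
	clock, enabled := pickClock(0, nil)
	fmt.Println("enabled:", enabled) // prints "enabled: false"
	fmt.Println("now:", clock())
}
```

With a fallback like this, a benchmark of the "fast" function measures time.Now().UnixNano() plus a small call overhead, which would be consistent with the numbers reported in this issue.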

rfyiamcool commented 3 years ago

test file

func BenchmarkGenTimeStamp(b *testing.B) {
        for i := 0; i < b.N; i++ {
                time.Now().UnixNano()
        }
}

func BenchmarkGenTsc(b *testing.B) {
        // Log the flag as found, then force it on and log it again.
        b.Log("cur: ", tsc.Enabled)
        tsc.Enabled = true
        b.Log("modify: ", tsc.Enabled)

        for i := 0; i < b.N; i++ {
                tsc.UnixNano()
        }
}

result

$ go test -bench .
goos: linux
goarch: amd64
pkg: s
BenchmarkGenTimeStamp-40        20257611                52.6 ns/op
BenchmarkGenTsc-40              20446849                56.4 ns/op
--- BENCH: BenchmarkGenTsc-40
    g_test.go:19: cur:  false
    g_test.go:21: modify:  true
    g_test.go:19: cur:  true
    g_test.go:21: modify:  true
    g_test.go:19: cur:  true
    g_test.go:21: modify:  true
    g_test.go:19: cur:  true
    g_test.go:21: modify:  true
    g_test.go:19: cur:  true
    g_test.go:21: modify:  true
        ... [output truncated]
PASS
ok      s       4.527s

rfyiamcool commented 3 years ago

😁 tsc's latency is about 10 ns on macOS, but tsc is slow on Linux. I also tried running it on a different host.

rfyiamcool commented 3 years ago

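Oh, so you're Chinese...

I tried it on physical machines in our data center and on Alibaba Cloud and Tencent Cloud hosts. The speed was unsatisfactory on all of them. I wonder whether it has something to do with the configuration.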

templexxx commented 3 years ago

There are many limitations to using TSC as a stable clock source. We check these conditions in init().

Your test showed that tsc wasn't enabled, which means this CPU can't satisfy the requirements for TSC. That's a pity. :D
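
The thread does not list which conditions init() checks. As a rough way to inspect a Linux host by hand, the sketch below reads the CPU flags the kernel reports in /proc/cpuinfo and prints a few that are relevant to TSC stability (`tsc`, `constant_tsc`, `nonstop_tsc`) plus `hypervisor`, which indicates a virtual machine. The choice of flags is an assumption about what matters here, and `cpuFlags` is a hypothetical helper, not part of the library.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// cpuFlags extracts the flag set from the first "flags" line of
// /proc/cpuinfo-style text. It returns an empty map if no such line exists.
func cpuFlags(cpuinfo string) map[string]bool {
	flags := make(map[string]bool)
	for _, line := range strings.Split(cpuinfo, "\n") {
		key, val, ok := strings.Cut(line, ":")
		if !ok || strings.TrimSpace(key) != "flags" {
			continue
		}
		for _, f := range strings.Fields(val) {
			flags[f] = true
		}
		break // all cores normally report the same flags
	}
	return flags
}

func main() {
	data, err := os.ReadFile("/proc/cpuinfo")
	if err != nil {
		fmt.Println("cannot read /proc/cpuinfo:", err)
		return
	}
	flags := cpuFlags(string(data))
	for _, name := range []string{"tsc", "constant_tsc", "nonstop_tsc", "hypervisor"} {
		fmt.Printf("%-13s %v\n", name, flags[name])
	}
}
```

Seeing these flags does not guarantee that the library will enable TSC; it only shows what the kernel reports about the CPU.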

rfyiamcool commented 3 years ago

😅 thank u.

templexxx commented 3 years ago

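Oh, so you're Chinese...

I tried it on physical machines in our data center and on Alibaba Cloud and Tencent Cloud hosts. The speed was unsatisfactory on all of them. I wonder whether it has something to do with the configuration.

Some CPUID instructions are restricted on virtual machines (I don't know why; I haven't looked into it), so the frequency can't be obtained. I'm fairly cautious here and simply conclude that tsc can't work stably as expected. So this generally can't be done on virtual machines.

I tried AWS metal instances before and I think that didn't work either (I may be misremembering).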
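
To show why the frequency is indispensable, here is a small sketch of the arithmetic that turns a tick count into nanoseconds. It is an illustration under assumptions, not the library's code: `ticksToNanos` is a hypothetical name, and the 3 GHz value is only an example (the real TSC frequency need not equal the CPU's nominal clock).

```go
package main

import (
	"fmt"
	"math/bits"
)

// ticksToNanos converts a tick delta to nanoseconds for a counter running at
// freqHz ticks per second: ns = delta * 1e9 / freqHz. It uses a 128-bit
// intermediate so that long intervals do not overflow. It reports false when
// the frequency is unknown (0) or the result would not fit in 64 bits.
func ticksToNanos(delta, freqHz uint64) (uint64, bool) {
	if freqHz == 0 {
		return 0, false // without a frequency, ticks cannot be converted
	}
	hi, lo := bits.Mul64(delta, 1_000_000_000)
	if hi >= freqHz {
		return 0, false // quotient would overflow uint64
	}
	ns, _ := bits.Div64(hi, lo, freqHz)
	return ns, true
}

func main() {
	const freqHz = 3_000_000_000 // example value only

	ns, ok := ticksToNanos(3_000_000_000, freqHz)
	fmt.Println(ns, ok) // prints "1000000000 true"

	_, ok = ticksToNanos(3_000_000_000, 0)
	fmt.Println(ok) // prints "false"
}
```

A Unix timestamp would then be a base wall-clock reading plus this converted delta, so any error in the frequency grows with the elapsed time. That is one reason to refuse the fast path when the frequency cannot be read reliably.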

rfyiamcool commented 3 years ago

OK, thanks.