prometheus / procfs

procfs provides functions to retrieve system, kernel and process metrics from the pseudo-filesystem proc.

Enable opportunistic fd counting fast path #486

Closed bobrik closed 1 year ago

bobrik commented 1 year ago

The existing slow path takes ~725ms of CPU time on my laptop to count 1 million open files. Compare this to just 0.25ms for the baseline, when no extra files are open by a Go program:

Open files reported: 7
 Gathered metrics in 0.26ms
Open files reported: 1000007
 Gathered metrics in 724.50ms

Adding the fast path from Linux v6.2 makes it fast:

Open files reported: 6
 Gathered metrics in 0.29ms
Open files reported: 1000006
 Gathered metrics in 0.31ms
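
For readers unfamiliar with the two paths, here is a minimal sketch of the idea, not the actual patch. It relies on the Linux v6.2 behavior where the `stat()` size of `/proc/<pid>/fd` is the number of open descriptors, while older kernels report 0 there. The names `countFDs`, `trustStatSize` and `makeFixtureDir` are made up for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// countFDs returns the number of entries in an fd directory such as
// /proc/self/fd. When trustStatSize is true it tries the fast path first:
// on Linux v6.2+ the stat() size of /proc/<pid>/fd is the number of open
// descriptors. A size of 0 (older kernels) falls through to the slow path.
// trustStatSize is a hypothetical knob; it must be false for regular
// directories, whose size is unrelated to the entry count.
func countFDs(dir string, trustStatSize bool) (int, error) {
	if trustStatSize {
		st, err := os.Stat(dir)
		if err != nil {
			return 0, err
		}
		if size := st.Size(); size > 0 {
			return int(size), nil
		}
	}

	// Slow path: list every entry, which costs time proportional to the count.
	d, err := os.Open(dir)
	if err != nil {
		return 0, err
	}
	defer d.Close()

	names, err := d.Readdirnames(-1)
	if err != nil {
		return 0, err
	}
	return len(names), nil
}

// makeFixtureDir creates a regular temporary directory holding n empty files,
// standing in for an on-disk fixture rather than a real /proc/<pid>/fd.
func makeFixtureDir(n int) (string, error) {
	dir, err := os.MkdirTemp("", "fdcount")
	if err != nil {
		return "", err
	}
	for i := 0; i < n; i++ {
		f, err := os.Create(filepath.Join(dir, fmt.Sprint(i)))
		if err != nil {
			return "", err
		}
		f.Close()
	}
	return dir, nil
}

func main() {
	dir, err := makeFixtureDir(5)
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// A regular directory must use the slow path to get a correct count.
	n, err := countFDs(dir, false)
	if err != nil {
		panic(err)
	}
	fmt.Println("entries via slow path:", n)
}
```

On a v6.2+ kernel, `countFDs("/proc/self/fd", true)` would return after a single `stat()` call regardless of how many files are open, which is where the speedup comes from.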

This is before taking into account any lock contention in the kernel when you count files from multiple threads concurrently, which makes the slow path even slower and burns a lot more CPU in the process. See:

The code I used:

```go
package main

import (
	"fmt"
	"os"
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

func main() {
	run()
	openLotsOfFiles(1000000)
	run()
}

func run() {
	s := time.Now()

	metrics, err := prometheus.DefaultGatherer.Gather()
	if err != nil {
		panic(err)
	}

	for _, metric := range metrics {
		if metric.GetName() == "process_open_fds" {
			fmt.Printf("Open files reported: %.0f\n", *metric.Metric[0].Gauge.Value)
		}
	}

	fmt.Printf(" Gathered metrics in %.2fms\n", float64(time.Since(s).Microseconds())/1000)
}

func openLotsOfFiles(lots int) []*os.File {
	files := make([]*os.File, lots)
	for i := 0; i < lots; i++ {
		file, err := os.Open("/etc/hosts")
		if err != nil {
			panic(err)
		}
		files[i] = file
	}
	return files
}
```

```
go build -o /tmp/lol .
```

```
for i in $(seq 1 5); do GOMAXPROCS=1 /tmp/lol; done
```

bobrik commented 1 year ago

Tests are unhappy:

--- FAIL: TestFileDescriptorsLen (0.00s)
    proc_test.go:244: want fds 5, have 224

That's because they make FileDescriptorsLen look at the Stat() result for a real directory, which behaves differently from /proc/pid/fd. I'm open to suggestions on how to address this in tests.

ivan@vm:~/projects/prometheus-procfs$ stat testdata/fixtures/proc/26231/fd | head -2
  File: testdata/fixtures/proc/26231/fd
  Size: 224         Blocks: 0          IO Block: 1048576 directory
ivan@vm:~/projects/prometheus-procfs$ ls testdata/fixtures/proc/26231/fd | wc -l
5
discordianfish commented 1 year ago

I have no clear recommendation for the tests. We might need to restructure this to fix it.

bobrik commented 1 year ago

I added a commit with a possible solution. I'm not sure if it's a good one, but I can't think of anything better.

Initially I hoped to detect procfs from the stat() syscall result itself, but that doesn't seem possible. Adding another syscall to double-check whether the stat() result can be trusted doesn't seem right either, as we only need to learn this fact once per mountpoint.
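
One way to learn this fact once per mountpoint is sketched below. It is an illustration under assumptions, not necessarily what the commit does. It calls `statfs()` a single time when the filesystem handle is created and compares the filesystem type against `PROC_SUPER_MAGIC` (`0x9fa0` in `<linux/magic.h>`). The `procFS` type, `newProcFS` and the `isReal` field are hypothetical names, and the code assumes a Linux build where `syscall.Statfs` is available.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// procSuperMagic is PROC_SUPER_MAGIC from <linux/magic.h>.
const procSuperMagic = 0x9fa0

// procFS is a hypothetical handle for a proc mount point. isReal is computed
// once in newProcFS, so later per-process calls need no extra syscall to
// decide whether a stat() size can be trusted.
type procFS struct {
	mountPoint string
	isReal     bool
}

// newProcFS runs statfs() once and records whether mountPoint is a real procfs.
func newProcFS(mountPoint string) (procFS, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(mountPoint, &st); err != nil {
		return procFS{}, err
	}
	// The width of st.Type differs between architectures, so widen it first.
	isReal := int64(st.Type) == procSuperMagic
	return procFS{mountPoint: mountPoint, isReal: isReal}, nil
}

func main() {
	// Compare a likely real procfs with a regular directory (a fixture stand-in).
	for _, p := range []string{"/proc", os.TempDir()} {
		fs, err := newProcFS(p)
		if err != nil {
			fmt.Println(p, "error:", err)
			continue
		}
		fmt.Printf("%s: real procfs = %v\n", p, fs.isReal)
	}
}
```

With a flag like this, the fast path could be skipped for fixture directories in tests and used only against a mounted procfs.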