vlang / v

Simple, fast, safe, compiled language for developing maintainable software. Compiles itself in <1s with zero library dependencies. Supports automatic C => V translation. https://vlang.io
MIT License
35.66k stars 2.15k forks source link

Performance Issues with time.now? #19554

Open xiusin opened 11 months ago

xiusin commented 11 months ago

Describe the bug

module cache

[heap]
struct Test {
mut:
    created_at time.Time
}

fn test_map() {
    mut b := benchmark.start()
    fn [mut b] () {
        mut m := map[string]&Test{}
        mut i := 0
        for {
            if i > 10000000 {
                break
            }

            m['test_${i}'] = &Test{}
            i++
        }

        b.measure('Test no default')
    }()

    fn [mut b] () {
        mut m := map[string]&Test{}
        mut i := 0
        for {
            if i > 10000000 {
                break
            }

            m['test_${i}'] = &Test{created_at: time.now()}
            i++
        }

        b.measure('Test default')
    }()
}

GC Warning: Repeated allocation of very large block (appr. size 162746368):
    May lead to memory leak and poor performance
SPENT 9076.469 ms in Test no default
SPENT90380.174 ms in Test default

Reproduction Steps

As shown above

Expected Behavior

High Performance

Current Behavior

Initialization makes the stress testing program slow by nearly 10x

Possible Solution

No response

Additional Information/Context

No response

V version

V 0.4.2 2332c17

Environment details (OS name and version, etc.)

V full version: V 0.4.2 b9a1b0c.2332c17
OS: macos, macOS, 13.5.1, 22G90
Processor: 8 cpus, 64bit, little endian, Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz

[!NOTE] You can vote for this issue using the 👍 reaction. More votes increase the issue's priority for developers.

Take into account that only the 👍 reaction counts as a vote. Only reactions to the issue itself will be counted as votes, not comments.

JalonSolov commented 10 months ago

A complete, fully runnable example:

import benchmark
import time

[heap]
struct Test {
mut:
        created_at time.Time
}

fn main() {
        mut b := benchmark.start()
        fn [mut b] () {
                mut m := map[string]&Test{}
                for i in 0..10000000 {
                        m['test_${i}'] = &Test{}
                }

                b.measure('Test no default')
        }()

        fn [mut b] () {
                mut m := map[string]&Test{}
                for i in 0..10000000 {
                        m['test_${i}'] = &Test{created_at: time.now()}
                }

                b.measure('Test default')
        }()
}

Output for me is


[jalon@7950x ~]$ v run time.v
GC Warning: Repeated allocation of very large block (appr. size 63430656):
        May lead to memory leak and poor performance
GC Warning: Repeated allocation of very large block (appr. size 134221824):
        May lead to memory leak and poor performance
 SPENT  7160.987 ms in Test no default
GC Warning: Repeated allocation of very large block (appr. size 114302976):
        May lead to memory leak and poor performance
 SPENT 15344.580 ms in Test default
[jalon@7950x ~]$
xiusin commented 10 months ago

@JalonSolov My computer is a MacBook produced in mid-2018, and the performance gap is quite evident.

medvednikov commented 10 months ago

M2:

 SPENT  7868.688 ms in Test no default
 SPENT 47074.378 ms in Test default

In Go the difference is only 0.5s, so there's definitely something wrong with time.now()

Good find @xiusin

Will fix

ttytm commented 4 months ago

Testing again with latest V, I can reproduce the slow down of about 10x on an m1 macbook. On an x64 linux machine it's about 2x, independent of the used compiler and if running the prog with or without -prod.

JalonSolov commented 4 months ago

There are at least a couple of "extra" things happening here. For one, the example code is creating maps and adding to them in the loop. After you take out the "excess" code, so that you come closer to only testing the time code...

import benchmark
import time

struct Test {
mut:
        created_at time.Time
}

fn main() {
        mut b := benchmark.start()

        fn [mut b] () {
                for _ in 0 .. 10000000 {
                        _ = Test{}
                }

                b.measure('Test no default')
        }()

        fn [mut b] () {
                for _ in 0 .. 10000000 {
                        _ = Test{
                                created_at: time.now()
                        }
                }

                b.measure('Test default')
        }()
}

output for me (on linux, without -prod) is

 SPENT    76.368 ms in Test no default
 SPENT  1195.009 ms in Test default

and with -prod is

 SPENT     0.004 ms in Test no default
 SPENT   740.276 ms in Test default

The 2nd thing is that the 2nd loop is creating an initialized time.Time, while the first is creating an "empty" time.Time. So... extra code for initialization definitely will take longer.

ttytm commented 4 months ago

Running with a 1mil instead of a 10mil loop

Linux x64

# -cc tcc (6-7x diff)
 SPENT    45.239 ms in Test no default
 SPENT   352.716 ms in Test default
# -cc gcc -prod
 SPENT     0.005 ms in Test no default
 SPENT   106.653 ms in Test default

macOS m1

# -cc tcc (600-700x diff)
 SPENT     5.058 ms in Test no default
 SPENT  3200.153 ms in Test default
# -cc clang -prod
 SPENT     0.005 ms in Test no default
 SPENT  3095.827 ms in Test default
JalonSolov commented 4 months ago

Definitely something odd with m1 mac. The time for initialization should be much smaller. Weird.

ttytm commented 4 months ago

In general with mac I think. The differences in the initial issue description (mac intel) and on alex m2 in comparison with linux are significant too.