nim-lang / Nim

Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
https://nim-lang.org
Other
16.59k stars 1.47k forks source link

Nim too slow for finding the count of prime number from 2 to 300000 than c or c++ #24427

Closed forchid closed 2 days ago

forchid commented 2 days ago

Description

The nim speed is 3 times slower than c++ for finding the count of prime number from 2 to 300000. Nim takes 35s+, but c++ only uses 10s. Why nim so slow for number calculation?

The nim source

import std/[times]

var findcount = 0;

const start = cpuTime()

for i in (2..300000):
    for n in (2..i):
        if n == i:
            findcount = findcount + 1

        if  (i mod n) == 0 and n < i:
            break

echo "Nim: findcount is " & $findcount & "(time " & $(cpuTime() - start) & "s)"

The C++ source

#include <iostream>
#include <chrono>

using namespace std::chrono;

int main(int argc, char *argv[])
{
    int findcount = 0;

    auto start = high_resolution_clock::now();
    for (int i = 2; i <= 300000; i++) {
        for (int n = 2; n <= i; n++) {
            if (n == i) {
                findcount = findcount + 1;
            }
            if ((i % n == 0) && (n < i)) {
                break;
            }
        }
    }
    auto end = high_resolution_clock::now();
    auto du = duration_cast<seconds>(end - start);

    std::cout << "C++: findcount is "<< findcount 
        << "(time " << du.count() << "s)" << std::endl;

    return 0;
}

Nim Version

%nim20_home%\bin\nim -v Nim Compiler Version 2.0.8 [Windows: amd64] Compiled at 2024-07-03 Copyright (c) 2006-2023 by Andreas Rumpf

active boot switches: -d:release

hello>nim -v Nim Compiler Version 1.6.20 [Windows: amd64] Compiled at 2024-04-07 Copyright (c) 2006-2023 by Andreas Rumpf

active boot switches: -d:release

Current Output

>%nim20_home%\bin\nim c -d:release prime.nim
Hint: used config file '...\config\nim.cfg' [Conf]
Hint: used config file '...\config\config.nims' [Conf]
................................................................................................
CC: prime.nim
Hint:  [Link]
Hint: mm: orc; threads: on; opt: speed; options: -d:release
46357 lines; 1.000s; 71.133MiB peakmem; proj: ...\hello\prime.nim; out: hello\prime.exe [SuccessX]

hello>prime
Nim: findcount is 25997(time 36.045s)

hello>g++ -O2 -o prime prime.cc

hello>prime
C++: findcount is 25997(time 10s)

Expected Output

Nim: findcount is 25997(time 10s+)

Known Workarounds

No response

Additional Information

No response

beef331 commented 2 days ago

Profilers exist for a reason. I cannot replicate the 3 times slower calculation here. I get 4s for C++ and 5.5s for Nim

forchid commented 2 days ago

Profilers exist for a reason. I cannot replicate the 3 times slower calculation here. I get 4s for C++ and 5.5s for Nim

My test is on Windows 10 x64.

johnnovak commented 2 days ago

Well, you can always look at the generated assembly and tell us 😄 Probably different compiler-level optimisations get triggered in the C++ vs Nim code. I'd bet the unoptimised timings are much closer.

forchid commented 2 days ago

Well, you can always look at the generated assembly and tell us 😄 Probably different compiler-level optimisations get triggered in the C++ vs Nim code. I'd bet the unoptimised timings are much closer.

The nim test source has used the compiler flag -d:release.

Araq commented 2 days ago

(Both of these tips are widely known.)

forchid commented 2 days ago
  • Use a main proc.
  • Compile with -d:danger.

(Both of these tips are widely known.)

It takes 33s+ by this way. Both of these tips are invalid for this test case!

The source

import std/[times]

proc main() =
    var findcount = 0;

    const start = cpuTime()

    for i in (2..300000):
        for n in (2..i):
            if n == i:
                findcount = findcount + 1

            if  (i mod n) == 0 and n < i:
                break

    echo "Nim: findcount is " & $findcount & "(time " & $(cpuTime() - start) & "s)"

# Run it
main()

The test

>nim c -d:release primev2.nim
Hint: used config file '..\nim-1.6.20\config\nim.cfg' [Conf]
Hint: used config file '..\nim-1.6.20\config\config.nims' [Conf]
................................................................................
CC: ../../../../Programs/nim-1.6.20/lib/std/private/digitsutils.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/formatfloat.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/dollars.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/io.nim
CC: ../../../../Programs/nim-1.6.20/lib/system.nim
CC: ../../../../Programs/nim-1.6.20/lib/pure/dynlib.nim
CC: ../../../../Programs/nim-1.6.20/lib/windows/winlean.nim
CC: ../../../../Programs/nim-1.6.20/lib/pure/times.nim
CC: primev2.nim
Hint:  [Link]
Hint: gc: refc; opt: speed; options: -d:release
43869 lines; 4.813s; 60.77MiB peakmem; proj: ..\primev2.nim; out: ..\primev2.exe [SuccessX]

hello>primev2
Nim: findcount is 25997(time 33.964s)

hello>nim c -d:danger primev2.nim
Hint: used config file '...\nim-1.6.20\config\nim.cfg' [Conf]
Hint: used config file '...\nim-1.6.20\config\config.nims' [Conf]
................................................................................
CC: ../../../../Programs/nim-1.6.20/lib/std/private/digitsutils.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/dollars.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/io.nim
CC: ../../../../Programs/nim-1.6.20/lib/system.nim
CC: ../../../../Programs/nim-1.6.20/lib/pure/times.nim
CC: primev2.nim
Hint:  [Link]
Hint: gc: refc; opt: speed; options: -d:danger
43869 lines; 3.563s; 60.719MiB peakmem; proj: ..\primev2.nim; out: ..\primev2.exe [SuccessX]

hello>primev2
Nim: findcount is 25997(time 33.557s)
Araq commented 2 days ago

It seems to be a tiny bit faster with -d:danger so my tip is valid. There is still something else going wrong, yes.

derekdai commented 2 days ago

On my Ubuntu 22.04 (x86_64) laptop, The default size of integer type is the root cause.

forchid commented 2 days ago

On my Ubuntu 22.04 (x86_64) laptop, The default size of integer type is the root cause.

Sure. When using int32 and the flag -d:danger, this test only takes 10.217s!

The test source

# primev3.nim: using int32
import std/[times]

proc main() =
    var findcount = int32(0);

    const start = cpuTime()

    for i in (int32(2) .. int32(300000)):
        for n in (int32(2) .. i):
            if n == i:
                findcount = findcount + int32(1)

            if  (i mod n) == int32(0) and n < i:
                break

    echo "Nim: findcount is " & $findcount & "(time " & $(cpuTime() - start) & "s)"

# Run it
main()

Running this test

hello>nim c -d:danger primev3.nim
Hint: used config file '..\nim-1.6.20\config\nim.cfg' [Conf]
Hint: used config file '..\nim-1.6.20\config\config.nims' [Conf]
................................................................................
CC: ../../../../Programs/nim-1.6.20/lib/std/private/digitsutils.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/dollars.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/io.nim
CC: ../../../../Programs/nim-1.6.20/lib/system.nim
CC: ../../../../Programs/nim-1.6.20/lib/pure/times.nim
CC: primev3.nim
Hint:  [Link]
Hint: gc: refc; opt: speed; options: -d:danger
43870 lines; 3.832s; 60.836MiB peakmem; proj: ..\hello\primev3.nim; out: ..\hello\primev3.exe [SuccessX]

..\hello>primev3
Nim: findcount is 25997(time 10.217s)
haoyu234 commented 2 days ago

I reproduced the issue on win10. When I copy the code generated by nim and the cpp code into the same file and compile it, the output is basically the same.

//  g++ p.cpp -O2
Nim: findcount is 25997(time 11s)
C++: findcount is 25997(time 10s)
#include <iostream>
#include <chrono>
#include <time.h>

using namespace std::chrono;

using NI = int;
using NIM_BOOL = int;

float nimtime()
{
    return clock() / CLOCKS_PER_SEC;
}

void t1()
{
    auto start = nimtime();

    NI res = 0;
    NI findcount__p_2 = ((NI) 0);
    NI i__p_8 = 0;
    NI n__p_13 = 0;
    res = ((NI) 2);
    {
        while (1) {
            if (!(res <= ((NI) 300000))) goto LA3;
            i__p_8 = res;
            {
                NI res_2;
                n__p_13 = 0;
                res_2 = ((NI) 2);
                {
                    while (1) {
                        if (!(res_2 <= i__p_8)) goto LA6;
                        n__p_13 = res_2;
                        {
                            if (!(n__p_13 == i__p_8)) goto LA9_;
                            findcount__p_2 = (NI)(findcount__p_2 + ((NI) 1));
                        }
                        LA9_: ;
                        {
                            NIM_BOOL T13_;
                            T13_ = (NIM_BOOL)0;
                            T13_ = ((NI)(i__p_8 % n__p_13) == ((NI) 0));
                            if (!(T13_)) goto LA14_;
                            T13_ = (n__p_13 < i__p_8);
                            LA14_: ;
                            if (!T13_) goto LA15_;
                            goto LA4;
                        }
                        LA15_: ;
                        res_2 += ((NI) 1);
                    } LA6: ;
                }
            } LA4: ;
            res += ((NI) 1);
        } LA3: ;
    }

    auto end = nimtime();
    auto du = (end - start);

    std::cout << "Nim: findcount is "<< findcount__p_2 
        << "(time " << du << "s)" << std::endl;
}

void t2()
{
    int findcount = 0;

    auto start = nimtime();

    for (int i = 2; i <= 300000; i++) {
        for (int n = 2; n <= i; n++) {
            if (n == i) {
                findcount = findcount + 1;
            }
            if ((i % n == 0) && (n < i)) {
                break;
            }
        }
    }

    auto end = nimtime();
    auto du = (end - start);

    std::cout << "C++: findcount is "<< findcount 
        << "(time " << du << "s)" << std::endl;
}

int main(int argc, char *argv[])
{
    t1();
    t2();
    return 0;
}
Araq commented 2 days ago

So ... solved.