Closed forchid closed 2 days ago
Profilers exist for a reason. I cannot replicate the 3 times slower calculation here. I get 4s for C++ and 5.5s for Nim
Profilers exist for a reason. I cannot replicate the 3 times slower calculation here. I get 4s for C++ and 5.5s for Nim
My test is on Windows 10 x64.
Well, you can always look at the generated assembly and tell us 😄 Probably different compiler-level optimisations get triggered in the C++ vs Nim code. I'd bet the unoptimised timings are much closer.
Well, you can always look at the generated assembly and tell us 😄 Probably different compiler-level optimisations get triggered in the C++ vs Nim code. I'd bet the unoptimised timings are much closer.
The nim test source has used the compiler flag -d:release
.
main
proc.-d:danger
.(Both of these tips are widely known.)
- Use a
main
proc.- Compile with
-d:danger
.(Both of these tips are widely known.)
It takes 33s+ by this way. Both of these tips are invalid for this test case!
import std/[times]
proc main() =
var findcount = 0;
const start = cpuTime()
for i in (2..300000):
for n in (2..i):
if n == i:
findcount = findcount + 1
if (i mod n) == 0 and n < i:
break
echo "Nim: findcount is " & $findcount & "(time " & $(cpuTime() - start) & "s)"
# Run it
main()
>nim c -d:release primev2.nim
Hint: used config file '..\nim-1.6.20\config\nim.cfg' [Conf]
Hint: used config file '..\nim-1.6.20\config\config.nims' [Conf]
................................................................................
CC: ../../../../Programs/nim-1.6.20/lib/std/private/digitsutils.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/formatfloat.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/dollars.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/io.nim
CC: ../../../../Programs/nim-1.6.20/lib/system.nim
CC: ../../../../Programs/nim-1.6.20/lib/pure/dynlib.nim
CC: ../../../../Programs/nim-1.6.20/lib/windows/winlean.nim
CC: ../../../../Programs/nim-1.6.20/lib/pure/times.nim
CC: primev2.nim
Hint: [Link]
Hint: gc: refc; opt: speed; options: -d:release
43869 lines; 4.813s; 60.77MiB peakmem; proj: ..\primev2.nim; out: ..\primev2.exe [SuccessX]
hello>primev2
Nim: findcount is 25997(time 33.964s)
hello>nim c -d:danger primev2.nim
Hint: used config file '...\nim-1.6.20\config\nim.cfg' [Conf]
Hint: used config file '...\nim-1.6.20\config\config.nims' [Conf]
................................................................................
CC: ../../../../Programs/nim-1.6.20/lib/std/private/digitsutils.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/dollars.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/io.nim
CC: ../../../../Programs/nim-1.6.20/lib/system.nim
CC: ../../../../Programs/nim-1.6.20/lib/pure/times.nim
CC: primev2.nim
Hint: [Link]
Hint: gc: refc; opt: speed; options: -d:danger
43869 lines; 3.563s; 60.719MiB peakmem; proj: ..\primev2.nim; out: ..\primev2.exe [SuccessX]
hello>primev2
Nim: findcount is 25997(time 33.557s)
It seems to be a tiny bit faster with -d:danger
so my tip is valid. There is still something else going wrong, yes.
On my Ubuntu 22.04 (x86_64) laptop, The default size of integer type is the root cause.
On my Ubuntu 22.04 (x86_64) laptop, The default size of integer type is the root cause.
Sure. When using int32 and the flag -d:danger
, this test only takes 10.217s!
# primev3.nim: using int32
import std/[times]
proc main() =
var findcount = int32(0);
const start = cpuTime()
for i in (int32(2) .. int32(300000)):
for n in (int32(2) .. i):
if n == i:
findcount = findcount + int32(1)
if (i mod n) == int32(0) and n < i:
break
echo "Nim: findcount is " & $findcount & "(time " & $(cpuTime() - start) & "s)"
# Run it
main()
hello>nim c -d:danger primev3.nim
Hint: used config file '..\nim-1.6.20\config\nim.cfg' [Conf]
Hint: used config file '..\nim-1.6.20\config\config.nims' [Conf]
................................................................................
CC: ../../../../Programs/nim-1.6.20/lib/std/private/digitsutils.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/dollars.nim
CC: ../../../../Programs/nim-1.6.20/lib/system/io.nim
CC: ../../../../Programs/nim-1.6.20/lib/system.nim
CC: ../../../../Programs/nim-1.6.20/lib/pure/times.nim
CC: primev3.nim
Hint: [Link]
Hint: gc: refc; opt: speed; options: -d:danger
43870 lines; 3.832s; 60.836MiB peakmem; proj: ..\hello\primev3.nim; out: ..\hello\primev3.exe [SuccessX]
..\hello>primev3
Nim: findcount is 25997(time 10.217s)
I reproduced the issue on win10. When I copy the code generated by nim and the cpp code into the same file and compile it, the output is basically the same.
// g++ p.cpp -O2
Nim: findcount is 25997(time 11s)
C++: findcount is 25997(time 10s)
#include <iostream>
#include <chrono>
#include <time.h>
using namespace std::chrono;
using NI = int;
using NIM_BOOL = int;
float nimtime()
{
return clock() / CLOCKS_PER_SEC;
}
void t1()
{
auto start = nimtime();
NI res = 0;
NI findcount__p_2 = ((NI) 0);
NI i__p_8 = 0;
NI n__p_13 = 0;
res = ((NI) 2);
{
while (1) {
if (!(res <= ((NI) 300000))) goto LA3;
i__p_8 = res;
{
NI res_2;
n__p_13 = 0;
res_2 = ((NI) 2);
{
while (1) {
if (!(res_2 <= i__p_8)) goto LA6;
n__p_13 = res_2;
{
if (!(n__p_13 == i__p_8)) goto LA9_;
findcount__p_2 = (NI)(findcount__p_2 + ((NI) 1));
}
LA9_: ;
{
NIM_BOOL T13_;
T13_ = (NIM_BOOL)0;
T13_ = ((NI)(i__p_8 % n__p_13) == ((NI) 0));
if (!(T13_)) goto LA14_;
T13_ = (n__p_13 < i__p_8);
LA14_: ;
if (!T13_) goto LA15_;
goto LA4;
}
LA15_: ;
res_2 += ((NI) 1);
} LA6: ;
}
} LA4: ;
res += ((NI) 1);
} LA3: ;
}
auto end = nimtime();
auto du = (end - start);
std::cout << "Nim: findcount is "<< findcount__p_2
<< "(time " << du << "s)" << std::endl;
}
void t2()
{
int findcount = 0;
auto start = nimtime();
for (int i = 2; i <= 300000; i++) {
for (int n = 2; n <= i; n++) {
if (n == i) {
findcount = findcount + 1;
}
if ((i % n == 0) && (n < i)) {
break;
}
}
}
auto end = nimtime();
auto du = (end - start);
std::cout << "C++: findcount is "<< findcount
<< "(time " << du << "s)" << std::endl;
}
int main(int argc, char *argv[])
{
t1();
t2();
return 0;
}
So ... solved.
Description
The nim speed is 3 times slower than c++ for finding the count of prime number from 2 to 300000. Nim takes 35s+, but c++ only uses 10s. Why nim so slow for number calculation?
The nim source
The C++ source
Nim Version
active boot switches: -d:release
hello>nim -v Nim Compiler Version 1.6.20 [Windows: amd64] Compiled at 2024-04-07 Copyright (c) 2006-2023 by Andreas Rumpf
active boot switches: -d:release
Current Output
Expected Output
Known Workarounds
No response
Additional Information
No response