zsaleeba / picoc

A very small C interpreter
1.45k stars 183 forks source link

ver. 2.2 versus ver. 2.1 #11

Open ghost opened 9 years ago

ghost commented 9 years ago

I've made a superficial timing testing of the captioned versions using an ad-hoc long empty for loop. It resulted that ver. 2.2 is approximately 3 times slower than ver. 2.1 !!! Could you tell me what am I doing wrong ? Step-by-step: I've got ver. 2.1 from Google Code and ver. 2.2 form GitHub. I made the necessary cosmetic changes to unistd in order to get compiled by MinGW32 toolkit. Got both complied (no optimisations) with TDM-GCC 4.8.1 32bit release on Win 7. Run a very simple script containing an empty for and looped 3 million times.

Thank you in advance.

zsaleeba commented 9 years ago

Hi,

I'm not sure offhand why you'd experience such a performance regression. I haven't seen one before and there aren't really that many significant changes between those versions.

Zik Saleeba zik@zikzak.net

On Mon, Jun 22, 2015, at 10:15 PM, mtrif wrote:

I've made a superficial timing testing of the captioned versions using an ad-hoc long empty for loop.

It resulted that ver. 2.2 is approximately 3 times slower than ver. 2.1 !!!

Could you tell me what am I doing wrong ?

Step-by-step:

I've got ver. 2.1 from Google Code and ver. 2.2 form GitHub.

I made the necessary cosmetic changes to unistd in order to get compiled by MinGW32 toolkit.

Got both complied (no optimisations) with TDM-GCC 4.8.1 32bit release on Win 7.

Run a very simple script containing an empty for and looped 3 million times.

Thank you in advance.

— Reply to this email directly or view it on GitHub[1].

Links:

  1. https://github.com/zsaleeba/picoc/issues/11
ghost commented 9 years ago

Hi Zik, I've done something more, this time the test script was an empty while loop, as follows:


include

int Count; int CountMax1 = 3000; int CountMax2 = 3000000; clock_t begin, end; double seconds1, seconds2; printf("\nwhile loop with print\n"); begin = clock(); Count = 1; while (Count <= CountMax1) { printf("%d\n", Count); Count++; } end = clock(); seconds1 = (double)(end - begin) / CLOCKS_PER_SEC; printf("\nwhile loop without print\n"); begin = clock(); Count = 1; while (Count <= CountMax2) { Count++; } end = clock(); seconds2 = (double)(end - begin) / CLOCKS_PER_SEC;

printf("\nwhile loop with printf done %d times (took %f secs)\n", CountMax1, seconds1); printf("\nwhile loop without printf done %d times (took %f secs)\n", CountMax2, seconds2);

void main() {}


The result was marginally better, but version 2.2 is still much slower.

I think I found something, with the help of gprof: in version 2.2 the script spent a lot of time inside VariableScopeBegin and VariableScopeEnd (approx. 50% the whole time). I couldn't attach the output from gprof, hence I had to paste an excerpt (I kept only down to 1%).

I'm sure you will find the proper medicine soon. For the time being I'll keep on playing with version 2.1.


VERSION 2.2. Flat profile:

Each sample counts as 0.01 seconds. % cumulative self self total
time seconds seconds calls s/call s/call name
25.67 2.02 2.02 3003003 0.00 0.00 VariableScopeBegin 24.90 3.98 1.96 _mcount_private 23.13 5.80 1.82 3003003 0.00 0.00 VariableScopeEnd 3.81 6.10 0.30 45086681 0.00 0.00 LexGetRawToken 2.60 6.30 0.20 6015030 0.00 0.00 ExpressionParse 1.78 6.45 0.14 45086681 0.00 0.00 LexGetToken 1.78 6.59 0.14 12021050 0.00 0.00 ExpressionStackCollapse 1.59 6.71 0.13 36060297 0.00 0.00 HeapAllocStack 1.33 6.82 0.11 36057734 0.00 0.00 VariableAlloc 1.21 6.91 0.10 9012034 0.00 0.00 ParseStatement 1.14 7.00 0.09 3003016 0.00 0.00 ExpressionInfixOperator 1.14 7.09 0.09 TableSearch 0.89 7.16 0.07 36067717 0.00 0.00 ParserCopy



VERSION 2.1 Flat profile:

Each sample counts as 0.01 seconds. % cumulative self self total
time seconds seconds calls s/call s/call name
54.66 1.79 1.79 _mcount_private 6.41 2.00 0.21 48090152 0.00 0.00 LexGetRawToken 6.11 2.20 0.20 6015033 0.00 0.00 ExpressionParse 4.73 2.36 0.16 TableSearch 3.21 2.46 0.11 48090152 0.00 0.00 LexGetToken 3.05 2.56 0.10 9012048 0.00 0.00 ParseStatement 2.44 2.64 0.08 36057773 0.00 0.00 VariableAlloc 2.14 2.71 0.07 12021053 0.00 0.00 ExpressionStackCollapse 1.83 2.77 0.06 3003016 0.00 0.00 ExpressionInfixOperator 1.68 2.83 0.06 36068029 0.00 0.00 ParserCopy 1.37 2.87 0.04 _fentry__ 1.22 2.91 0.04 36060327 0.00 0.00 HeapAllocStack 1.22 2.95 0.04 15024063 0.00 0.00 ExpressionStackPushValueNode 1.22 2.99 0.04 9012022 0.00 0.00 ExpressionStackPushLValue 1.22 3.03 0.04 9012022 0.00 0.00 VariableAllocValueFromExistingData 1.22 3.07 0.04 3003004 0.00 0.00 ParserCopyPos 0.61 3.09 0.02 27042272 0.00 0.00 HeapPopStack