Closed. rtzui closed this issue 9 years ago.
Probably what's happening in this case is that Kona is reallocating character vectors at each step of the join-over. What it should do instead, and this is fairly easy to perform, is to identify that the intermediate vectors are going to be reclaimed and reuse them up to the length of allocation. This would yield an O(n) expansion method. If it's doing what I think it's doing, the current method is O(n^2). In that case I would agree with your disappointment about the performance.
See https://github.com/kevinlawler/kona/blob/master/src/vg.c#L341
Identifying vectors that will be reclaimed is possibly as easy as checking for reference count 1.
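The reference-count idea can be sketched in C roughly as follows. This is only an illustration under stated assumptions: `CharVec`, `cv_new`, `cv_free` and `cv_join` are hypothetical names, and the struct is not Kona's actual object layout. The point is that a vector with reference count 1 cannot be observed by anyone else, so it can be extended in place, and doubling the capacity keeps the total copying linear.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a character vector; NOT Kona's real layout. */
typedef struct {
    size_t refs;  /* reference count */
    size_t len;   /* characters in use */
    size_t cap;   /* characters allocated */
    char  *data;
} CharVec;

/* Create a vector holding a copy of s, with reference count 1. */
CharVec *cv_new(const char *s) {
    size_t n = strlen(s);
    CharVec *v = malloc(sizeof *v);
    if (!v) return NULL;
    v->refs = 1;
    v->len = n;
    v->cap = n ? n : 1;
    v->data = malloc(v->cap);
    if (!v->data) { free(v); return NULL; }
    memcpy(v->data, s, n);
    return v;
}

void cv_free(CharVec *v) { if (v) { free(v->data); free(v); } }

/* Join b onto a, consuming one reference to a.
   refs == 1: extend a in place, doubling capacity when it is too small.
   refs  > 1: a is shared, so build a fresh vector and leave a's data alone. */
CharVec *cv_join(CharVec *a, const CharVec *b) {
    assert(a && b);
    size_t need = a->len + b->len;
    if (a->refs == 1) {
        if (need > a->cap) {
            size_t cap = a->cap;
            while (cap < need) cap *= 2;
            char *p = realloc(a->data, cap);
            if (!p) return NULL;
            a->data = p;
            a->cap = cap;
        }
        memcpy(a->data + a->len, b->data, b->len);
        a->len = need;
        return a;
    }
    CharVec *r = malloc(sizeof *r);
    if (!r) return NULL;
    r->refs = 1;
    r->len = need;
    r->cap = need ? need : 1;
    r->data = malloc(r->cap);
    if (!r->data) { free(r); return NULL; }
    memcpy(r->data, a->data, a->len);
    memcpy(r->data + a->len, b->data, b->len);
    a->refs--;  /* the caller's reference to a has been consumed */
    return r;
}
```

With this shape, a fold such as join-over keeps reusing the same accumulator as long as nothing else holds a reference to it.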
Another, probably better, way to do this is to build an optimized join-over verb for join and include it in the dispatch table.
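A specialized join-over could, for example, measure the total length first and allocate the result once. The sketch below uses plain C strings rather than Kona's K objects, and `join_over` is a hypothetical name, so treat it as an outline of the approach rather than code that would drop into the dispatch table as is.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Concatenate n NUL-terminated strings into one freshly allocated string.
   Pass 1 sums the lengths, pass 2 copies each part exactly once, so the
   total work is linear in the size of the output. Caller frees the result. */
char *join_over(const char *const *parts, size_t n) {
    assert(parts != NULL || n == 0);

    size_t total = 0;
    for (size_t i = 0; i < n; i++) total += strlen(parts[i]);

    char *out = malloc(total + 1);
    if (!out) return NULL;

    char *p = out;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(parts[i]);
        memcpy(p, parts[i], len);
        p += len;
    }
    *p = '\0';
    return out;
}
```

Compared with reusing an accumulator, this avoids intermediate vectors entirely, at the cost of walking the input twice.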
I added some debugging output to join, and the reference count is sometimes 2 and sometimes 3. I haven't figured out how the dispatch table works; I only found the struct that holds one row, but no elements. Where are these added?
The main item is:
TR DT[] = //Dispatch table is append-only. ...
Slightly edited:
$ ack DT
src/k.c
76:I bk(V p){R (L)p==DT_END_OFFSET;} //break: is ; or \n
102:Z C verbsChar(V p) {R ((L)p>=DT_VERB_OFFSET && (L)p < DT_SPECIAL_VERB_OFFSET)?vc[((L)p-DT_VERB_OFFSET)/2]:'\0';}
104:Z C adverbsChar(V p){R ((L)p>=DT_ADVERB_OFFSET)?ac[((L)p-DT_ADVERB_OFFSET)%3]:'\0';}
110: if(q<DT_SIZE)R DT[q].arity;
117: if (q<DT_SIZE) R DT[q].adverbClass;
244: DO(a->n, CPMAX str=kS(a)[i]; if(str<(S)DT_SIZE)continue; sl=strlen(str);ss=simpleString(str);
256: if(q < DT_SIZE && q >= DT_SPECIAL_VERB_OFFSET)
257: { s=DT[q].text;
260: if(s[k-1]==':' && 1==DT[q].arity) O_("%c",':'); //extra colon for monadic 0: verbs
387:TR DT[] = //Dispatch table is append-only. Reorder/delete/insert breaks backward compatibility with IO & inet
518:L DT_SIZE=0;
519:L DT_END_OFFSET, DT_ADVERB_OFFSET, DT_VERB_OFFSET, DT_SPECIAL_VERB_OFFSET;
520:L DT_OFFSET(V v){I i=0; while(v!=DT[i].func)i++; R i;} //init only
src/k.h
111:L DT_OFFSET(V v);
124:extern L DT_SIZE, DT_END_OFFSET, DT_ADVERB_OFFSET, DT_VERB_OFFSET, DT_SPECIAL_VERB_OFFSET;
125:extern TR DT[];
src/kc.c
90: DT_SIZE = DT_OFFSET(TABLE_END);
91: DT_END_OFFSET = DT_OFFSET(end);
92: DT_ADVERB_OFFSET = DT_OFFSET(over);
93: DT_VERB_OFFSET = DT_OFFSET(flip);
94: DT_SPECIAL_VERB_OFFSET = DT_OFFSET(_0m);
Above the dispatch table are good templates such as K times_over(K x,K y)
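As a rough mental model of the grep output above: the table is a static array of rows, and the various offsets are found at init time by scanning for a known function pointer, the way `DT_OFFSET` does. The sketch below is heavily simplified and hypothetical. `Row`, `demo_table`, `demo_plus`, `demo_times`, `demo_end`, `row_offset` and `dispatch` are made-up names, the functions take `long` instead of K values, and only the `arity`, `func` and `text` fields seen in the output are modelled.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*Fn)(long, long);  /* simplified; real entries operate on K values */

/* One row of the table (hypothetical, reduced field set). */
typedef struct {
    int         arity;
    Fn          func;
    const char *text;
} Row;

long demo_plus(long x, long y)  { return x + y; }
long demo_times(long x, long y) { return x * y; }
long demo_end(long x, long y)   { (void)x; (void)y; return 0; }  /* sentinel row */

/* Append-only: new verbs go just before the sentinel so that
   existing indices keep their meaning. */
Row demo_table[] = {
    {2, demo_plus,  "+"},
    {2, demo_times, "*"},
    {0, demo_end,   ""},
};

/* Find the index of the row holding f by linear scan (meant for init only).
   Returns -1 if f is not in the table. */
long row_offset(Fn f) {
    size_t n = sizeof demo_table / sizeof demo_table[0];
    for (size_t i = 0; i < n; i++)
        if (demo_table[i].func == f) return (long)i;
    return -1;
}

/* Call the verb stored at index q. */
long dispatch(long q, long x, long y) {
    assert(q >= 0 && q < row_offset(demo_end));
    return demo_table[q].func(x, y);
}
```

In this model, adding an optimized verb means writing the function and appending one row; the table size then falls out of the sentinel lookup, analogous to `DT_SIZE = DT_OFFSET(TABLE_END)`.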
On a possibly related topic: I've noticed different behaviour between kona / kdb3.2 and k2.8 on join-each. Since the output of kona and kdb mostly agree (apart from an additional round bracket), my guess is that this is actually an issue with k2.8:
/kona
K Console - Enter \ for help
  ,'$!3
(,[,"0";];,[,"1";];,[,"2";])

/kdb
KDB+ 3.2 2014.11.01 Copyright (C) 1993-2014 Kx Systems
l32/ 2()core 3961MB theo debian-7v1 127.0.1.1 NONEXPIRE
q)\
  ,'$!3
,'[(,"0";,"1";,"2")]

/k2.8 - a bug?
K 2.8 2000-10-10 Copyright (C) 1993-2000 Kx Systems
Evaluation. Not for commercial use.
\ for help. \ to exit.
  ,'$!3
valence error
,'$!3
^
Other strings to try:
,:'$!3
,/:$!3
,:/:$!3
Summarizing all tests in a table, including tom's results below:
expr | kona | kdb3.2-l32 | k2.8-l32/k3.2-w32 |
---|---|---|---|
,:'$!3 | (,,"0" ,,"1" ,,"2") | (,,"0";,,"1";,,"2") | (,,"0" ,,"1" ,,"2") |
,/:$!3 | (,[,"0";];,[,"1";];,[,"2";]) | ,/:[(,"0";,"1";,"2")] | valence error ,/:$!3 ^ |
,:/:$!3 | (,,"0" ,,"1" ,,"2") | k){z+x*y} 'type * 0 ,: ) | valence error ,:/:$!3 ^ |
,'$!3 | (,[,"0";];,[,"1";];,[,"2";]) | ,'[(,"0";,"1";,"2")] | valence error ,'$!3 ^ |
So far I'm very disappointed about the performance. But probably I'm just writing inefficient code.
With Haskell:
main = writeFile "haskelloutput" $ foldr (++) "" [show x ++ "\n" ++ show y ++ "\n" | x <- [0..99], y <- [0..999]]
time runhaskell haskelltest.hs
real 0m0.741s
With Kona:
\t `konaoutput 0: ,/$!100 1000
forever....