Open nebulon42 opened 7 years ago
About profiling
There is a simple built-in Node.js profiler (https://nodejs.org/en/docs/guides/simple-profiling/). Let's say we want to profile how carto processes osm-carto. We then run node --prof bin/carto -f test.xml path/to/openstreetmap-carto/project.mml
from latest master we get a isolate-HASH-v8.log
file in the working directory. We then have to post-process said file with node --prof-process isolate-HASH-v8.log > results.txt
.
Then I got the following:
[JavaScript]:
ticks total nonlib name
1023 6.2% 6.3% LazyCompile: *[Symbol.replace] native regexp.js:481:31
643 3.9% 3.9% LazyCompile: *tree.Filterset.addable path/to/carto/lib/carto/tree/filterset.js:105:44
324 2.0% 2.0% Stub: CEntryStub
305 1.8% 1.9% Builtin: CallFunction_ReceiverIsAny
289 1.7% 1.8% LazyCompile: *RegExpReplace native regexp.js:356:23
283 1.7% 1.7% Builtin: Call_ReceiverIsAny
265 1.6% 1.6% Stub: StringAddStub
240 1.5% 1.5% LazyCompile: *tree.Filterset.cloneWith path/to/carto/lib/carto/tree/filterset.js:59:46
231 1.4% 1.4% Stub: FastCloneRegExpStub
225 1.4% 1.4% LazyCompile: *replace native string.js:57:23
212 1.3% 1.3% Builtin: ToString
211 1.3% 1.3% KeyedLoadIC: A keyed load IC from the snapshot {2}
189 1.1% 1.2% Builtin: RegExpPrototypeExec
...
[C++]:
ticks total nonlib name
635 3.8% 3.9% v8::internal::StringReplaceGlobalRegExpWithString(v8::internal::Isolate*, v8::internal::Handle<v8::internal::String>, v8::internal::Handle<v8::internal::JSRegExp>, v8::internal::Handle<v8::internal::String>, v8::internal::Handle<v8::internal::JSObject>)
550 3.3% 3.4% v8::internal::Zone::DeleteAll()
531 3.2% 3.2% v8::internal::Runtime_StringReplaceGlobalRegExpWithString(int, v8::internal::Object**, v8::internal::Isolate*)
394 2.4% 2.4% v8::internal::String::GetFlatContent()
352 2.1% 2.2% v8::internal::Object* v8::internal::StringReplaceGlobalAtomRegExpWithString<v8::internal::SeqOneByteString>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::String>, v8::internal::Handle<v8::internal::JSRegExp>, v8::internal::Handle<v8::internal::String>, v8::internal::Handle<v8::internal::JSObject>)
339 2.1% 2.1% v8::internal::CompiledReplacement::Compile(v8::internal::Handle<v8::internal::String>, int, int)
270 1.6% 1.7% v8::internal::FindStringIndicesDispatch(v8::internal::Isolate*, v8::internal::String*, v8::internal::String*, v8::internal::ZoneList<int>*, unsigned int, v8::internal::Zone*) [clone .constprop.128]
270 1.6% 1.7% memchr
241 1.5% 1.5% v8::internal::String::ComputeAndSetHash()
208 1.3% 1.3% void v8::internal::LookupIterator::Start<false>()
185 1.1% 1.1% write
184 1.1% 1.1% v8::internal::Zone::New(unsigned long)
183 1.1% 1.1% v8::internal::Runtime_KeyedGetProperty(int, v8::internal::Object**, v8::internal::Isolate*)
165 1.0% 1.0% v8::internal::StringTable::LookupString(v8::internal::Isolate*, v8::internal::Handle<v8::internal::String>)
162 1.0% 1.0% v8::internal::FindTwoByteStringIndices(v8::internal::Vector<unsigned short const>, unsigned short, v8::internal::ZoneList<int>*, unsigned int, v8::internal::Zone*)
...
[C++ entry points]:
ticks cpp total name
3674 44.1% 22.2% v8::internal::Runtime_StringReplaceGlobalRegExpWithString(int, v8::internal::Object**, v8::internal::Isolate*)
1607 19.3% 9.7% v8::internal::Runtime_KeyedGetProperty(int, v8::internal::Object**, v8::internal::Isolate*)
307 3.7% 1.9% v8::internal::Runtime_RegExpExec(int, v8::internal::Object**, v8::internal::Isolate*)
252 3.0% 1.5% v8::internal::Runtime_ForInEnumerate(int, v8::internal::Object**, v8::internal::Isolate*)
230 2.8% 1.4% v8::internal::Runtime_StringAdd(int, v8::internal::Object**, v8::internal::Isolate*)
194 2.3% 1.2% v8::internal::Runtime_StoreIC_Miss(int, v8::internal::Object**, v8::internal::Isolate*)
179 2.2% 1.1% v8::internal::Runtime_Equal(int, v8::internal::Object**, v8::internal::Isolate*)
179 2.2% 1.1% v8::internal::Runtime_CompileLazy(int, v8::internal::Object**, v8::internal::Isolate*)
169 2.0% 1.0% v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*)
155 1.9% 0.9% v8::internal::Runtime_KeyedStoreIC_Miss(int, v8::internal::Object**, v8::internal::Isolate*)
139 1.7% 0.8% v8::internal::Runtime_CompileOptimized_Concurrent(int, v8::internal::Object**, v8::internal::Isolate*)
124 1.5% 0.8% v8::internal::Runtime_SetProperty(int, v8::internal::Object**, v8::internal::Isolate*)
110 1.3% 0.7% v8::internal::Builtin_FunctionConstructor(int, v8::internal::Object**, v8::internal::Isolate*)
...
[Bottom up (heavy) profile]:
Note: percentage shows a share of a particular caller in the total
amount of its parent calls.
Callers occupying less than 2.0% are not shown.
ticks parent name
1051 6.4% UNKNOWN
1023 6.2% LazyCompile: *[Symbol.replace] native regexp.js:481:31
1023 100.0% LazyCompile: *replace native string.js:57:23
987 96.5% LazyCompile: *tree.Filterset.addable path/to/carto/lib/carto/tree/filterset.js:105:44
987 100.0% LazyCompile: *tree.Filterset.cloneWith path/to/carto/lib/carto/tree/filterset.js:59:46
647 65.6% LazyCompile: *addRules path/to/carto/lib/carto/renderer.js:473:18
641 99.1% LazyCompile: *inheritDefinitions path/to/carto/lib/carto/renderer.js:537:28
336 34.0% LazyCompile: *foldStyle path/to/carto/lib/carto/renderer.js:611:19
336 100.0% LazyCompile: ~render path/to/carto/lib/carto/renderer.js:160:50
643 3.9% LazyCompile: *tree.Filterset.addable path/to/carto/lib/carto/tree/filterset.js:105:44
643 100.0% LazyCompile: *tree.Filterset.cloneWith path/to/carto/lib/carto/tree/filterset.js:59:46
439 68.3% LazyCompile: *addRules path/to/carto/lib/carto/renderer.js:473:18
426 97.0% LazyCompile: *inheritDefinitions path/to/carto/lib/carto/renderer.js:537:28
426 100.0% LazyCompile: ~render path/to/carto/lib/carto/renderer.js:160:50
426 100.0% LazyCompile: ~compile path/to/carto/bin/carto:80:17
13 3.0% LazyCompile: ~inheritDefinitions path/to/carto/lib/carto/renderer.js:537:28
13 100.0% LazyCompile: ~render path/to/carto/lib/carto/renderer.js:160:50
13 100.0% LazyCompile: ~compile path/to/carto/bin/carto:80:17
202 31.4% LazyCompile: *foldStyle path/to/carto/lib/carto/renderer.js:611:19
202 100.0% LazyCompile: ~render path/to/carto/lib/carto/renderer.js:160:50
202 100.0% LazyCompile: ~compile path/to/carto/bin/carto:80:17
202 100.0% LazyCompile: ~load path/to/carto/lib/carto/mml.js:19:41
...
(Note: The last time I did this which I didn't document I remember getting some 12% in Filterset.addable or the like)
I'm baffled. Befuddled. Bewildered.
Let's start with a recent profile:
[Summary]:
ticks total nonlib name
11097 60.1% 65.2% C++
5911 32.0% 34.7% JavaScript
1453 7.9% Shared libraries
163 0.9% 1.0% GC
7 0.0% Unaccounted
[C++ entry points]:
ticks cpp total name
3493 28.5% 18.9% v8::internal::Runtime_StringReplaceGlobalRegExpWithString(int, v8::internal::Object**, v8::internal::Isolate*)
2054 16.7% 11.1% v8::internal::Runtime_Apply(int, v8::internal::Object**, v8::internal::Isolate*)
1900 15.5% 10.3% v8::internal::Runtime_KeyedGetProperty(int, v8::internal::Object**, v8::internal::Isolate*)
892 7.3% 4.8% v8::internal::Runtime_BoundFunctionGetBindings(int, v8::internal::Object**, v8::internal::Isolate*)
[JavaScript]:
ticks total nonlib name
484 2.6% 2.8% LazyCompile: *replace native string.js:146:23
399 2.2% 2.3% Stub: CEntryStub
295 1.6% 1.7% Stub: StringAddStub_CheckNone_NotTenured
269 1.5% 1.6% LazyCompile: ~InnerArrayFilter native array.js:891:26
209 1.1% 1.2% LazyCompile: ~<anonymous> native v8natives.js:1199:16
205 1.1% 1.2% Builtin: ArgumentsAdaptorTrampoline
163 0.9% 1.0% LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
[Shared libraries]:
ticks total nonlib name
1241 6.7% /usr/bin/nodejs
205 1.1% /lib/x86_64-linux-gnu/libc-2.24.so
3 0.0% /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.24
2 0.0% [vdso]
2 0.0% /lib/x86_64-linux-gnu/libm-2.24.so
[Bottom up (heavy) profile]:
Note: percentage shows a share of a particular caller in the total
amount of its parent calls.
Callers occupying less than 2.0% are not shown.
ticks parent name
1241 6.7% /usr/bin/nodejs
537 43.3% v8::internal::Runtime_Apply(int, v8::internal::Object**, v8::internal::Isolate*)
534 99.4% LazyCompile: ~<anonymous> native v8natives.js:1199:16
534 100.0% LazyCompile: ~InnerArrayFilter native array.js:891:26
534 100.0% LazyCompile: *filter native array.js:915:21
464 86.9% LazyCompile: *tree.Variable.ev /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/variable.js:16:17
55 10.3% LazyCompile: *tree.Operation.ev /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/operation.js:16:39
15 2.8% LazyCompile: ~tree.Variable.ev /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/variable.js:16:17
355 28.6% v8::internal::Runtime_StringReplaceGlobalRegExpWithString(int, v8::internal::Object**, v8::internal::Isolate*)
355 100.0% LazyCompile: *replace native string.js:146:23
55 15.5% LazyCompile: *tree.Filterset.addable /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:105:44
55 100.0% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
41 74.5% LazyCompile: *addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
12 21.8% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
2 3.6% LazyCompile: ~foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
36 10.1% Handler: 67092480#__default__#line-width {5}
31 86.1% LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
22 71.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
9 29.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
5 13.9% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
3 60.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
2 40.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
36 10.1% Handler: 67043328#__default__#line-color
35 97.2% LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
24 68.6% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
11 31.4% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
1 2.8% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
1 100.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
32 9.0% Handler: 67107840#__default__#line-color
28 87.5% LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
18 64.3% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
10 35.7% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
4 12.5% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
2 50.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
2 50.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
30 8.5% Handler: 67107840#__default__#line-width
25 83.3% LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
15 60.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
10 40.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
5 16.7% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
5 100.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
29 8.2% Handler: 67107840#__default__#line-color {1}
25 86.2% LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
19 76.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
6 24.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
4 13.8% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
3 75.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
1 25.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
28 7.9% Handler: 67107840#__default__#line-cap
25 89.3% LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
17 68.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
8 32.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
3 10.7% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
2 66.7% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
1 33.3% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
24 6.8% Handler: 67076096#access#line-dasharray
20 83.3% LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
10 50.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
10 50.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
4 16.7% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
4 100.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
23 6.5% Handler: 67106816#__default__#line-width
19 82.6% LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
10 52.6% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
9 47.4% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
4 17.4% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
3 75.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
1 25.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
22 6.2% Handler: 67092480#__default__#line-color
17 77.3% LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
12 70.6% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
5 29.4% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
5 22.7% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
3 60.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
2 40.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
21 5.9% Handler: 67076096#access#line-color {1}
18 85.7% LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
9 50.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
9 50.0% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
3 14.3% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
2 66.7% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
1 33.3% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
12 3.4% LazyCompile: *tree.Quoted.toString /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/quoted.js:10:23
7 58.3% LazyCompile: *DefaultString native runtime.js:657:23
7 100.0% LazyCompile: *ConvertToString native array.js:159:25
2 16.7% LazyCompile: ~<anonymous> /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/expression.js:23:48
2 100.0% LazyCompile: *InnerArrayMap native array.js:1001:23
2 16.7% LazyCompile: *DefaultNumber native runtime.js:642:23
2 100.0% LazyCompile: *ToPrimitive native runtime.js:528:21
1 8.3% LazyCompile: ~tree.Filter.toObject /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filter.js:32:42
1 100.0% LazyCompile: ~tree.Filterset.toObject /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:8:45
109 8.8% v8::internal::Runtime_KeyedGetProperty(int, v8::internal::Object**, v8::internal::Isolate*)
70 64.2% Handler: 67076096#access#line-color
64 91.4% LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
44 68.8% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
44 100.0% LazyCompile: *inheritDefinitions /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:537:28
20 31.3% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
20 100.0% LazyCompile: render /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:160:50
6 8.6% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
6 100.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
6 100.0% LazyCompile: *inheritDefinitions /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:537:28
23 21.1% Handler: 67076096#access#line-join
21 91.3% LazyCompile: ~tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
13 61.9% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
13 100.0% LazyCompile: *inheritDefinitions /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:537:28
8 38.1% LazyCompile: *foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
8 100.0% LazyCompile: render /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:160:50
2 8.7% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
2 100.0% LazyCompile: ~addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
2 100.0% LazyCompile: *inheritDefinitions /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:537:28
11 10.1% LazyCompile: *tree.Filterset.addable /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:105:44
11 100.0% LazyCompile: *tree.Filterset.cloneWith /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/filterset.js:59:46
9 81.8% LazyCompile: *addRules /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:473:18
6 66.7% LazyCompile: *inheritDefinitions /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:537:28
3 33.3% LazyCompile: ~inheritDefinitions /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:537:28
1 9.1% LazyCompile: ~foldStyle /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:611:19
1 100.0% LazyCompile: render /var/lib/data/mdione/src/system/osm/carto/lib/carto/renderer.js:160:50
1 9.1% LazyCompile: *tree.Ruleset.flatten /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/ruleset.js:94:22
1 100.0% LazyCompile: *tree.Ruleset.flatten /var/lib/data/mdione/src/system/osm/carto/lib/carto/tree/ruleset.js:94:22
So the plan was to attack the calls to v8::internal::Runtime_StringReplaceGlobalRegExpWithString()
. From the bottom up, I understand that most of these calls happen in lib/carto/tree/filterset.js:105:44
, tree.Filterset.addable()
. Notice that the line number, 105, is the line of where the definition of addable()
starts. I can't find any reference about what the 44 means.
But there seem to be no such call in that function; the closest I could find is the regexp in line 110, so I thought: maybe that regexp is taking too long. I replaced the code like this:
/*
if (value.match(/^[+-]?[0-9]+(\.[0-9]*)?([e|E][+-]?[0-9]+)?$/)) {
value = parseFloat(value);
}
*/
f = parseFloat(value);
if (f !== NaN) {
value = f;
}
I understand that the code is not equivalent; in particular, if value is like 123abc
, with the original code it would retain its string type and original value, while with the new one it would become the float 123.
Guess what: instead of ~18s in my system, now it takes ~4m30s!
I tried to find other, non-sampling profilers to fully profile the script and I can't find any. The closest I got was https://github.com/ralphv/zoran, but it's a 5 commit tool wrote in two days, one in 2015 and another in 2016, and I couldn't make it work with a CLI tool like carto
.
As you can see, I'm not a JS developer, so I'm not familiar with its toolchain. I you could provide some pointers on how to actually tackle this, I'll be happy to follow them.
I also wanted to get rid of the regex already but that resulted in the weird behaviour you noticed which especially breaks OSM Bright.
I think the problem here is not the regex itself but that the system spends so much time in addable that it is the most important calculation. The filter system of carto is quite complex so I don't understand it fully yet. There is some experimental branch which tries to modify the filter system and also removes the particular function, see #206.
So far I didn't have the need for performance optimizations in JS, so I'm also quite new to the topic. At this point I haven't any specific advice but maybe @tmcw could share some thoughts how he would approach this.
Well, one thing I learned is that regexps literals are automatically compiled by v8
, and probably has a JIT for that too. I also tried caching the results of several toString()
and toObject()
methods, but nothing made it better and in some cases made it worse, so I also think the JIT is better than us.
I found new ways to profile with chromium
(yes, the best JS tool for profiling is in a browser, but it kinda makes sense because of the genesis of the language) and running carto
with a recent version of node
and its --inspect
option. I hope to get some news by the weekend.
More info:
time ./bin/carto -a 3.0.6 -p=true -b ~/src/projects/osm/osm-carto/project.mml 2>&1 | awk '/^processing/ { layer=$3 }; /^Inheritance/ { print $3, layer }' | sort -n
[...]
113ms amenity-points
120ms amenity-points-poly
292ms landcover
792ms roads-fill
803ms tunnels
967ms bridges
2646ms text-point
2684ms text-poly
real 0m17.062s
user 0m17.513s
sys 0m0.187s
So chomium
/chrome
's profiler is as bad as the command line one; that is, it provides no new information, just the same thing graphed in three ways: the bottom up, the top down (slightly more useful) and a flamegraph (ditto).
The latter at least tells us that we're dying of a gazillion of paper cuts:
Zoom to one of the more expensive inheritedDefinitions()
/ foldStyle()
pair:
Zoom to a couple of addRules()
and one cloneWith()
:
So the conclusion so far is: if we think we can gain some processing time, it will have to be by rethinking those 4 functions.
Thanks for the time you invested so far! About functions: it was my feeling too that it boils down to those. So we have found that out by research too. The question remains what needs to be done to gain something.
I have no knowledge about carto but js optimisation. You loose lot of time in the matching. Could this code be changed:
if (value.match(/^[+-]?[0-9]+(\.[0-9]*)?([e|E][+-]?[0-9]+)?$/)) {
value = parseFloat(value);
}
to
let tempvalue = parseFloat(value);
if (!isNaN(tempvalue)) {
value = tempvalue;
}
@StyXman suggested something like that but forgot that you can not compare NaN with NaN in JS!
NaN !== NaN
true
10 !== NaN
true
Thats why StyXman code dropped the condition and worked with the parseFloat´ed value in any case. No real wonder that carto exploded as it had to handle NaN instead of a string or number.
@HolgerJeromin How would you prevent something like that:
var tempvalue = parseFloat('1layer');
if (!isNaN(tempvalue)) {
console.log('a number');
}
Ref #471 (shows a real world case for that problem)
Backed with jsPerf i would do similar code as jQuery has (explanation):
let value = '1layer';
let tempvalue = parseFloat(value);
if (isFinite( tempvalue - value )) {
value = tempvalue;
}
I hope this does not kill performance again :-) I could do a blind PR if you like.
When I used an unguarded parseFloat()
, times went from 30s to 4m30s. Can you do the change and test locally? I should probably see if I should upgrade my node version.
@StyXman Did you read this comment?
Thats why StyXman code dropped the condition and worked with the parseFloat´ed value in any case. No real wonder that carto exploded as it had to handle NaN instead of a string or number.
I did it like this
var key = filter.key.toString(),
value = filter.val.toString(),
valueIsNull = filter.val.is == 'keyword' && value == 'null',
valueIsNumber = parseFloat(value);
if (isFinite(value - valueIsNumber)) {
value = valueIsNumber;
}
Unfortunately, I didn't see any gains. The new code is a little bit slower (~100ms). :) But nice suggestion for the parseFloat workaround, thanks.
Are there any work in progress with this issue?
I'd suspect the biggest gains lie in big-o/algorithmic optimization instead of JS microbenchmarking and the like. One thing I'd really think about is whether, instead of treating parsing as "generating an AST and then mutating it until it's the stylesheet", it could be treated as a full transform: the unprocessed input and processed output would be entirely different data representations and there could be an intermediate form that uses JavaScript's blessed new datastructures (sets and maps, etc) to optimize lookup and modification.
Sorry that I dissapeared, my family just expanded. @HolgerJeromin's change doesn't seem to make a change, I still get ~17s for the same version of osm-carto.
I'd suspect the biggest gains lie in big-o/algorithmic optimization instead of JS microbenchmarking and the like.
I agree - there don't seem to be obvious big gains in the inner loops. #206 was mentioned above, and that would eliminate the hot function completely, and improve the resulting XML.
This issue is meant to collect discussions about possible performance optimizations. More detailed sub tasks should be opened once identified.