Closed karantin2020 closed 8 years ago
Last commit in routing example was by the way :)
Checked benchmarks. With my commits str_dispatch and str_match_entry execute faster. Speed of pcre_dispatch didn't change. Valgrind output of routing example check:
$ valgrind --leak-check=yes --track-origins=yes ./test
==14670== Memcheck, a memory error detector
==14670== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==14670== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==14670== Command: ./test
==14670==
Slug name is: post
Slug value is: aaa
Slug name is: id
Slug value is: 123
==14670==
==14670== HEAP SUMMARY:
==14670== in use at exit: 0 bytes in 0 blocks
==14670== total heap usage: 68 allocs, 68 frees, 7,848 bytes allocated
==14670==
==14670== All heap blocks were freed -- no leaks are possible
Without that PR:
$ valgrind --leak-check=yes --track-origins=yes ./simple
==26656== Memcheck, a memory error detector
==26656== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==26656== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==26656== Command: ./simple
==26656==
==26656== Conditional jump or move depends on uninitialised value(s)
==26656== at 0x4019F9: r3_tree_matchl (node.c:296)
==26656== by 0x40139C: main (simple.c:49)
==26656== Uninitialised value was created by a heap allocation
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4014B1: r3_tree_create (node.c:56)
==26656== by 0x400FB6: main (simple.c:13)
==26656==
==26656== Conditional jump or move depends on uninitialised value(s)
==26656== at 0x401A8E: r3_tree_matchl (node.c:296)
==26656== by 0x40139C: main (simple.c:49)
==26656== Uninitialised value was created by a heap allocation
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4014B1: r3_tree_create (node.c:56)
==26656== by 0x403A37: r3_edge_branch (edge.c:71)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x401056: main (simple.c:18)
==26656==
==26656== Conditional jump or move depends on uninitialised value(s)
==26656== at 0x4019F9: r3_tree_matchl (node.c:296)
==26656== by 0x4013D2: main (simple.c:52)
==26656== Uninitialised value was created by a heap allocation
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4014B1: r3_tree_create (node.c:56)
==26656== by 0x400FB6: main (simple.c:13)
==26656==
==26656== Conditional jump or move depends on uninitialised value(s)
==26656== at 0x401A8E: r3_tree_matchl (node.c:296)
==26656== by 0x4013D2: main (simple.c:52)
==26656== Uninitialised value was created by a heap allocation
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4014B1: r3_tree_create (node.c:56)
==26656== by 0x403A37: r3_edge_branch (edge.c:71)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x401056: main (simple.c:18)
==26656==
Matched! /garply/baz/grault
==26656==
==26656== HEAP SUMMARY:
==26656== in use at exit: 15,066 bytes in 131 blocks
==26656== total heap usage: 220 allocs, 89 frees, 20,468 bytes allocated
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 39 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x401008: main (simple.c:16)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 40 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x40102F: main (simple.c:17)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 41 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x401056: main (simple.c:18)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 42 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x40107D: main (simple.c:19)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 43 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x4010A4: main (simple.c:20)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 44 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x4010CB: main (simple.c:21)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 45 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x4010F2: main (simple.c:22)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 46 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x401119: main (simple.c:23)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 47 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x401167: main (simple.c:25)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 48 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x4011DC: main (simple.c:28)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 49 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x401203: main (simple.c:29)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 50 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x40129F: main (simple.c:33)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 51 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x4012C6: main (simple.c:34)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 52 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x4012ED: main (simple.c:35)
==26656==
==26656== 72 bytes in 1 blocks are definitely lost in loss record 53 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4039CC: r3_edge_createl (edge.c:39)
==26656== by 0x403A56: r3_edge_branch (edge.c:72)
==26656== by 0x40261A: r3_tree_insert_pathl_ex (node.c:765)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x402635: r3_tree_insert_pathl_ex (node.c:766)
==26656== by 0x40133B: main (simple.c:37)
==26656==
==26656== 13,986 (136 direct, 13,850 indirect) bytes in 1 blocks are definitely lost in loss record 131 of 131
==26656== at 0x4C2BBCF: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26656== by 0x403512: zmalloc (zmalloc.c:123)
==26656== by 0x4014B1: r3_tree_create (node.c:56)
==26656== by 0x400FB6: main (simple.c:13)
==26656==
==26656== LEAK SUMMARY:
==26656== definitely lost: 1,216 bytes in 16 blocks
==26656== indirectly lost: 13,850 bytes in 115 blocks
==26656== possibly lost: 0 bytes in 0 blocks
==26656== still reachable: 0 bytes in 0 blocks
==26656== suppressed: 0 bytes in 0 blocks
==26656==
==26656== For counts of detected and suppressed errors, rerun with: -v
==26656== ERROR SUMMARY: 23 errors from 20 contexts (suppressed: 0 from 0)
Added change to simple example. Now valgrind output is:
$ valgrind --leak-check=yes --track-origins=yes ./simple
==4528== Memcheck, a memory error detector
==4528== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==4528== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==4528== Command: ./simple
==4528==
Matched! /garply/baz/grault
==4528==
==4528== HEAP SUMMARY:
==4528== in use at exit: 0 bytes in 0 blocks
==4528== total heap usage: 220 allocs, 220 frees, 20,484 bytes allocated
==4528==
==4528== All heap blocks were freed -- no leaks are possible
==4528==
==4528== For counts of detected and suppressed errors, rerun with: -v
==4528== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Everything is well.
:+1:
Hi @karantin2020,
I found that 'make check' was not running on travis-ci, so I fixed the autoconf script and found the check tests are not passed
it's now building on travis ci https://travis-ci.org/c9s/r3/builds/115478718
I moved your changed to v2/slug-parser, so we can work on the changes there
You will need to reset your 2.0 branch because some commits were rebased:
git reset --hard c609003c95aea649a2c39405ec0c54b495836fc9
Sorry, looks like it's working on Travis CI. https://travis-ci.org/c9s/r3/builds/115478813
hmmm I have to take a deeper look on my side
It's now working. I tested your changes, the benchmark falls back to 12M~13M from the original 15M matches per second. (the last 4 records)
I do not know the reason of performance regression. The only thing that changes do in pattern match is
entry->vars->slugs = n->routes[i]->slugs;
on https://github.com/c9s/r3/pull/95/files#diff-ff01266e09631f593cc7a6eebde89eafR453)`
It's only copying of pointer even not the value of slugs array.
May be something else changed...
Hi karantin,
The string pattern matching benchmark doesn't include r3route functions and match entry, it just tests the dispatching speed with plain string.
I guess it is caused by the changed edges field in the node struct.
@karantin2020 notifications@github.com 於 2016年3月12日 星期六寫道:
I do not know the reason of performance regression. The only thing that changes do in pattern match is entry->vars->slugs = n->routes[i]->slugs; on https://github.com/c9s/r3/pull/95/files#diff-ff01266e09631f593cc7a6eebde89eafR453)` It's only copying of pointer even not the value of slugs array.
— Reply to this email directly or view it on GitHub https://github.com/c9s/r3/pull/95#issuecomment-195716032.
))) I changes comment. Previous was wrong ) I wrote: Oh. Need more tests. What are the routes that have the most speed regression? May be they need more work?
You could take a look at tests/bench.c
I guess it takes longer time to traverse the edges. We shall let CPU caches the instructions as much as possible.
I will see how to restore the performance
New slug parse. Removed memory leaks.
Passed all tests. Need check for performance regression