Suggested performance optimizations

GoogleCodeExporter commented 9 years ago

I ran the code through a profiler.  Attached are a few simple targeted 
optimizations that deliver a modest but noticeable improvement in speed:

1. In _Custom.cs, memcmp is a custom implementation that does a length 
comparison, followed by a lexicographic comparison.  This implicitly calls 
get_chars, which uses quite a lot of time.  String.CompareTo is faster and 
produces the same semantics.  The other memcmp implementations depart further 
from standard usage -- but they are not called as much.

2. In vdbeaux_c.cs, aSize is reconstructed *every time* that 
sqlite3VdbeSerialTypeLen is being called.  Moving it out into a static field 
solves the problem.  I would've thought the optimizer would be smart enough to 
figure this out, but apparently not.

3. I have done the same thing to the constant arrays in keywordhash_h.cs.  In 
light of change #1, a string may not be the best representation.  Depends on 
whether or not it is usually accessed directly.
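
Changes #2 and #3 amount to hoisting a constant array out of a method body. The sketch below is illustrative only: the class and method names are hypothetical, and the table values follow the serial-type sizes in the SQLite file format rather than being copied from vdbeaux_c.cs.

```csharp
using System;

static class SerialTypeSketch
{
    // Allocated once, when the class is first used.
    // (Hypothetical name; stands in for aSize.)
    static readonly int[] Sizes = { 0, 1, 2, 3, 4, 6, 8, 8, 0, 0, 0, 0 };

    // Before: the array initializer runs on every call, so each call
    // allocates and fills a new 12-element array.
    public static int LenPerCall(int serialType)
    {
        int[] sizes = { 0, 1, 2, 3, 4, 6, 8, 8, 0, 0, 0, 0 };
        return serialType >= 12 ? (serialType - 12) / 2 : sizes[serialType];
    }

    // After: the lookup reads the shared static table; nothing is allocated.
    public static int LenStatic(int serialType)
    {
        return serialType >= 12 ? (serialType - 12) / 2 : Sizes[serialType];
    }

    static void Main()
    {
        Console.WriteLine(LenPerCall(5));
        Console.WriteLine(LenStatic(5));
    }
}
```

Both methods return the same values; the difference is only in how often the table is built.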

---

I haven't looked too far into this, but instead of using a parser that has been 
ported over line-by-line, it may be faster to recreate parse_c.cs from 
parse.y, using a parser generator that can output C# directly.  This isn't a 
trivial change like the ones I've made, but it came up as a very fat target in 
the profiling.

Original issue reported on code.google.com by tanza...@gmail.com on 7 Mar 2011 at 1:40

Attachments:

CSharp-SQLite_rev8.patch

GoogleCodeExporter commented 9 years ago

#3 OK, added
#2 OK, added

#1 -- problem here is that the 3rd parameter limits the length of the comparison

So the last line would really need to be
      return A.Substring(0,Limit).CompareTo( B.Substring(0,Limit) );

I don't know whether the Substring calls would add enough overhead to make it 
not worth it.

Can you run it through the profiler and see?
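
One possible alternative, not tried in this thread, is the String.CompareOrdinal overload that takes start indexes and a maximum length. It compares prefixes without allocating substrings. The sketch below assumes an ordinal (char-by-char) comparison is acceptable, which differs from the culture-sensitive CompareTo; ComparePrefix is a hypothetical helper name, not the project's memcmp.

```csharp
using System;

static class PrefixCompareSketch
{
    // Hypothetical helper: compares at most `limit` characters of a and b,
    // starting at index 0, using ordinal (numeric char value) ordering.
    // Unlike Substring(0, limit), this does not throw when a string is
    // shorter than `limit`; it compares what is there.
    public static int ComparePrefix(string a, string b, int limit)
    {
        return string.CompareOrdinal(a, 0, b, 0, limit);
    }

    static void Main()
    {
        // The first 7 characters of both strings are "sqlite_".
        Console.WriteLine(ComparePrefix("sqlite_master", "sqlite_stat1", 7) == 0);
        // "abc" sorts before "abd" on the third character.
        Console.WriteLine(ComparePrefix("abc", "abd", 3) < 0);
    }
}
```

Whether this is actually faster than the hand-written loop would still need to be measured with the profiler.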

Original comment by noah.hart@gmail.com on 7 Mar 2011 at 6:34

Changed state: Started
Added labels: Type-Enhancement, Milestone-3.7.5
Removed labels: Type-Defect

GoogleCodeExporter commented 9 years ago

#1 test code
            string s = @"abcdefghijklmnopqrstuvwxyz";
            char c = '\0';
            int t = System.Environment.TickCount;
            for (int i = 0; i < 100000000; i++)
            {
                c = s[i % s.Length];
            }
            Console.WriteLine(c);
            Console.WriteLine(System.Environment.TickCount - t);

            char[] x = s.ToCharArray();
            t = System.Environment.TickCount;
            for (int i = 0; i < 100000000; i++)
            {
                c = x[i % s.Length];
            }
            Console.WriteLine(c);
            Console.WriteLine(System.Environment.TickCount - t);
Test results: get_chars is faster than the array.
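
A variant of the same measurement, sketched with System.Diagnostics.Stopwatch instead of Environment.TickCount, since TickCount has a coarse resolution. The class and method names are hypothetical, and the numbers will vary with build configuration and platform.

```csharp
using System;
using System.Diagnostics;

static class IndexerBenchmarkSketch
{
    // Reads characters through the string indexer (get_Chars).
    public static char ScanString(string s, int iterations)
    {
        char c = '\0';
        for (int i = 0; i < iterations; i++)
        {
            c = s[i % s.Length];
        }
        return c;
    }

    // Reads the same characters from a char[] copy.
    public static char ScanArray(char[] x, int iterations)
    {
        char c = '\0';
        for (int i = 0; i < iterations; i++)
        {
            c = x[i % x.Length];
        }
        return c;
    }

    static void Main()
    {
        const int iterations = 100000000;
        string s = "abcdefghijklmnopqrstuvwxyz";
        char[] x = s.ToCharArray();

        // Warm-up calls so JIT compilation is not included in the timings.
        ScanString(s, 1000);
        ScanArray(x, 1000);

        Stopwatch sw = Stopwatch.StartNew();
        char c1 = ScanString(s, iterations);
        sw.Stop();
        Console.WriteLine("string indexer: {0} ms ({1})", sw.ElapsedMilliseconds, c1);

        sw = Stopwatch.StartNew();
        char c2 = ScanArray(x, iterations);
        sw.Stop();
        Console.WriteLine("char[] indexer: {0} ms ({1})", sw.ElapsedMilliseconds, c2);
    }
}
```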

Original comment by ZGSXTY...@gmail.com on 7 Mar 2011 at 8:58

GoogleCodeExporter commented 9 years ago

Noah, good point -- I forgot about the prefix comparison semantics.  Substrings 
are expensive -- so probably best to leave it alone.

Incidentally, this particular overload of memcmp is called only four times.  
Two of them look like they could be replaced by more natural C# code.

-----

analyze_c.cs:
Code is checking for system tables.  Can be replaced with a String.StartsWith.
  184  if ( memcmp( pTab.zName, "sqlite_", 7 ) == 0 )

expr_c.cs:
This is an equality check.  It first calls memcmp using the length of z, then 
verifies that pE.u.zToken has the same length.  It's much less awkward just to 
call String.Equals.  The tricky thing is that Line 746 masks out the top 2 bits 
of the string length -- but this may not come up in practice.
  746  n = sqlite3Strlen30( z );
  751  if ( memcmp( pE.u.zToken, z, n ) == 0 && pE.u.zToken.Length == n )
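
As a rough sketch of the two replacements described above (hypothetical helper names; the real call sites use pTab.zName, pE.u.zToken and z directly), assuming ordinal comparison matches what memcmp did:

```csharp
using System;

static class MemcmpReplacementSketch
{
    // analyze_c.cs case: is this a system table?
    // StringComparison.Ordinal is passed explicitly because
    // StartsWith(string) alone is culture-sensitive.
    public static bool IsSystemTable(string name)
    {
        return name.StartsWith("sqlite_", StringComparison.Ordinal);
    }

    // expr_c.cs case: the memcmp plus length check is an equality test.
    // This ignores the 30-bit length masking done by sqlite3Strlen30.
    public static bool SameToken(string token, string z)
    {
        return string.Equals(token, z, StringComparison.Ordinal);
    }

    static void Main()
    {
        Console.WriteLine(IsSystemTable("sqlite_master")); // True
        Console.WriteLine(IsSystemTable("users"));         // False
        Console.WriteLine(SameToken("abc", "abc"));        // True
        Console.WriteLine(SameToken("abc", "abcd"));       // False
    }
}
```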

Original comment by tanza...@gmail.com on 7 Mar 2011 at 5:40

GoogleCodeExporter commented 9 years ago

@ZGSXTY: I find that the two loops perform within 2% of each other (Release 
build, x64).  I don't think the loop will prove fruitful for optimization -- 
the loop in memcmp often early-exits when it finds a character mismatch.

Instead of optimizing the loop, I was hoping to replace it with a Framework 
call -- because Framework code is often more optimized than user code can be.  
But that doesn't preserve the semantics of the function.

Original comment by tanza...@gmail.com on 7 Mar 2011 at 5:55

GoogleCodeExporter commented 9 years ago

@tanza: I like the analyze & expr changes.  Let me make them and see if they 
pass the test harness; then I can drop the overloads.

Original comment by noah.hart@gmail.com on 7 Mar 2011 at 6:52

GoogleCodeExporter commented 9 years ago

@tanza: I just wanted to test the efficiency difference between get_chars and 
char[]. As for memcmp, I remain of the view, as I think Noah does, that it 
should stay consistent with the C SQLite.

I'm sorry, this is a Bing translation.

Original comment by ZGSXTY...@gmail.com on 8 Mar 2011 at 1:10

GoogleCodeExporter commented 9 years ago

@tanza: Comparison and sorting are more complicated: they involve the character 
set (Chinese, for example) and case, and there is no guarantee that the .NET 
Framework behaves the same way as the C SQLite.

Sorry.

Original comment by ZGSXTY...@gmail.com on 8 Mar 2011 at 1:25

GoogleCodeExporter commented 9 years ago

This issue was closed by revision 6c7451b683.

Original comment by noah.hart@gmail.com on 8 Mar 2011 at 6:04

Changed state: Fixed

tobiasschulz / csharp-sqlite

Suggested performance optimizations #94