At configuration time, any key column that doesn't belong to a known index is
added to the anonymous index
(type 'A', name "*Anonymous*").
The optimizer runs in child_init(). Its job is to look over all directory
configurations and to move key columns
from the anonymous index into real indexes.
Problem 1) How to get access to all the config::dir pointers in child_init()?
In config::init_dir(), after allocating the structure, put a pointer to it in
a static global array.
Problem 2) Apache parses the configuration twice. How do you know you've
really got one copy of each config::dir in the global list?
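A minimal sketch of the global list from Problem 1, with one possible way to approach Problem 2: clear the list at the start of every configuration pass, so only the final pass is left when child_init() runs. The `DirRegistry` class and the `config::dir` stand-in below are hypothetical, and where `reset()` would be called from (for example a hook that runs before each parse) is an assumption, not something this issue settles.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for mod_ndb's per-directory config structure.
namespace config {
struct dir {
    const char *path = nullptr;
};
}

// Hypothetical registry: a static global list of config::dir pointers.
class DirRegistry {
public:
    // Function-local static avoids static initialization order problems.
    static std::vector<config::dir *> &all() {
        static std::vector<config::dir *> dirs;
        return dirs;
    }
    // Assumed to be called once at the start of each configuration pass.
    static void reset() { all().clear(); }
    // Assumed to be called from init_dir() after the structure is allocated.
    static void add(config::dir *d) { all().push_back(d); }
};
```

With this layout, a second parse of the same configuration replaces the entries from the first parse instead of appending duplicates. It does nothing for pointers that became stale between passes, which is why the list has to be emptied rather than deduplicated.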
Here's how the optimizer could work:
For each directory:
  If there is an anonymous index:
    Get the data dictionary from NDB.
    Pass 1: get a list of indexes.
      We're going to create an IndexList, a list of real NDB indexes.
      Loop over the key columns in the anonymous index.
      For each key column, which real NDB indexes does it belong to?
      Add each of these indexes to the list.
    Pass 2: narrow the list down to usable indexes.
      For each item in the IndexList:
        If it's the Primary Key, and key columns exist for all parts, it's usable.
        If it's a Unique Index, and key columns exist for all parts, it's usable.
        If it's an Ordered Index, and key columns exist for a left prefix, it's usable.
    Pass 3: assign key columns to usable indexes.
      Start with the Primary Key. Assign key columns.
      If there are no more unassigned key columns, you're done.
      Next, assign columns to the unique indexes.
      If there are no more unassigned key columns, you're done.
      Next, assign columns to ordered indexes.
      If there are no more unassigned key columns, you're done.
    All remaining unassigned columns become filters.
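The three passes can be sketched in self-contained C++ as follows. This is only an illustration under stated assumptions: `IndexDef`, `Plan`, `usable_parts()` and `optimize()` are hypothetical names, the dictionary is a plain vector rather than anything fetched from NDB, and the sketch assumes each key column is assigned to at most one index, so usability is checked again in Pass 3 against the columns still unassigned.

```cpp
#include <cstddef>
#include <initializer_list>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are not NdbDictionary or mod_ndb types.
enum class IndexType { Primary, Unique, Ordered };

struct IndexDef {
    std::string name;
    IndexType type;
    std::vector<std::string> columns;  // index parts, in index order
};

struct Plan {
    std::map<std::string, std::vector<std::string>> assigned;  // index -> key columns
    std::vector<std::string> filters;                          // leftover key columns
};

// Number of leading index parts covered by the given key columns.
// Returns 0 when the index is not usable: primary and unique indexes
// need every part, ordered indexes need a non-empty left prefix.
static std::size_t usable_parts(const IndexDef &idx, const std::set<std::string> &keys) {
    std::size_t n = 0;
    while (n < idx.columns.size() && keys.count(idx.columns[n]) > 0) ++n;
    if (idx.type != IndexType::Ordered && n != idx.columns.size()) return 0;
    return n;
}

Plan optimize(const std::vector<IndexDef> &dictionary,
              const std::vector<std::string> &anonymous_keys) {
    std::set<std::string> unassigned(anonymous_keys.begin(), anonymous_keys.end());

    // Pass 1: collect every real index that contains at least one key column.
    std::vector<const IndexDef *> index_list;
    for (const IndexDef &idx : dictionary) {
        for (const std::string &col : idx.columns) {
            if (unassigned.count(col) > 0) { index_list.push_back(&idx); break; }
        }
    }

    // Pass 2: keep only the usable indexes.
    std::vector<const IndexDef *> usable;
    for (const IndexDef *idx : index_list) {
        if (usable_parts(*idx, unassigned) > 0) usable.push_back(idx);
    }

    // Pass 3: assign key columns, primary key first, then unique, then ordered.
    Plan plan;
    for (IndexType type : {IndexType::Primary, IndexType::Unique, IndexType::Ordered}) {
        for (const IndexDef *idx : usable) {
            if (unassigned.empty()) break;  // nothing left to assign
            if (idx->type != type) continue;
            std::size_t n = usable_parts(*idx, unassigned);  // re-check with what is left
            for (std::size_t i = 0; i < n; ++i) {
                plan.assigned[idx->name].push_back(idx->columns[i]);
                unassigned.erase(idx->columns[i]);
            }
        }
    }

    // All remaining unassigned columns become filters.
    plan.filters.assign(unassigned.begin(), unassigned.end());
    return plan;
}
```

For example, with a primary key on `id`, an ordered index on `<a,b>`, and anonymous key columns `id`, `a` and `c`, this sketch assigns `id` to the primary key and `a` to the ordered index, and leaves `c` as a filter.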
Original comment by john.david.duncan
on 11 Sep 2007 at 11:08
The optimizer can do something else important, too:
For each key column, it can store the NDB column number, the Column pointer,
and any other needed
information from the data dictionary in the key_columns array, so that they
don't have to be looked up at
runtime.
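A rough sketch of that caching step, assuming the key_columns array holds one record per key column. `ColumnInfo` and `TableInfo` are hypothetical stand-ins for the dictionary objects, and `KeyColumn` is an illustrative layout rather than the actual mod_ndb structure.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for data dictionary objects.
struct ColumnInfo {
    std::string name;
    int column_no;
};

struct TableInfo {
    std::vector<ColumnInfo> columns;
    // Linear lookup by name; returns nullptr when the column does not exist.
    const ColumnInfo *find(const std::string &name) const {
        for (const ColumnInfo &c : columns) {
            if (c.name == name) return &c;
        }
        return nullptr;
    }
};

// Illustrative key column record with cached dictionary information.
struct KeyColumn {
    std::string name;
    int column_no = -1;                  // cached column number, -1 if unresolved
    const ColumnInfo *column = nullptr;  // cached dictionary pointer
};

// Resolve every key column once, so request handling needs no name lookups.
// Returns the number of key columns that could not be found.
int resolve_key_columns(const TableInfo &table, std::vector<KeyColumn> &key_columns) {
    int missing = 0;
    for (KeyColumn &key : key_columns) {
        key.column = table.find(key.name);
        if (key.column != nullptr) {
            key.column_no = key.column->column_no;
        } else {
            ++missing;
        }
    }
    return missing;
}
```

The cached pointers and column numbers are only valid for as long as the table definition they were read from stays unchanged.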
Original comment by john.david.duncan
on 11 Sep 2007 at 11:13
Following this design, it will ABSOLUTELY be necessary to restart Apache after
any ALTER TABLE.
Original comment by john.david.duncan
on 11 Sep 2007 at 11:15
Pass 2: narrow the list down to usable indexes.
  For each item in the IndexList:
    If it's the Primary Key, and key columns exist for all parts, using the "equals" relop, it's usable.
    If it's a Unique Index, and key columns exist for all parts, using the "equals" relop, it's usable.
    If it's an Ordered Index, and key columns exist for a left prefix, using any relop, it's usable.
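The refined usability test might look like this in C++. The `RelOp` values, `IndexDef`, and `is_usable()` are hypothetical, and the sketch assumes each key column carries exactly one relational operator.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins; the actual set of relops in mod_ndb may differ.
enum class IndexType { Primary, Unique, Ordered };
enum class RelOp { Eq, Lt, Le, Gt, Ge };

struct IndexDef {
    std::string name;
    IndexType type;
    std::vector<std::string> columns;  // index parts, in index order
};

// keys maps each key column name to the relop used with it.
bool is_usable(const IndexDef &idx, const std::map<std::string, RelOp> &keys) {
    if (idx.columns.empty()) return false;

    if (idx.type == IndexType::Ordered) {
        // Any non-empty left prefix will do, with any relop,
        // so it is enough that the first part has a key column.
        return keys.count(idx.columns.front()) > 0;
    }

    // Primary key or unique index: every part needs an "equals" key column.
    for (const std::string &part : idx.columns) {
        auto it = keys.find(part);
        if (it == keys.end() || it->second != RelOp::Eq) return false;
    }
    return true;
}
```

Under these rules a primary key on `<a,b>` is usable for `a = ? AND b = ?` but not for `a = ? AND b < ?`, while an ordered index on `<a,b>` is usable for `a > ?` alone and unusable when only `b` is supplied.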
Original comment by john.david.duncan
on 11 Sep 2007 at 11:28
Flaw in the plan:
init_dir() is run while the configuration is parsed, but the directory is still
unmerged. Its inheritable attributes
may all be null. In order to get complete information, you need to get access
to a directory config structure
that has been merged.
However, merging of the config tree does not happen until runtime.
In fact, setting a breakpoint at merge_dir() reveals that directory merges can
happen 3 to 5 times per request.
Original comment by john.david.duncan
on 12 Sep 2007 at 4:31
It's possible to capture the path argument to init_dir() and store it into the
dir structure. This means I could
do my own merging in child_init().
You can't do this thoroughly or correctly at init-time (which is why Apache
does it at runtime), but within
some restrictions, it might work. The restrictions include:
* ONLY use <Location> containers to configure mod_ndb. Do not use <Directory>.
* Do not use vhosts.
* Do not attempt multiple cluster connections.
Original comment by john.david.duncan
on 12 Sep 2007 at 9:32
The optimizer should never be wrong.
Suppose you have three anonymous key columns: a, b, and c.
You also have ordered index idx1 on <a,b>, and ordered index idx2 on <a,c>.
You could do an index scan on idx1 and use c as a filter, or you could do an
index scan on idx2 and use b as a
filter. It's a tie. The optimizer cannot make a decision about this. The
NdbDictionary does not provide the
sort of cardinality statistics that other optimizers would use here.
I believe mod_ndb should require you to rewrite the query using a hint.
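One way to make the optimizer refuse to guess is sketched below. It assumes that the ordered index with the longest usable left prefix wins, that equal prefix lengths count as a tie, and that a hint is simply an index name; all three are illustrative assumptions, and `OrderedIndex`, `Choice`, and `choose_ordered_index()` are hypothetical names.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an ordered index definition.
struct OrderedIndex {
    std::string name;
    std::vector<std::string> columns;  // index parts, in index order
};

struct Choice {
    bool ambiguous = false;  // true when two or more indexes tie
    std::string index;       // chosen index; empty if none or ambiguous
};

// Number of leading index parts for which a key column exists.
static std::size_t prefix_length(const OrderedIndex &idx, const std::set<std::string> &keys) {
    std::size_t n = 0;
    while (n < idx.columns.size() && keys.count(idx.columns[n]) > 0) ++n;
    return n;
}

Choice choose_ordered_index(const std::vector<OrderedIndex> &indexes,
                            const std::set<std::string> &keys,
                            const std::string &hint = "") {
    Choice result;
    std::size_t best = 0;
    for (const OrderedIndex &idx : indexes) {
        std::size_t n = prefix_length(idx, keys);
        if (n == 0) continue;  // not usable at all
        if (!hint.empty()) {
            // With a hint, only the named index is considered.
            if (idx.name == hint) {
                result.index = idx.name;
                return result;
            }
            continue;
        }
        if (n > best) {
            best = n;
            result.index = idx.name;
            result.ambiguous = false;
        } else if (n == best) {
            // A tie: refuse to pick, and leave the decision to a hint.
            result.ambiguous = true;
            result.index.clear();
        }
    }
    return result;
}
```

With key columns a, b, and c, idx1 on `<a,b>` and idx2 on `<a,c>` both match a prefix of length two, so the sketch reports an ambiguity unless the caller passes `"idx1"` or `"idx2"` as the hint.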
Original comment by john.david.duncan
on 12 Sep 2007 at 9:51
Original comment by john.david.duncan
on 4 Nov 2007 at 12:51
Original comment by john.david.duncan
on 28 Dec 2007 at 5:21
Original comment by john.david.duncan
on 28 Dec 2007 at 5:25
Original issue reported on code.google.com by
john.david.duncan
on 11 Sep 2007 at 10:52