FirebirdSQL / firebird

Firebird server, client and tools
https://www.firebirdsql.org/
1.23k stars 213 forks source link

Wrong plan selected in simple join [CORE3707] #4055

Open firebird-automations opened 12 years ago

firebird-automations commented 12 years ago

Submitted by: Jesus Angel Garcia Zarco (cointec)

Relate to CORE4666

Votes: 2

I have created a simple database with three tables, extracted the model for one of our production database. The problem with our production database is that the bad plan makes one query to run in a big database so much time.

CREATE DATABASE 'C:\BD\Firebird25\Prueba.fdb' USER 'SYSDBA' PASSWORD 'masterkey' PAGE_SIZE 8192 DEFAULT CHARACTER SET NONE COLLATION NONE;

CREATE TABLE TABLE0 ( IDPETICION INTEGER NOT NULL );

CREATE TABLE TABLE1 ( IDPETICIONPRUEBA INTEGER NOT NULL, IDPETICION INTEGER );

CREATE TABLE TABLE2 ( IDPETICIONPRUEBA INTEGER );

ALTER TABLE TABLE0 ADD CONSTRAINT PK_TABLE0 PRIMARY KEY (IDPETICION); ALTER TABLE TABLE1 ADD CONSTRAINT PK_TABLE1 PRIMARY KEY (IDPETICIONPRUEBA); ALTER TABLE TABLE1 ADD CONSTRAINT FK_TABLE1_1 FOREIGN KEY (IDPETICION) REFERENCES TABLE0 (IDPETICION); ALTER TABLE TABLE2 ADD CONSTRAINT FK_TABLE2_1 FOREIGN KEY (IDPETICIONPRUEBA) REFERENCES TABLE1 (IDPETICIONPRUEBA);

if i run

select t1.IDPETICIONPRUEBA from table1 t1 join table2 t2 on (t1.IDPETICIONPRUEBA = t2.IDPETICIONPRUEBA ) where t1.IDPETICION = 1

the plan is PLAN JOIN (T2 NATURAL, T1 INDEX (PK_TABLE1))

With this model of tables, firebird allways select the bad plan.

firebird-automations commented 12 years ago
Modified by: Jesus Angel Garcia Zarco (cointec) description: I have created a simple database with three tables, extracted the model for one of our production database\. The problem with our production database is that the bad plan makes one query to run in a big database so much time\. CREATE DATABASE 'localhost/3054:C:\\BD\\Firebird25\\Prueba\.fdb' USER 'SYSDBA' PASSWORD 'masterkey' PAGE\_SIZE 8192 DEFAULT CHARACTER SET NONE COLLATION NONE; CREATE TABLE TABLE0 \( IDPETICION INTEGER NOT NULL \); CREATE TABLE TABLE1 \( IDPETICIONPRUEBA INTEGER NOT NULL, IDPETICION INTEGER \); CREATE TABLE TABLE2 \( IDPETICIONPRUEBA INTEGER \); ALTER TABLE TABLE0 ADD CONSTRAINT PK\_TABLE0 PRIMARY KEY \(IDPETICION\); ALTER TABLE TABLE1 ADD CONSTRAINT PK\_TABLE1 PRIMARY KEY \(IDPETICIONPRUEBA\); ALTER TABLE TABLE1 ADD CONSTRAINT FK\_TABLE1\_1 FOREIGN KEY \(IDPETICION\) REFERENCES TABLE0 \(IDPETICION\); ALTER TABLE TABLE2 ADD CONSTRAINT FK\_TABLE2\_1 FOREIGN KEY \(IDPETICIONPRUEBA\) REFERENCES TABLE1 \(IDPETICIONPRUEBA\); if i run select t1\.IDPETICIONPRUEBA from table1 t1 join table2 t2 on \(t1\.IDPETICIONPRUEBA = t2\.IDPETICIONPRUEBA \) where t1\.IDPETICION = 1 the plan is PLAN JOIN \(T2 NATURAL, T1 INDEX \(PK\_TABLE1\)\) With this model of tables, firebird allways select the bad plan\. =\> I have created a simple database with three tables, extracted the model for one of our production database\. The problem with our production database is that the bad plan makes one query to run in a big database so much time\. CREATE DATABASE 'C:\\BD\\Firebird25\\Prueba\.fdb' USER 'SYSDBA' PASSWORD 'masterkey' PAGE\_SIZE 8192 DEFAULT CHARACTER SET NONE COLLATION NONE; CREATE TABLE TABLE0 \( IDPETICION INTEGER NOT NULL \); CREATE TABLE TABLE1 \( IDPETICIONPRUEBA INTEGER NOT NULL, IDPETICION INTEGER \); CREATE TABLE TABLE2 \( IDPETICIONPRUEBA INTEGER \); ALTER TABLE TABLE0 ADD CONSTRAINT PK\_TABLE0 PRIMARY KEY \(IDPETICION\); ALTER TABLE TABLE1 ADD CONSTRAINT PK\_TABLE1 PRIMARY KEY \(IDPETICIONPRUEBA\); ALTER TABLE TABLE1 ADD CONSTRAINT FK\_TABLE1\_1 FOREIGN KEY \(IDPETICION\) REFERENCES TABLE0 \(IDPETICION\); ALTER TABLE TABLE2 ADD CONSTRAINT FK\_TABLE2\_1 FOREIGN KEY \(IDPETICIONPRUEBA\) REFERENCES TABLE1 \(IDPETICIONPRUEBA\); if i run select t1\.IDPETICIONPRUEBA from table1 t1 join table2 t2 on \(t1\.IDPETICIONPRUEBA = t2\.IDPETICIONPRUEBA \) where t1\.IDPETICION = 1 the plan is PLAN JOIN \(T2 NATURAL, T1 INDEX \(PK\_TABLE1\)\) With this model of tables, firebird allways select the bad plan\.
firebird-automations commented 12 years ago

Commented by: Jesus Angel Garcia Zarco (cointec)

I don't know if this is a bug, but the problem arise when the database is created, data is pumped and after that users insert information in tables. All selectivity index are 0, and then allways is selected a bad plan.

Why when selectivity index is 0 for all then indices, the correct plan is not selected?

firebird-automations commented 12 years ago

Commented by: @dyemanov

A good plan can be selected only if the statistics is up-to-date. You need to execute SET STATISTICS for all the affected indices after any batch data modification. With zero index selectivities, I doubt any clever decisions are possible.

BTW, how big are those tables (t1 and t2)?

firebird-automations commented 12 years ago

Commented by: Jesus Angel Garcia Zarco (cointec)

Table 1 : 49276232 records Table 2: 8210065 records

firebird-automations commented 12 years ago

Commented by: Sean Leyne (seanleyne)

It is very likely that the engine has decided that rather than use the T1 index it is "cheaper" to walk T2 and check each T1 row, after all there are 5 T1 rows for each T2.

What proportion of T1 rows have t1.IDPETICION = 1?

What PLAN is generated for the following query?

select t1.IDPETICIONPRUEBA from table1 t1 where t1.IDPETICION = 1 and EXISTS ( SELECT 1 FROM table2 t2 WHERE t1.IDPETICIONPRUEBA = t2.IDPETICIONPRUEBA )

firebird-automations commented 12 years ago

Commented by: @dyemanov

The problem here is caused by zero selectivities that are useless per se. For a PK index, the safe assumption is that the real selectivity being equal to 1 / cardinality. For other indices (including FK), the default selectivity (equal to 0.1) is assumed. So the only "cheap" join would be the one using PK, hence the chosen plan.

I suppose this situation could be improved by assuming that any FK index has selectivity equal to 1 / master_cardinality (i.e. the same as the master PK selectivity).

firebird-automations commented 12 years ago

Commented by: Jesus Angel Garcia Zarco (cointec)

Sean, it was a simple example. In production i select fields form both tables.

I don't think the engine knows how many records are in both tables to take the decision to use natural plan. When natural plan is selected, the engine does not walk trough five records for each Master record, walks trough millions for each master one.

firebird-automations commented 10 years ago

Commented by: Alvaro Castiello (acastiello)

I was about to report this same bug but I have exactly the same situation:

select * from ESTILOS join ESTILOS_EQ0 on (ESTILOS.ESTILO_ID = ESTILOS_EQ0.ESTILO_ID)

Plan PLAN JOIN (ESTILOS_EQ0 NATURAL, ESTILOS INDEX (ESTILOS_PK))

-----------------------------------------------------script: CREATE TABLE ESTILOS ( ESTILO_ID ESTILO_DEF NOT NULL, ESTILO_MARCA_ID MARCA_DEF NOT NULL, ESTILO_SUBMARCA_ID SUBMARCA_DEF NOT NULL, ESTILO_NOMBRE DATACHAR NOT NULL );

/* Check constraints definition */ ALTER TABLE ESTILOS ADD CONSTRAINT ESTILO_NOMBRE_BLANCO check (ESTILO_NOMBRE <> '');

/******************************************************************************/ /**** Primary Keys ****/ /******************************************************************************/ ALTER TABLE ESTILOS ADD CONSTRAINT ESTILOS_PK PRIMARY KEY (ESTILO_ID);

/******************************************************************************/ /**** Foreign Keys ****/ /******************************************************************************/ ALTER TABLE ESTILOS ADD CONSTRAINT ESTILOS_SUBMARCAS_REF FOREIGN KEY (ESTILO_MARCA_ID, ESTILO_SUBMARCA_ID) REFERENCES SUBMARCAS (MARCA_ID, SUBMARCA_ID) ON UPDATE CASCADE;

/******************************************************************************/ CREATE TABLE ESTILOS_EQ0 ( ESTILO_ID ESTILO_DEF NOT NULL, ESTILO_COMPANIA_ID COMPANIA_DEF NOT NULL, ESTILO_EQUIVALENCIA ESTILO_EQUIVALENCIA_DEF NOT NULL, ESTILO_NOMBRE DATACHAR NOT NULL );

/* Check constraints definition */ ALTER TABLE ESTILOS_EQ0 ADD CONSTRAINT ESTILO_EQUIVALENCIA_BLANCO check (trim(ESTILO_EQUIVALENCIA) != ''); ALTER TABLE ESTILOS_EQ0 ADD CONSTRAINT ESTILO_NOMBRE_BLANCO2 check (trim(ESTILO_NOMBRE) != '');

/******************************************************************************/ /**** Primary Keys ****/ /******************************************************************************/ ALTER TABLE ESTILOS_EQ0 ADD CONSTRAINT ESTILOS_EQ0_PK PRIMARY KEY (ESTILO_ID, ESTILO_COMPANIA_ID);

/******************************************************************************/ /**** Foreign Keys ****/ /******************************************************************************/ ALTER TABLE ESTILOS_EQ0 ADD CONSTRAINT ESTILOS_EQ0_COMPANIAS_REF FOREIGN KEY (ESTILO_COMPANIA_ID) REFERENCES COMPANIAS (COMPANIA_ID) ON UPDATE CASCADE; ALTER TABLE ESTILOS_EQ0 ADD CONSTRAINT ESTILOS_EQ0_MK FOREIGN KEY (ESTILO_ID) REFERENCES ESTILOS (ESTILO_ID) ON UPDATE CASCADE;

/******************************************************************************/ /**** Indices ****/ /******************************************************************************/

CREATE INDEX ESTILOS_EQ0_ESTILO_EQUIVA_IDX ON ESTILOS_EQ0 (ESTILO_EQUIVALENCIA);

firebird-automations commented 9 years ago
Modified by: Sean Leyne (seanleyne) Link: This issue relate to [CORE4666](https://github.com/FirebirdSQL/firebird/issues?q=CORE4666+in%3Atitle) \[ [CORE4666](https://github.com/FirebirdSQL/firebird/issues?q=CORE4666+in%3Atitle) \]