cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
29.92k stars 3.78k forks source link

sql: non-deterministic planning error, invalid types in scalar expression #35418

Closed awoods187 closed 5 years ago

awoods187 commented 5 years ago

Describe the problem

When running TPC-H I received the following error:

SELECT
    c_custkey,
    c_name,
    sum(l_extendedprice * (1 - l_discount)) AS revenue,
    c_acctbal,
    n_name,
    c_address,
    c_phone,
    c_comment
FROM
    customer,
    orders,
    lineitem,
    nation
WHERE
    c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND o_orderDATE >= DATE '1993-10-01'
    AND o_orderDATE < DATE '1993-10-01' + INTERVAL '3' MONTH
    AND l_returnflag = 'R'
    AND c_nationkey = n_nationkey
GROUP BY
    c_custkey,
    c_name,
    c_acctbal,
    c_phone,
    n_name,
    c_address,
    c_comment
ORDER BY
    revenue DESC
LIMIT 20;
Error: pq: sum(): unsupported binary operator: <float> * <int>
Error:  exit status 1

But when I reproduce on a new cluster with the exact same setup:

SELECT
    c_custkey,
    c_name,
    sum(l_extendedprice * (1 - l_discount)) AS revenue,
    c_acctbal,
    n_name,
    c_address,
    c_phone,
    c_comment
FROM
    customer,
    orders,
    lineitem,
    nation
WHERE
    c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND o_orderDATE >= DATE '1993-10-01'
    AND o_orderDATE < DATE '1993-10-01' + INTERVAL '3' MONTH
    AND l_returnflag = 'R'
    AND c_nationkey = n_nationkey
GROUP BY
    c_custkey,
    c_name,
    c_acctbal,
    c_phone,
    n_name,
    c_address,
    c_comment
ORDER BY
    revenue DESC
LIMIT 20;

  c_custkey |       c_name       |      revenue      | c_acctbal |     n_name     |                c_address                 |     c_phone     |                                                    c_comment
+-----------+--------------------+-------------------+-----------+----------------+------------------------------------------+-----------------+------------------------------------------------------------------------------------------------------------------+
      57040 | Customer#000057040 | 734235.2455000001 |    632.87 | JAPAN          | Eioyzjf4pp                               | 22-895-641-3466 | sits. slyly regular requests sleep alongside of the regular inst
     143347 | Customer#000143347 | 721002.6947999999 |   2557.47 | EGYPT          | 1aReFYv,Kw4                              | 14-742-935-3718 | ggle carefully enticing requests. final deposits use bold, bold pinto beans. ironic, idle re
      60838 | Customer#000060838 | 679127.3077000001 |   2454.77 | BRAZIL         | 64EaJ5vMAHWJlBOxJklpNc2RJiWE             | 12-913-494-9813 |  need to boost against the slyly regular account
     101998 | Customer#000101998 |       637029.5667 |   3790.89 | UNITED KINGDOM | 01c9CILnNtfOQYmZj                        | 33-593-865-6378 | ress foxes wake slyly after the bold excuses. ironic platelets are furiously carefully bold theodolites
     125341 | Customer#000125341 | 633508.0860000001 |   4983.51 | GERMANY        | S29ODD6bceU8QSuuEJznkNaK                 | 17-582-695-5962 | arefully even depths. blithely even excuses sleep furiously. foxes use except the dependencies. ca
      25501 | Customer#000025501 | 620269.7849000001 |   7725.04 | ETHIOPIA       |   W556MXuoiaYCCZamJI,Rn0B4ACUGdkQ8DZ     | 15-874-808-6793 | he pending instructions wake carefully at the pinto beans. regular, final instructions along the slyly fina
     115831 | Customer#000115831 | 596423.8672000001 |    5098.1 | FRANCE         | rFeBbEEyk dl ne7zV5fDrmiq1oK09wV7pxqCgIc | 16-715-386-3788 | l somas sleep. furiously final deposits wake blithely regular pinto b
      84223 | Customer#000084223 | 594998.0238999999 |    528.65 | UNITED KINGDOM | nAVZCs6BaWap rrM27N 2qBnzc5WBauxbA       | 33-442-824-8191 |  slyly final deposits haggle regular, pending dependencies. pending escapades wake
      54289 | Customer#000054289 | 585603.3918000001 |   5583.02 | IRAN           | vXCxoCsU0Bad5JQI ,oobkZ                  | 20-834-292-4707 | ely special foxes are quickly finally ironic p
      39922 | Customer#000039922 | 584878.1133999999 |   7321.11 | GERMANY        | Zgy4s50l2GKN4pLDPBU8m342gIw6R            | 17-147-757-8036 | y final requests. furiously final foxes cajole blithely special platelets. f
       6226 | Customer#000006226 |       576783.7606 |   2230.09 | UNITED KINGDOM | 8gPu8,NPGkfyQQ0hcIYUGPIBWc,ybP5g,        | 33-657-701-3391 | ending platelets along the express deposits cajole carefully final
        922 | Customer#000000922 |       576767.5333 |   3869.25 | GERMANY        | Az9RFaut7NkPnc5zSD2PwHgVwr4jRzq          | 17-945-916-9648 | luffily fluffy deposits. packages c
     147946 | Customer#000147946 |        576455.132 |   2030.13 | ALGERIA        | iANyZHjqhyy7Ajah0pTrYyhJ                 | 10-886-956-3143 | ithely ironic deposits haggle blithely ironic requests. quickly regu
     115640 | Customer#000115640 | 569341.1932999999 |    6436.1 | ARGENTINA      | Vtgfia9qI 7EpHgecU1X                     | 11-411-543-4901 | ost slyly along the patterns; pinto be
      73606 | Customer#000073606 |       568656.8578 |   1785.67 | JAPAN          | xuR0Tro5yChDfOCrjkd2ol                   | 22-437-653-6966 | he furiously regular ideas. slowly
     110246 | Customer#000110246 |       566842.9815 |   7763.35 | VIETNAM        | 7KzflgX MDOq7sOkI                        | 31-943-426-9837 | egular deposits serve blithely above the fl
     142549 | Customer#000142549 | 563537.2368000001 |   5085.99 | INDONESIA      | ChqEoK43OysjdHbtKCp6dKqjNyvvi9           | 19-955-562-2398 | sleep pending courts. ironic deposits against the carefully unusual platelets cajole carefully express accounts.
     146149 | Customer#000146149 |       557254.9865 |   1791.55 | ROMANIA        | s87fvzFQpU                               | 29-744-164-6487 |  of the slyly silent accounts. quickly final accounts across the
      52528 | Customer#000052528 | 556397.3508999998 |    551.79 | ARGENTINA      | NFztyTOR10UOJ                            | 11-208-192-3205 |  deposits hinder. blithely pending asymptotes breach slyly regular re
      23431 | Customer#000023431 |        554269.536 |   3381.86 | ROMANIA        | HgiV0phqhaIa9aydNoIlb                    | 29-915-458-2654 | nusual, even instructions: furiously stealthy n
(20 rows)

Time: 4.620199746s

To Reproduce

export CLUSTER=andy-hints roachprod create $CLUSTER -n 2 --gce-machine-type=n1-standard-16 roachprod run $CLUSTER -- 'sudo umount /mnt/data1; sudo mount -o discard,defaults,nobarrier /dev/sdb /mnt/data1/; mount | grep /mnt/data1'

roachprod stage $CLUSTER:1 cockroach roachprod stage $CLUSTER:2 workload roachprod start $CLUSTER:1 roachprod sql $CLUSTER:1 CREATE DATABASE tpch; RESTORE tpch.* FROM 'gs://cockroach-fixtures/workload/tpch/scalefactor=1/backup' WITH into_db = 'tpch'; roachprod adminurl --open $CLUSTER:1 roachprod run $CLUSTER:2 "./workload run tpch --queries 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 --scale-factor=1 {pgurl:1}"

Expected behavior I would expect the query to always work or never work.

Environment: v19.1.0-beta.20190304-135-gd8f7e85

awoods187 commented 5 years ago

Here's another repro

1h30m23s        0            0.0            0.0      0.0      0.0      0.0      0.0 7
1h30m23s        0            0.0            0.0      0.0      0.0      0.0      0.0 8
1h30m23s        0            0.0            0.0      0.0      0.0      0.0      0.0 9
I190305 14:28:23.490711 132 workload/tpch/tpch.go:208  [10] return 20 rows after 38.95 seconds:
  SET DISTSQL = 'always';
SELECT
    c_custkey,
    c_name,
    sum(l_extendedprice * (1 - l_discount)) AS revenue,
    c_acctbal,
    n_name,
    c_address,
    c_phone,
    c_comment
FROM
    customer,
    orders,
    lineitem,
    nation
WHERE
    c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND o_orderDATE >= DATE '1993-10-01'
    AND o_orderDATE < DATE '1993-10-01' + INTERVAL '3' MONTH
    AND l_returnflag = 'R'
    AND c_nationkey = n_nationkey
GROUP BY
    c_custkey,
    c_name,
    c_acctbal,
    c_phone,
    n_name,
    c_address,
    c_comment
ORDER BY
    revenue DESC
LIMIT 20;
Error: pq: sum(): unsupported binary operator: <float> * <int>
Error:  exit status 1
knz commented 5 years ago

This is a planning error and should thus be deterministic. I am surprised (and a bit unhappy) it is not.

However I suspect this is what is happening: depending on the current table statistics, the optimizer may choose to either eliminate or transform a part of the query, and the error is only raised after that point depending on which plan is selected.

Would be curious what the outcome is with set optimizer = off.

awoods187 commented 5 years ago

root@localhost:26257/tpch> SET optimizer = 'off';
SET

Time: 486.695µs
SELECT
    c_custkey,
    c_name,
    sum(l_extendedprice * (1 - l_discount)) AS revenue,
    c_acctbal,
    n_name,
    c_address,
    c_phone,
    c_comment
FROM
    customer,
    orders,
    lineitem,
    nation
WHERE
    c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND o_orderDATE >= DATE '1993-10-01'
    AND o_orderDATE < DATE '1993-10-01' + INTERVAL '3' MONTH
    AND l_returnflag = 'R'
    AND c_nationkey = n_nationkey
GROUP BY
    c_custkey,
    c_name,
    c_acctbal,
    c_phone,
    n_name,
    c_address,
    c_comment
ORDER BY
    revenue DESC
LIMIT 20;
  c_custkey |       c_name       |      revenue      | c_acctbal |     n_name     |                c_address                 |     c_phone     |                                                    c_comment
+-----------+--------------------+-------------------+-----------+----------------+------------------------------------------+-----------------+------------------------------------------------------------------------------------------------------------------+
      57040 | Customer#000057040 |       734235.2455 |    632.87 | JAPAN          | Eioyzjf4pp                               | 22-895-641-3466 | sits. slyly regular requests sleep alongside of the regular inst
     143347 | Customer#000143347 |       721002.6948 |   2557.47 | EGYPT          | 1aReFYv,Kw4                              | 14-742-935-3718 | ggle carefully enticing requests. final deposits use bold, bold pinto beans. ironic, idle re
      60838 | Customer#000060838 |       679127.3077 |   2454.77 | BRAZIL         | 64EaJ5vMAHWJlBOxJklpNc2RJiWE             | 12-913-494-9813 |  need to boost against the slyly regular account
     101998 | Customer#000101998 | 637029.5666999999 |   3790.89 | UNITED KINGDOM | 01c9CILnNtfOQYmZj                        | 33-593-865-6378 | ress foxes wake slyly after the bold excuses. ironic platelets are furiously carefully bold theodolites
     125341 | Customer#000125341 | 633508.0859999999 |   4983.51 | GERMANY        | S29ODD6bceU8QSuuEJznkNaK                 | 17-582-695-5962 | arefully even depths. blithely even excuses sleep furiously. foxes use except the dependencies. ca
      25501 | Customer#000025501 |       620269.7849 |   7725.04 | ETHIOPIA       |   W556MXuoiaYCCZamJI,Rn0B4ACUGdkQ8DZ     | 15-874-808-6793 | he pending instructions wake carefully at the pinto beans. regular, final instructions along the slyly fina
     115831 | Customer#000115831 | 596423.8672000001 |    5098.1 | FRANCE         | rFeBbEEyk dl ne7zV5fDrmiq1oK09wV7pxqCgIc | 16-715-386-3788 | l somas sleep. furiously final deposits wake blithely regular pinto b
      84223 | Customer#000084223 |       594998.0239 |    528.65 | UNITED KINGDOM | nAVZCs6BaWap rrM27N 2qBnzc5WBauxbA       | 33-442-824-8191 |  slyly final deposits haggle regular, pending dependencies. pending escapades wake
      54289 | Customer#000054289 |       585603.3918 |   5583.02 | IRAN           | vXCxoCsU0Bad5JQI ,oobkZ                  | 20-834-292-4707 | ely special foxes are quickly finally ironic p
      39922 | Customer#000039922 | 584878.1133999999 |   7321.11 | GERMANY        | Zgy4s50l2GKN4pLDPBU8m342gIw6R            | 17-147-757-8036 | y final requests. furiously final foxes cajole blithely special platelets. f
       6226 | Customer#000006226 |       576783.7606 |   2230.09 | UNITED KINGDOM | 8gPu8,NPGkfyQQ0hcIYUGPIBWc,ybP5g,        | 33-657-701-3391 | ending platelets along the express deposits cajole carefully final
        922 | Customer#000000922 |       576767.5333 |   3869.25 | GERMANY        | Az9RFaut7NkPnc5zSD2PwHgVwr4jRzq          | 17-945-916-9648 | luffily fluffy deposits. packages c
     147946 | Customer#000147946 |        576455.132 |   2030.13 | ALGERIA        | iANyZHjqhyy7Ajah0pTrYyhJ                 | 10-886-956-3143 | ithely ironic deposits haggle blithely ironic requests. quickly regu
     115640 | Customer#000115640 |       569341.1933 |    6436.1 | ARGENTINA      | Vtgfia9qI 7EpHgecU1X                     | 11-411-543-4901 | ost slyly along the patterns; pinto be
      73606 | Customer#000073606 |       568656.8578 |   1785.67 | JAPAN          | xuR0Tro5yChDfOCrjkd2ol                   | 22-437-653-6966 | he furiously regular ideas. slowly
     110246 | Customer#000110246 | 566842.9814999999 |   7763.35 | VIETNAM        | 7KzflgX MDOq7sOkI                        | 31-943-426-9837 | egular deposits serve blithely above the fl
     142549 | Customer#000142549 | 563537.2368000001 |   5085.99 | INDONESIA      | ChqEoK43OysjdHbtKCp6dKqjNyvvi9           | 19-955-562-2398 | sleep pending courts. ironic deposits against the carefully unusual platelets cajole carefully express accounts.
     146149 | Customer#000146149 | 557254.9864999999 |   1791.55 | ROMANIA        | s87fvzFQpU                               | 29-744-164-6487 |  of the slyly silent accounts. quickly final accounts across the
      52528 | Customer#000052528 |       556397.3509 |    551.79 | ARGENTINA      | NFztyTOR10UOJ                            | 11-208-192-3205 |  deposits hinder. blithely pending asymptotes breach slyly regular re
      23431 | Customer#000023431 | 554269.5360000001 |   3381.86 | ROMANIA        | HgiV0phqhaIa9aydNoIlb                    | 29-915-458-2654 | nusual, even instructions: furiously stealthy n
(20 rows)

Time: 30.019428583s
knz commented 5 years ago

Are you saying you always get results (i.e. no error whatsoever) with set optimizer=off? that's what you need to determine.

awoods187 commented 5 years ago

I just ran it 15 times in a row manually with the opt turned on and it worked every time in 4 seconds.

I ran it 10 times in a row manually with the opt turned off and it worked every time in 30 seconds.

So far, I've hit it 4 times when running roachprod run $CLUSTER:4 "./workload run tpch --queries 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 --scale-factor=1 {pgurl:1-3}" with the optimizer turned on. Maybe its something about how stats work or something about having run in the context of other queries?

awoods187 commented 5 years ago

For those following along i repro'd this with the opt=off with roachprod run $CLUSTER:4 "./workload run tpch --queries 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 --scale-factor=1 {pgurl:1-3}"

yuzefovich commented 5 years ago

I ran into this problem about three weeks ago when working on https://github.com/cockroachdb/cockroach/pull/34891. I think I was constantly hitting the error on a single node on my laptop.

andy-kimball commented 5 years ago

Turns out that the logging is misleading. The error is with Q11, not Q10. Q11 will always fail because CRDB doesn't allow a float to be multiplied by an int. The fix is simple: explicitly cast the int to a float. I'll open a PR that makes the change to the TPCH workload.

awoods187 commented 5 years ago

awesome find! Glad this can be closed