Closed awoods187 closed 5 years ago
Here's another repro
1h30m23s 0 0.0 0.0 0.0 0.0 0.0 0.0 7
1h30m23s 0 0.0 0.0 0.0 0.0 0.0 0.0 8
1h30m23s 0 0.0 0.0 0.0 0.0 0.0 0.0 9
I190305 14:28:23.490711 132 workload/tpch/tpch.go:208 [10] return 20 rows after 38.95 seconds:
SET DISTSQL = 'always';
SELECT
c_custkey,
c_name,
sum(l_extendedprice * (1 - l_discount)) AS revenue,
c_acctbal,
n_name,
c_address,
c_phone,
c_comment
FROM
customer,
orders,
lineitem,
nation
WHERE
c_custkey = o_custkey
AND l_orderkey = o_orderkey
AND o_orderDATE >= DATE '1993-10-01'
AND o_orderDATE < DATE '1993-10-01' + INTERVAL '3' MONTH
AND l_returnflag = 'R'
AND c_nationkey = n_nationkey
GROUP BY
c_custkey,
c_name,
c_acctbal,
c_phone,
n_name,
c_address,
c_comment
ORDER BY
revenue DESC
LIMIT 20;
Error: pq: sum(): unsupported binary operator: <float> * <int>
Error: exit status 1
This is a planning error and should thus be deterministic. I am surprised (and a bit unhappy) it is not.
However I suspect this is what is happening: depending on the current table statistics, the optimizer may choose to either eliminate or transform a part of the query, and the error is only raised after that point depending on which plan is selected.
Would be curious what the outcome is with set optimizer = off
.
root@localhost:26257/tpch> SET optimizer = 'off';
SET
Time: 486.695µs
SELECT
c_custkey,
c_name,
sum(l_extendedprice * (1 - l_discount)) AS revenue,
c_acctbal,
n_name,
c_address,
c_phone,
c_comment
FROM
customer,
orders,
lineitem,
nation
WHERE
c_custkey = o_custkey
AND l_orderkey = o_orderkey
AND o_orderDATE >= DATE '1993-10-01'
AND o_orderDATE < DATE '1993-10-01' + INTERVAL '3' MONTH
AND l_returnflag = 'R'
AND c_nationkey = n_nationkey
GROUP BY
c_custkey,
c_name,
c_acctbal,
c_phone,
n_name,
c_address,
c_comment
ORDER BY
revenue DESC
LIMIT 20;
c_custkey | c_name | revenue | c_acctbal | n_name | c_address | c_phone | c_comment
+-----------+--------------------+-------------------+-----------+----------------+------------------------------------------+-----------------+------------------------------------------------------------------------------------------------------------------+
57040 | Customer#000057040 | 734235.2455 | 632.87 | JAPAN | Eioyzjf4pp | 22-895-641-3466 | sits. slyly regular requests sleep alongside of the regular inst
143347 | Customer#000143347 | 721002.6948 | 2557.47 | EGYPT | 1aReFYv,Kw4 | 14-742-935-3718 | ggle carefully enticing requests. final deposits use bold, bold pinto beans. ironic, idle re
60838 | Customer#000060838 | 679127.3077 | 2454.77 | BRAZIL | 64EaJ5vMAHWJlBOxJklpNc2RJiWE | 12-913-494-9813 | need to boost against the slyly regular account
101998 | Customer#000101998 | 637029.5666999999 | 3790.89 | UNITED KINGDOM | 01c9CILnNtfOQYmZj | 33-593-865-6378 | ress foxes wake slyly after the bold excuses. ironic platelets are furiously carefully bold theodolites
125341 | Customer#000125341 | 633508.0859999999 | 4983.51 | GERMANY | S29ODD6bceU8QSuuEJznkNaK | 17-582-695-5962 | arefully even depths. blithely even excuses sleep furiously. foxes use except the dependencies. ca
25501 | Customer#000025501 | 620269.7849 | 7725.04 | ETHIOPIA | W556MXuoiaYCCZamJI,Rn0B4ACUGdkQ8DZ | 15-874-808-6793 | he pending instructions wake carefully at the pinto beans. regular, final instructions along the slyly fina
115831 | Customer#000115831 | 596423.8672000001 | 5098.1 | FRANCE | rFeBbEEyk dl ne7zV5fDrmiq1oK09wV7pxqCgIc | 16-715-386-3788 | l somas sleep. furiously final deposits wake blithely regular pinto b
84223 | Customer#000084223 | 594998.0239 | 528.65 | UNITED KINGDOM | nAVZCs6BaWap rrM27N 2qBnzc5WBauxbA | 33-442-824-8191 | slyly final deposits haggle regular, pending dependencies. pending escapades wake
54289 | Customer#000054289 | 585603.3918 | 5583.02 | IRAN | vXCxoCsU0Bad5JQI ,oobkZ | 20-834-292-4707 | ely special foxes are quickly finally ironic p
39922 | Customer#000039922 | 584878.1133999999 | 7321.11 | GERMANY | Zgy4s50l2GKN4pLDPBU8m342gIw6R | 17-147-757-8036 | y final requests. furiously final foxes cajole blithely special platelets. f
6226 | Customer#000006226 | 576783.7606 | 2230.09 | UNITED KINGDOM | 8gPu8,NPGkfyQQ0hcIYUGPIBWc,ybP5g, | 33-657-701-3391 | ending platelets along the express deposits cajole carefully final
922 | Customer#000000922 | 576767.5333 | 3869.25 | GERMANY | Az9RFaut7NkPnc5zSD2PwHgVwr4jRzq | 17-945-916-9648 | luffily fluffy deposits. packages c
147946 | Customer#000147946 | 576455.132 | 2030.13 | ALGERIA | iANyZHjqhyy7Ajah0pTrYyhJ | 10-886-956-3143 | ithely ironic deposits haggle blithely ironic requests. quickly regu
115640 | Customer#000115640 | 569341.1933 | 6436.1 | ARGENTINA | Vtgfia9qI 7EpHgecU1X | 11-411-543-4901 | ost slyly along the patterns; pinto be
73606 | Customer#000073606 | 568656.8578 | 1785.67 | JAPAN | xuR0Tro5yChDfOCrjkd2ol | 22-437-653-6966 | he furiously regular ideas. slowly
110246 | Customer#000110246 | 566842.9814999999 | 7763.35 | VIETNAM | 7KzflgX MDOq7sOkI | 31-943-426-9837 | egular deposits serve blithely above the fl
142549 | Customer#000142549 | 563537.2368000001 | 5085.99 | INDONESIA | ChqEoK43OysjdHbtKCp6dKqjNyvvi9 | 19-955-562-2398 | sleep pending courts. ironic deposits against the carefully unusual platelets cajole carefully express accounts.
146149 | Customer#000146149 | 557254.9864999999 | 1791.55 | ROMANIA | s87fvzFQpU | 29-744-164-6487 | of the slyly silent accounts. quickly final accounts across the
52528 | Customer#000052528 | 556397.3509 | 551.79 | ARGENTINA | NFztyTOR10UOJ | 11-208-192-3205 | deposits hinder. blithely pending asymptotes breach slyly regular re
23431 | Customer#000023431 | 554269.5360000001 | 3381.86 | ROMANIA | HgiV0phqhaIa9aydNoIlb | 29-915-458-2654 | nusual, even instructions: furiously stealthy n
(20 rows)
Time: 30.019428583s
Are you saying you always get results (i.e. no error whatsoever) with set optimizer=off
? that's what you need to determine.
I just ran it 15 times in a row manually with the opt turned on and it worked every time in 4 seconds.
I ran it 10 times in a row manually with the opt turned off and it worked every time in 30 seconds.
So far, I've hit it 4 times when running roachprod run $CLUSTER:4 "./workload run tpch --queries 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 --scale-factor=1 {pgurl:1-3}"
with the optimizer turned on. Maybe its something about how stats work or something about having run in the context of other queries?
For those following along i repro'd this with the opt=off with roachprod run $CLUSTER:4 "./workload run tpch --queries 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 --scale-factor=1 {pgurl:1-3}"
I ran into this problem about three weeks ago when working on https://github.com/cockroachdb/cockroach/pull/34891. I think I was constantly hitting the error on a single node on my laptop.
Turns out that the logging is misleading. The error is with Q11, not Q10. Q11 will always fail because CRDB doesn't allow a float to be multiplied by an int. The fix is simple: explicitly cast the int to a float. I'll open a PR that makes the change to the TPCH workload.
awesome find! Glad this can be closed
Describe the problem
When running TPC-H I received the following error:
But when I reproduce on a new cluster with the exact same setup:
To Reproduce
export CLUSTER=andy-hints roachprod create $CLUSTER -n 2 --gce-machine-type=n1-standard-16 roachprod run $CLUSTER -- 'sudo umount /mnt/data1; sudo mount -o discard,defaults,nobarrier /dev/sdb /mnt/data1/; mount | grep /mnt/data1'
roachprod stage $CLUSTER:1 cockroach roachprod stage $CLUSTER:2 workload roachprod start $CLUSTER:1 roachprod sql $CLUSTER:1 CREATE DATABASE tpch; RESTORE tpch.* FROM 'gs://cockroach-fixtures/workload/tpch/scalefactor=1/backup' WITH into_db = 'tpch'; roachprod adminurl --open $CLUSTER:1 roachprod run $CLUSTER:2 "./workload run tpch --queries 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 --scale-factor=1 {pgurl:1}"
Expected behavior I would expect the query to always work or never work.
Environment: v19.1.0-beta.20190304-135-gd8f7e85