citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0
10.64k stars 670 forks source link

non-IMMUTABLE functions are not allowed in CASE or COALESCE statements #1313

Open trasa opened 7 years ago

trasa commented 7 years ago

Executing this SQL gives an error on a sharded database (table is distributed on root_id):

UPDATE sample_table set amount = 50
WHERE id = '2a10f3d8-e492-4582-9d35-f731e91fc937'
and root_id = 1
AND now() between coalesce(start_date, '1-1-1900') and coalesce(end_date, now() + interval '1' day)
ERROR:  non-IMMUTABLE functions are not allowed in CASE or COALESCE statements

If I run this on a plain old PSQL instance, it executes just fine. There's probably better ways to determine if now is between start and end (or if they're null), of course. Though it does seem like a common pattern.

lithp commented 7 years ago

Hello there, this is my fault!

non-IMMUTABLE functions could potentially have side-effects. A function such as nextval(), for instance, uses and changes state which is local to the node. The problem is that Citus runs across multiple nodes. If you're using a replication factor of 2, which of those nodes should nextval() run on? What do you do when the shards are moved to different nodes?

To solve this, we run all non-IMMUTABLE functions on the coordinator node and rewrite the query such that the worker nodes only see constants or IMMUTABLE functions. In the query you provided: coalesce(end_date, now() + interval '1' day), whether or not we run the function depends on the value of a column which the coordinator has no way of seeing! We can't always run the function, that would break the contract coalesce provides. We could potentially whitelist now(), but in general the query you provided is impossible to run without adding lot more coordination (and therefore loss of speed).

I've tried to come up with an alternative for you which could run in one query but failed because we don't support CTEs or sub-queries inside of DML. I'll ping someone from our support team and hopefully they'll reach out to you soon.

byucesoy commented 7 years ago

Hey @trasa,

I hope @lithp's explanations satisfied you about why we do not directly support such queries. I also come up with a workaround. Would such query work for you?

UPDATE
    sample_table
SET
    amount = 50
WHERE
    id = '2a10f3d8-e492-4582-9d35-f731e91fc937' AND
    root_id = 1 AND
    now() BETWEEN coalesce(start_date, '1-1-1900') AND coalesce(end_date, 'infinity');
trasa commented 7 years ago

Thanks for the indepth explanation @lithp !! that does explain what's going on and why it must be that way. @byucesoy yes, the 'infinity' workaround is perfect and perhaps a more correct solution.

thanks!

byucesoy commented 7 years ago

@trasa I'm glad that workaround worked for you. If you use 1-1-1900 as an arbitrary old date, you can also use -infinitiy instead. I did not use it in the comment above in case it has special meaning for your application.

trasa commented 7 years ago

Yes, -infinity would be a more appropriate choice. Thank you!

On Mon, Apr 10, 2017 at 1:46 AM, Burak Yücesoy notifications@github.com wrote:

@trasa https://github.com/trasa I'm glad that workaround worked for you. If you use 1-1-1900 as an arbitrary old date, you can also use -infinitiy instead. I did not use it in the comment above in case it has special meaning for your application.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/citusdata/citus/issues/1313#issuecomment-292874123, or mute the thread https://github.com/notifications/unsubscribe-auth/ABDSIOz8a8D0H1E-S694VoZB6H1jaILsks5rud5MgaJpZM4M2RTm .

zaety777 commented 4 years ago

I have the same problem. Please give me some advice.

UPDATE
    ss_prod p   SET
    poplr_dgr = case PROD_TP_CD when 'M01004' then ('-'|| r.poplr_dgr )::bigint else r.poplr_dgr::bigint end,
    poplr_dgr_strm = case PROD_TP_CD when 'M01004' then ('-'|| r.poplr_dgr_strm )::bigint else r.poplr_dgr_strm::bigint end,
    poplr_dgr_lngtm = case PROD_TP_CD when 'M01004' then ('-'|| r.poplr_dgr_lngtm )::bigint else r.poplr_dgr_lngtm::bigint end   FROM (
    SELECT
     SPLIT_PART(x._data, E'\\t', 1)::bigint nv_mid,
     SPLIT_PART(x._data, E'\\t', 2)::bigint poplr_dgr_strm,
     SPLIT_PART(x._data, E'\\t', 2)::bigint poplr_dgr,
     SPLIT_PART(x._data, E'\\t', 4)::bigint poplr_dgr_lngtm
    FROM ( SELECT UNNEST(STRING_TO_ARRAY('112\t451\t345\t234§4235\t634567\t3457354\t67356§7546\t745675\t264562\t4356§1132\t4251\t3245\t2234§42335\t6344567\t34557354\t667356', CHr(167) ) )  AS _data ) x
    ) r   WHERE p.nv_mid = r.nv_mid   and p.nv_mid   IN(
    '12341234'::bigint   ,
    '1234123412'::bigint   ,
    '1234123'::bigint   ,
    '12341234'::bigint   ,
    '12341234'::bigint   );

SQL Error [0A000]: ERROR: non-IMMUTABLE functions are not allowed in CASE or COALESCE statements
zaety777 commented 4 years ago

I've solved it. Thank you.^^

UPDATE
    ss_prod p   SET
    poplr_dgr = case PROD_TP_CD when 'M01004' then -1 else 1 end * r.poplr_dgr,
    poplr_dgr_strm = case PROD_TP_CD when 'M01004' then -1 else  1 end * r.poplr_dgr_strm,
    poplr_dgr_lngtm = case PROD_TP_CD when 'M01004' then -1 else 1 end * r.poplr_dgr_lngtm
FROM (
    SELECT
     SPLIT_PART(x._data, E'\\t', 1)::bigint nv_mid,
     SPLIT_PART(x._data, E'\\t', 2)::bigint poplr_dgr_strm,
     SPLIT_PART(x._data, E'\\t', 2)::bigint poplr_dgr,
     SPLIT_PART(x._data, E'\\t', 4)::bigint poplr_dgr_lngtm
    FROM ( SELECT UNNEST(STRING_TO_ARRAY('112\t451\t345\t234§4235\t634567\t3457354\t67356§7546\t745675\t264562\t4356§1132\t4251\t3245\t2234§42335\t6344567\t34557354\t667356', CHr(167) ) )  AS _data ) x
    ) r   WHERE p.nv_mid = r.nv_mid   and p.nv_mid   IN(
    '12341234'::bigint   ,
    '1234123412'::bigint   ,
    '1234123'::bigint   ,
    '12341234'::bigint   ,
    '12341234'::bigint   )