Open trasa opened 7 years ago
Hello there, this is my fault!
non-IMMUTABLE functions could potentially have side-effects. A function such as nextval()
, for instance, uses and changes state which is local to the node. The problem is that Citus runs across multiple nodes. If you're using a replication factor of 2, which of those nodes should nextval()
run on? What do you do when the shards are moved to different nodes?
To solve this, we run all non-IMMUTABLE functions on the coordinator node and rewrite the query such that the worker nodes only see constants or IMMUTABLE functions. In the query you provided: coalesce(end_date, now() + interval '1' day)
, whether or not we run the function depends on the value of a column which the coordinator has no way of seeing! We can't always run the function, that would break the contract coalesce provides. We could potentially whitelist now()
, but in general the query you provided is impossible to run without adding lot more coordination (and therefore loss of speed).
I've tried to come up with an alternative for you which could run in one query but failed because we don't support CTEs or sub-queries inside of DML. I'll ping someone from our support team and hopefully they'll reach out to you soon.
Hey @trasa,
I hope @lithp's explanations satisfied you about why we do not directly support such queries. I also come up with a workaround. Would such query work for you?
UPDATE
sample_table
SET
amount = 50
WHERE
id = '2a10f3d8-e492-4582-9d35-f731e91fc937' AND
root_id = 1 AND
now() BETWEEN coalesce(start_date, '1-1-1900') AND coalesce(end_date, 'infinity');
Thanks for the indepth explanation @lithp !! that does explain what's going on and why it must be that way. @byucesoy yes, the 'infinity' workaround is perfect and perhaps a more correct solution.
thanks!
@trasa I'm glad that workaround worked for you. If you use 1-1-1900 as an arbitrary old date, you can also use -infinitiy instead. I did not use it in the comment above in case it has special meaning for your application.
Yes, -infinity would be a more appropriate choice. Thank you!
On Mon, Apr 10, 2017 at 1:46 AM, Burak Yücesoy notifications@github.com wrote:
@trasa https://github.com/trasa I'm glad that workaround worked for you. If you use 1-1-1900 as an arbitrary old date, you can also use -infinitiy instead. I did not use it in the comment above in case it has special meaning for your application.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/citusdata/citus/issues/1313#issuecomment-292874123, or mute the thread https://github.com/notifications/unsubscribe-auth/ABDSIOz8a8D0H1E-S694VoZB6H1jaILsks5rud5MgaJpZM4M2RTm .
I have the same problem. Please give me some advice.
UPDATE
ss_prod p SET
poplr_dgr = case PROD_TP_CD when 'M01004' then ('-'|| r.poplr_dgr )::bigint else r.poplr_dgr::bigint end,
poplr_dgr_strm = case PROD_TP_CD when 'M01004' then ('-'|| r.poplr_dgr_strm )::bigint else r.poplr_dgr_strm::bigint end,
poplr_dgr_lngtm = case PROD_TP_CD when 'M01004' then ('-'|| r.poplr_dgr_lngtm )::bigint else r.poplr_dgr_lngtm::bigint end FROM (
SELECT
SPLIT_PART(x._data, E'\\t', 1)::bigint nv_mid,
SPLIT_PART(x._data, E'\\t', 2)::bigint poplr_dgr_strm,
SPLIT_PART(x._data, E'\\t', 2)::bigint poplr_dgr,
SPLIT_PART(x._data, E'\\t', 4)::bigint poplr_dgr_lngtm
FROM ( SELECT UNNEST(STRING_TO_ARRAY('112\t451\t345\t234§4235\t634567\t3457354\t67356§7546\t745675\t264562\t4356§1132\t4251\t3245\t2234§42335\t6344567\t34557354\t667356', CHr(167) ) ) AS _data ) x
) r WHERE p.nv_mid = r.nv_mid and p.nv_mid IN(
'12341234'::bigint ,
'1234123412'::bigint ,
'1234123'::bigint ,
'12341234'::bigint ,
'12341234'::bigint );
SQL Error [0A000]: ERROR: non-IMMUTABLE functions are not allowed in CASE or COALESCE statements
I've solved it. Thank you.^^
UPDATE
ss_prod p SET
poplr_dgr = case PROD_TP_CD when 'M01004' then -1 else 1 end * r.poplr_dgr,
poplr_dgr_strm = case PROD_TP_CD when 'M01004' then -1 else 1 end * r.poplr_dgr_strm,
poplr_dgr_lngtm = case PROD_TP_CD when 'M01004' then -1 else 1 end * r.poplr_dgr_lngtm
FROM (
SELECT
SPLIT_PART(x._data, E'\\t', 1)::bigint nv_mid,
SPLIT_PART(x._data, E'\\t', 2)::bigint poplr_dgr_strm,
SPLIT_PART(x._data, E'\\t', 2)::bigint poplr_dgr,
SPLIT_PART(x._data, E'\\t', 4)::bigint poplr_dgr_lngtm
FROM ( SELECT UNNEST(STRING_TO_ARRAY('112\t451\t345\t234§4235\t634567\t3457354\t67356§7546\t745675\t264562\t4356§1132\t4251\t3245\t2234§42335\t6344567\t34557354\t667356', CHr(167) ) ) AS _data ) x
) r WHERE p.nv_mid = r.nv_mid and p.nv_mid IN(
'12341234'::bigint ,
'1234123412'::bigint ,
'1234123'::bigint ,
'12341234'::bigint ,
'12341234'::bigint )
Executing this SQL gives an error on a sharded database (table is distributed on root_id):
If I run this on a plain old PSQL instance, it executes just fine. There's probably better ways to determine if now is between start and end (or if they're null), of course. Though it does seem like a common pattern.