Closed Algunenano closed 6 years ago
Referencing https://github.com/CartoDB/cartodb-postgresql/issues/300 to give a bit more of context.
cc'ing @andy-esch and @jgoizueta for awareness
Thanks @rafatower. From Paul's comment in CartoDB/cartodb-postgresql#300:
Almost everything can be marked parallel. If it doesn't read from (or write to) a table, it can be marked.
Some of the PL/Python functions internally use plpy.execute('select ....')
to gather data from tables. For example, CDB_AreasOfInterestLocal
relies on this here. Based on that comment, are they still eligible to be marked parallel?
Technically speaking, reading shouldn't be an issue as long as you are sure you are only reading. For example, in the PR (#183) I've marked for now a lot of python functions as unsafe because they execute an external subquery that we don't know anything about. For example, you could pass CDB_AreasOfInterestGlobal something like this as the first parameter:
WITH t AS (
UPDATE products SET price = price * 1.05
RETURNING *
)
SELECT * FROM t;
This is not parallelizable (and not STABLE either). One possible solution is to also provide a signature that receives directly the data instead of the query; another one is to blindly trust the user and hope it sends a PARALLEL SAFE subquery.
Deploying 0.6.1 to test in Staging. There was a couple of issues with the servers (https://github.com/CartoDB/cartodb-platform/issues/3737) but I've updated a single user and it seems to be working fine:
cartodb_staging_user_9824ba31-2cbc-4be5-b723-9bde90d5b883_db=# \df+ cdb_crankshaft.cdb_crankshaft_version
List of functions
Schema | Name | Result data type | Argument data types | Type | Volatility | Parallel | Owner | Security | Access privileges | Language | Source code | Description
----------------+------------------------+------------------+---------------------+--------+------------+----------+----------+----------+-------------------+----------+-------------------------+-------------
cdb_crankshaft | cdb_crankshaft_version | text | | normal | immutable | safe | postgres | invoker | | sql | +|
| | | | | | | | | | | SELECT '0.6.1'::text;+|
| | | | | | | | | | | |
(1 row)
cartodb_staging_user_9824ba31-2cbc-4be5-b723-9bde90d5b883_db=# Select cdb_crankshaft.cdb_crankshaft_version();
cdb_crankshaft_version
------------------------
0.6.1
(1 row
Once the issue is resolved I'll update all users in staging and, if no issues appear, deploy to production.
cc/ @rafatower
As https://github.com/CartoDB/cartodb-postgresql/issues/300, with pg9.6+ there is the option to mark functions as PARALLEL (different levels) so the analyzer can decide to run the query in several processes simultaneously. In a similar fashion, marking the VOLATILE category enables optimizations to avoid re-running queries (or some parts of a query).
Should maintain compatibility with previous PostgreSQL releases (at least for now).