Closed wildtayne closed 6 years ago
Thanks @srrollo for making the changes. I have the following observations.
General:
public
all lower case instead of capitalizing the first letter, simply because people are used to seeing that particular schema's name in lower case. That said, I think we should continue to use PUBLIC
all upper case in grant/revoke statements.ClassDB.IDNameDomain
NAME
type needs to be cast to ClassDB.IDNameDomain
only in a function call and no where else. For example, the cast on L45 is not required. (I wish I could codify the casting behavior.)ALTER
statements in this script can fit in just one line: In general, I recommend placing code in as few lines as possible without sacrificing clarity.PUBLIC
(for example, L61-L63) is not necessary because the grant being made is the default.SECURITY INVOKER
and let the DBMS accept/reject the query on INFORMATION_SCHEMA
. No permission check such as the one on L45-L48 is needed.Function listTables
:
NULL
is the better default value for parameter schemaName
, and then use ClassDB.getSchemaName(SESSION_USER::ClassDB.IDNameDomain)
to get schema name if schemaName
is NULL
ClassDB.foldPgID
on the RHS value.Function describe(tableName)
:
describe(schemaName, tableName)
SESSION_USER
must be cast to ClassDB.IDNameDomain
.Question: Could the SET LOCAL
statement on L28 be moved up to the first executable line in this script? If yes, I think we should make this change (in all scripts as we touch them) so we don't inadvertently add some code above that causes notices we want suppressed.
Nice to have: It will be nice to have the 1-arg describe
function handle schema-qualified table names, for example, pvfc.customer_t
. We have discussed this possibility in the past, but we should not implement it in M2. Instead, maybe we should add an issue so it gets attention in a later release. (It will make a perfect issue for new contributors to cut their teeth.)
On second thought, the default value of parameter schemaName
in function listTables
should be CURRENT_SCHEMA
. It should also coalesce to CURRENT_SCHEMA
.
Likewise, the 1-arg describe
function should lookup the table in CURRENT_SCHEMA
.
Love to hear your thoughts.
Thanks @smurths for the review. The new commits up to 5c57674 address most of the points raised. I have some more thoughts below:
SECURITY DEFINER
, since they need to execute functions in the ClassDB
schema. For example, the function:
CREATE OR REPLACE FUNCTION public.permissionTest()
RETURNS VARCHAR(63) AS
$$
BEGIN
SELECT ClassDB.foldPgID('MYId');
END;
$$ LANGUAGE plpgsql;
ALTER FUNCTION public.permissionTest() OWNER TO ClassDB;
Returns the error `ERROR: permission denied for schema classdb` when executed as a student. This is also why the schema access check is required. I do have some other ideas on how to solve this problem if this solution has significant issues, namely either creating a proxy function `public.foldPgID()` for students, or granting `USAGE` on `ClassDB` to students, and only giving them execute on a few functions. I like the proxy function solution, but we may have to write a few, since it seems like we will at least want `ClassDB.getSchemaName()` as well.
- I think the best default parameter would be to use `ClassDB.getSchemaName(CURRENT_USER::ClassDB.IDNameDomain)`, and `COALESCE` the parameter with `ClassDB.getSchemaName(CURRENT_USER::ClassDB.IDNameDomain)` as well. It seems this would avoid the situation where many students are able to just use `public.listTables()` because their schema names match their user names, but one student has to remember to input their non-matching schema name. I imagine that could be pretty confusing to the students.
- `SET LOCAL` can only be [used in a transaction](https://www.postgresql.org/docs/9.6/static/sql-set.html), so it has to appear below `START TRANSACTION`. I recall there being issues with using `SET SESSION` for the same purpose, but I don't remember the details. Maybe @afig can weigh in on this.
Thanks @srrollo for the analysis.
I feel the first thing to check is what happens if a student user directly queries info schema to list and describe tables. Any solution we provide will depend on that behavior because we simply want to provide an abstraction to those direct queries.
It appears that row-level access control is used to prevent users from accessing info schema metadata about objects they do not have permission to access. For example, I have several student accounts on my test installation. If I query INFORMATION_SCHEMA.SCHEMATA
as student01
, only the schemas, student01
, public
, INFORMATION_SCHEMA
, and pg_catalog
are shown. And if I query INFORMATION_SCHEMA.TABLES
, only tables from those schemas are shown. However, if I wrap the query in a function owned by ClassDB
with SECURITY DEFINER
, the student can use the function to access metadata about objects in other schemas, such as student02
.
Thanks @srrollo for confirming what I was suspecting when I wrote earlier "let the DBMS accept/reject the query on INFORMATION_SCHEMA".
I believe the following:
SECURITY INVOKER
CURRENT_SCHEMA
(technically, the default should respect the schema search path, but that is for another day)foldPgID
should be duplicated in public
schema with a clear note in both places that duplicate definitions exist and that any maintenance should be done at both placesFor the next release, we should think about both the utility and the placement of these functions.
Thinking about it more, I think the plan by @smurthys is a better solution. Replacing built-in security checks with custom ones is never a good idea.
I've addressed most of the issues discussed in this thread and our discussion.
One final issue I have is that students also won't be able to access ClassDB.IDNameDomain
since the functions are now SECURITY INVOKER
. Should I switch the functions back to using VARCHAR(63), or duplicate ClassDB.IDNameDomain
? Perhaps it makes sense to have ClassDB.IDNameDomain
in public anyway.
Ah, yes. We have to use VARCHAR(63)
here.
OK, I've reverted to VARCHAR(63)
.
I have the following observations on commit e9804d2:
foldPgID
needs RETURN NULL ON NULL INPUT
NULL
to to CURRENT_SCHEMA
current user's schema
isn't quite right. I believe it should say invoker's current schema
.ClassDB.FoldPgID
is incorrect.VARCHAR
necessary? especially to VARCHAR(63)
?For some reason, the function call on L101 will not coerce NAME
to VARCHAR
. I get the following error at compile time:
function public.describe(name, character varying) does not exist
It seems Postgres tries to call using NAME
instead of coercing to VARCHAR
. I did the cast to VARCHAR(63)
since public.describe()
takes VARCHAR(63)
, but just VARCHAR
works as well.
I think this is related to the issue with coercing NAME
to ClassDB.IDNameDomain
.
I've just pushed the commits that address the latest points brought up by @smurthys.
Overall looks great, student users can access these functions correctly without having to specify a schema in my testing. One possible overall improvement is better informational output, but that's clearly for another PR and milestone.
Just a couple of minor observations (most of the issues behind these notes were pre-existing):
START TRANSACTION
instead of BEGIN TRANSACTION
to begin a transaction (they are functionally equivalent)Thanks @afig for the feedback. I've fixed the issues pointed out. I think for now, I feel better leaving your attribution as-is, since you did write pgFoldID()
, and you do have some miscellaneous contributions to this file in the commit history.
The changes look good. I like how folding and coalescing are combined on L66.
BTW, it seems the folding on L102 is not required because that value gets folded in the 2-arg describe
function.
Related to this PR only because of the duplicated foldPgID
: I wonder if foldPgID
should TRIM
its parameter prior to SELECT
so the callers don't have to:
$1 = TRIM($1);
should doplpgsql
SELECT
becomes RETURN
I ask this question because the result of foldPgID
is almost always used in an equality predicate, and clearly Postgres trims non-quoted identifiers.
Thanks @smurthys for the comments. I've pushed a commit removing the unnecessary call to foldPgID()
. I agree that TRIM
can be used in foldPgID()
, however we may want to address it in another branch so we can double check the solution and make sure pgFoldID()
is updated in both places.
This PR contains commits that fix #110. The catalog management functions are now owned by
ClassDB
, so they can execute all helper functions in theClassDB
schema, includingClassDB.foldPgId()
. Additionally, checks have been added to ensure users cannot access objects in schemas they do not own with these functions, andPublic.describe(VARCHAR)
now gets the user's schema usingClassDB.getSchemaName()
.I would particularly like to know if
ClassDB.IDNameDomain
should be used overVARCHAR
in these functions, and if I am missing any casts when using functions that useClassDB.IDNameDomain
.Fixes #110