Documentation comments for sandboxed and unsandboxed grants in pljava.policy inverted

kurtraschke commented 2 months ago

In pljava.policy, the comments describing the grants for sandboxed and unsandboxed environments appear to be reversed.

That is, the comment beginning "This grant defines the mapping onto Java of PostgreSQL's 'trusted language'..." appears before the grant for org.postgresql.pljava.PLPrincipal$Sandboxed, and the comment beginning "This grant defines the mapping onto Java of PostgreSQL's 'untrusted language'..." appears before the grant for org.postgresql.pljava.PLPrincipal$Unsandboxed.

These appear to be backwards (i.e. untrusted, or javau is the sandboxed environment, and trusted, or java is the unsandboxed environment).

Observed with PL/Java 1.6.7 on Postgres 16.2 (both from the official Postgres Debian package repository).

jcflack commented 2 months ago

Thanks for the report. The comments are correct as written; what's tricky is that the way PostgreSQL uses the terms 'trusted' and 'untrusted' is about equally easy to read in the intended way, or in the exact opposite way!

The way PostgreSQL uses those terms, trusted is used for a language that non-superusers can be allowed to write routines in, because the language will prevent those routines from doing Bad Things. They will be confined (sandboxed) to doing only a limited range of safe things.

An 'untrusted' language is one that does not impose such limits on what a routine can do. Because a routine in an 'untrusted' language could do Bad Things, only superusers can be allowed to create routines in that language.

So a "trusted" language is a language you can "trust" (because it is sandboxed), and therefore you can make it available to users you don't (have to) "trust" (as much). And an "untrusted" language is a language you don't "trust" (because it isn't so well sandboxed), so you have to only make it available to superusers whom you really "trust".

If you are not an old PostgreSQL hand, you might think a "trusted" language is one that might be allowed to do Bad Things (because you "trust" it), and an "untrusted" language is the one that wouldn't be allowed to (because you don't "trust" it). But that's the reverse of the way PostgreSQL uses the terms.

There is a little paragraph in the documentation touching on that, and explaining why PL/Java favors the terms sandboxed/unsandboxed instead, because they are less prone to being accidentally read backwards.

Of course even that nomenclature is approximate because PL/Java in fact uses the Java sandbox in both cases, but just grants a somewhat wider set of default permissions to PLPrincipal$Unsandboxed (and even allows you to create other language aliases besides the original two java and javaU, and assign different permission sets to those).

The only difference in the default policy is that PLPrincipal$Unsandboxed is allowed full access to the filesystem, and in PLPrincipal$Sandboxed that access is blocked.

kurtraschke commented 2 months ago

Thanks, that's...baffling. I mean, I get it (and it's not PL/Java's fault), but that's horrendously counterintuitive. (Also I will admit to having read past that note in the docs, despite having been on that page while working on another issue...)

Would you accept a PR to improve the "at-the-point-of-use" documentation in pljava.policy? Some blindingly obvious comments to the effect of "THIS IS java" and "THIS IS javau" would be most welcome and avoid semantic confusion over trusted/untrusted, sandboxed/unsandboxed, and so on.

In the same vein, I suspect a table like the following, at the top of the docs page you linked (and perhaps a bit more prominent than the existing paragraph) would help:

Default PL/Java language name	Trusted?	Sandboxed?
`java`	Trusted (meaning can be used by non-superusers)	Sandboxed
`javau`	Untrusted (meaning only for superusers)	Unsandboxed

I realize this all goes out the window once folks start creating language aliases, etc., but in the context of "in another 30 seconds I'm going to go look for a 'big hammer' option to disable the Security Manager entirely..." it would help to have a clearer pointer on how things work in the default configuration.

jcflack commented 2 months ago

I would consider such a PR. I might prefer the added comments to be more like "such as 'java'" and "such as 'javaU'", and the column headings in the table to be "PostgreSQL term" and "PL/Java term". (The table is simple enough for the reader to see at a glance one column is trusted/untrusted and one is sandboxed/unsandboxed anyway.) And I'd probably put it right after the existing explanatory paragraph.

The grant clauses for PLPrincipal$Sandboxed and PLPrincipal$Unsandboxed are not only for java and javaU; if you create another alias, you still specify which of the two types it is, and it still enjoys all the permissions from that corresponding grant clause, as well as any grant you might add for that specific alias.

kurtraschke commented 2 months ago

Thanks - yes, I was deliberately being a bit brutish in my examples, but it could be put more delicately. Once I've tidied up the project I am working on now, I'll try to circle back to this.

jcflack commented 2 months ago

Yes, delicate is good, as after all, it's not as if the PostgreSQL usage of the terms doesn't make sense. It does make sense. It's just one of those near-perfect examples of English that can be read the way you mean or the opposite way and both make sense.

jcflack commented 1 month ago

Another example of how easy it is for trusted/untrusted to have their meanings not just sort of blurred but reversed:

'trusted'/'untrusted' PL in DoD/DISA PostgreSQL STIGs

tada / pljava

Documentation comments for sandboxed and unsandboxed grants in pljava.policy inverted #483