tada / pljava

PL/Java is a free add-on module that brings Java™ Stored Procedures, Triggers, Functions, Aggregates, Operators, Types, etc., to the PostgreSQL™ backend.
http://tada.github.io/pljava/
Other
242 stars 77 forks source link

Idea: demonstration of using pl/java with "TestContainers" #432

Open beargiles opened 1 year ago

beargiles commented 1 year ago

This was mentioned in a recent issue and I wanted to create an issue to track my work on it.

The basic idea is that any pljava implementations should be tested with TAP. That won't change.

However nobody writes an implementation for fun - they write them because they perceive a benefit for running some code on the database server instead of in their application. For instance a task that reads a lot of data but produces a fairly modest output would be a good candidate since it would significantly reduce the required network bandwidth. Doing it on the server could provide significantly better performance while reducing network bandwidth costs.

Another example would be minimizing the exposure of sensitive information. Given proper safeguards at the SQL level (e.g., using a separate schema for the sensitive information and only accessing it via stored procedures with 'AUTHOR' permissions) it's much safer to do the work on the database server than for the application to retrieve the sensitive information. Even if it's encrypted in flight (and the encryption isn't compromised) the application will still need to decrypt it at some point and that risks exposure via the usual JVM and system attacks.

The TAP tests can verify that implementation is providing the expected results. For those tests.

However this doesn't say anything about the application. The application's queries may be flawed, or the permissions may not be set up properly, etc., For that you need functional or integration tests.

The gold standard for functional tests in java is TestContainers and a docker image containing an accurate representation of the targetted server. The application - or more precisely the appropriate abstraction layer for persistent storage - can be tested in realistic conditions. Using docker images means that your tests start in a known state and nothing you do will have permanent effects.

This is why I created postgresql-pljava-docker and docker hub: beargiles/pljava - it's a drop-in replacement for the official PostgreSQL images but already has pl/java installed and configured. (See below)

The docker container even runs a short test to verify the proper functioning but (sigh) it's not yet providing any warning that that test has failed.

The developer still needs to include a small bit of code (to load the implementation jar) when setting up the TestContainer, but it would be trivial to create a Dockerfile that extends these images and does nothing but include that bit of code. The latter would be the best practice... I should include an example of it.

Adding a simple example of this to that repo has been on my list for awhile. I'm currently updating the project so this is a good time to do that. This ticket is mostly to provide a starting place for anyone who wants to set up these tests. (I know - it should also be added to the pljava documentation!)

"Below"

The current implementation was just a first attempt at creating something reproducible. I knew it had serious limitations, not least of which is that it required manual updates and that never gets done.

I'm currently refactoring it so use github CI/CD so that it can keep current with the PostgreSQL docker images. The main obstacle at the moment is that I would prefer to tag the images with the full version of PostgreSQL, e.g., "15.3", but for proper automatic updates I should only use the major version, e.g., "15". (You will always be able to specify "latest" or "14", but I would prefer to include the minor version as well since that will make it easier for people to see if there's been an update.) I'm getting pretty close to the "something is better than nothing" point though...

You're welcome to adopt the project once I've finished this update. In fact it would make a lot of sense since everyone is always asking if there are any existing docker images with pl/java. You have a list but they often use outdated versions of PostgreSQL and/or pljava, and that assumes they can even find the list even though it's clearly listed.

beargiles commented 1 year ago

I'm close to pushing something - it's at coyotesong/postgresql-pgxn-docker. It's currently on the "copy-existing" branch but it will probably be merged into "main" by the time you read this.

It's a maven project that can build custom docker images AND use them in integration tests via TestContainers.

Note: the docker bits are copied from my existing beargiles/postgres-pljava-docker repo, but the docker images can now be built using maven instead of using a custom script. That should allow us to use this approach when developing pl/java extensions, not just applications that use a pl/java-aware docker image.