pietermartin / sqlg

TinkerPop graph over sql
MIT License
246 stars 51 forks source link

Add support for read only connection pool #251

Closed pietermartin closed 5 years ago

pietermartin commented 6 years ago

We need to be able to connect to the graph with a read only user. The topology also needs functionality to grant/revoke a user read only access to schemas and tables. Perhaps including some config to auto grant read only access to new schema/tables on creation.

jgustie commented 6 years ago

I hit this as well, in particular where the PostgresDialect.prepareDB method requires owner privileges to set standard_conforming_strings.

mohamedabdelbary commented 6 years ago

Are there plans for this to be added? In our context we would like to be able to share connections for data fetch purposes as we have a limited connection pool

pietermartin commented 6 years ago

Yes there are plans however not on the immediate horizon. It will happen on the 2.0.0 partition branch.

pietermartin commented 6 years ago

@mohamedabdelbary I am about to start on this. Can you explain a bit more what you mean by "share connections for data fetch purposes as we have a limited connection pool"?

mohamedabdelbary commented 6 years ago

@pietermartin Sure. So we use a Postgresql DB to store a graph as node and edge tables. We have a graph traversal Gremlin library that performs data fetch for different traversal types, exposed via the Gremlin Server. We serve other production services using other tables in the same DB. As such, each service is only allowed a limited max connection pool to guard against services potentially using too many connections.

We suspect that because every connection in sqlg is r/w, different callers cannot share connections for read-only purposes (the graph traversal service doesn't need to do any writes), which leads eventually to the server getting stuck as connections are exhausted. The errors we get are generic "Connection reset by peer" errors with traces to Netty, so the r/w aspect is not necessarily the root cause of the issue and it could be something internal to the Gremlin Server, but we suspect it could still be a factor

Here's an example stack trace

[ERROR] HttpGremlinEndpointHandler - Error processing HTTP Request
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:192)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
    at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:221)
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:899)
    at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:275)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
    at java.lang.Thread.run(Thread.java:745)
[WARN] Slf4JLogger - An exception 'java.lang.NullPointerException' [enable DEBUG level for full stacktrace] was thrown by a user handler's exceptionCaught() method while handling the following exception:
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:192)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
    at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:221)
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:899)
    at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:275)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
    at java.lang.Thread.run(Thread.java:745)
[ERROR] HttpGremlinEndpointHandler - Error processing HTTP Request
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:192)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
    at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:221)
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:899)
    at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:275)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
    at java.lang.Thread.run(Thread.java:745)
[WARN] Slf4JLogger - An exception 'java.lang.NullPointerException' [enable DEBUG level for full stacktrace] was thrown by a user handler's exceptionCaught() method while handling the following exception:
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:192)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
    at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:221)
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:899)
    at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:275)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
    at java.lang.Thread.run(Thread.java:745)
pietermartin commented 6 years ago

Ok, so my basic idea is just to make sure SqlgGraph can work with a read only connection. There are some startup sql that currently needs r/w. SqlgGraph will not carry any responsibility for the r/w. It will be up to the db to manage.

So you can start up SqlgGraph with a read only user and set the connection pool max size to some value.

For the Connection reset by peer are you seeing corresponding errors in the postgresql logs? Not seeing anything in the stack trace about jdbc.

mohamedabdelbary commented 6 years ago

Sounds good. We can test off your development branch when you have a first version. On the logs question, I don't recall if we had any corresponding errors in postgresql, will need to check with other members in the team and get back to you

pietermartin commented 6 years ago

Not much to do. Only the prepareDb @jgustie mentioned. The TestReadOnlyRole class shows how one can use the TopologyListener to grant privileges as schemas or tables are created.

pietermartin commented 6 years ago

This is working now. A separate SqlgGraph that points to a read-only connection pool needs to be used.

mohamedabdelbary commented 6 years ago

@pietermartin that's great thanks. So to test this on our end do we use the partitioning branch?

pietermartin commented 6 years ago

Yes, the partition branch is ready, just need to do more testing of upgrading existing db(s). So don't not use it yet for production.

mohamedabdelbary commented 5 years ago

@pietermartin was this merged into master? and if so which release is it on?

pietermartin commented 5 years ago

Yes it was merged into master. Its on 2.0.0

pietermartin commented 5 years ago

Maybe this will help a bit. It is how I create a read only user on Postgresql

CREATE ROLE "cm_read_only" WITH LOGIN PASSWORD 'cm_read_only';
REVOKE CREATE ON SCHEMA public FROM "cm_read_only";
GRANT USAGE ON SCHEMA "R_OSBC" TO "cm_read_only";
GRANT SELECT ON ALL TABLES IN SCHEMA "R_OSBC" TO "cm_read_only";
mohamedabdelbary commented 5 years ago

@pietermartin so I started testing against this, and I'm getting some errors on startup of the graph with a read/only user. The root cause seems to be that the user still needs some write access to the sqlg_schema schema.

e.g. trace

Caused by: org.postgresql.util.PSQLException: ERROR: permission denied for schema sqlg_schema
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2433)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2178)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:306)
    at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441)
    at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365)
    at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:307)
    at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:293)
    at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:270)
    at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:266)
    at com.mchange.v2.c3p0.impl.NewProxyStatement.execute(NewProxyStatement.java:75)
    at org.umlg.sqlg.structure.SqlgStartupManager.createOrUpdateGraph(SqlgStartupManager.java:187)

So is it the case that the r/o user for the data schema still needs write access to the sqlg_schema?

pietermartin commented 5 years ago

No the r/o user should not need write access to sqlg_schema. I worked on that createOrUpdateGraph fairly recently and probably broke it. I'll have a look and also look at creating r/o test cases as this should be covered by the tests.

pietermartin commented 5 years ago

It looks like you are starting up new version of SqlgGraph. The code that is failing is where Sqlg is trying to update/upgrade its own meta data tables in sqlg_schema. In your case it is trying to add a sqlg_schema.V_Graph table.

You will have to start SqlgGraph with a r/w user to do the upgrade, and then restart with the readOnly user.

Below is the code that shows where Sqlg is trying to upgrade itself.

try (ResultSet vertexRs = metadata.getTables(null, Schema.SQLG_SCHEMA, Topology.VERTEX_PREFIX + Topology.GRAPH, types)) {
    if (!vertexRs.next()) {
        try (Statement statement = conn.createStatement()) {
            String sql = this.sqlDialect.sqlgCreateTopologyGraph();
            statement.execute(sql);  //line 187
            TopologyManager.addGraph(this.sqlgGraph, version);
            oldVersion = version;
        }
        ...
pietermartin commented 5 years ago

Open this issue if the problem persist.