elixir-ecto / postgrex

PostgreSQL driver for Elixir
http://hexdocs.pm/postgrex/
1.12k stars 275 forks source link

PostGrex sends a System Table Info request on connect which is crashing ArcadeDB #718

Closed jonmdev closed 1 week ago

jonmdev commented 1 week ago

Elixir version

IEx 1.17.3 (compiled with Erlang/OTP 27)

Database and Version

ArcadeDB current

Postgrex Version

{:postgrex, "~> 0.19.3"}

Current behavior

I posted a bug report here regarding attempting to use this in generic connection to ArcadeDB which supports PostGres connection protocol.

Error

The connection attempt is crashing with the message:

2024-11-17 02:12:54.214 INFO  [ArcadeDBServer] <ArcadeDB_0> ArcadeDB Server started in 'development' mode (CPUs=8 MAXRAM=7.80GB)
2024-11-17 02:12:54.232 INFO  [ArcadeDBServer] <ArcadeDB_0> Studio web tool available at http://172.30.97.77:2480
2024-11-17 02:46:37.916 INFO  [PostgresNetworkExecutor] PSQL:-> request for password (R - 8b) (thread=36)
2024-11-17 02:46:38.009 INFO  [PostgresNetworkExecutor] PSQL:-> authentication ok (R - 8b) (thread=36)
2024-11-17 02:46:38.010 INFO  [PostgresNetworkExecutor] PSQL:-> backend key data (K - 12b) (thread=36)
2024-11-17 02:46:38.013 INFO  [PostgresNetworkExecutor] PSQL:-> parameter status (S - 24b) (thread=36)
2024-11-17 02:46:38.014 INFO  [PostgresNetworkExecutor] PSQL:-> parameter status (S - 25b) (thread=36)
2024-11-17 02:46:38.015 INFO  [PostgresNetworkExecutor] PSQL:-> parameter status (S - 25b) (thread=36)
2024-11-17 02:46:38.016 INFO  [PostgresNetworkExecutor] PSQL:-> ready for query (Z - 5b) (thread=36)
2024-11-17 02:46:38.055 INFO  [PostgresNetworkExecutor] PSQL: query -> SELECT t.oid, t.typname, t.typsend, t.typreceive, t.typoutput, t.typinput,
       coalesce(d.typelem, t.typelem), coalesce(r.rngsubtype, 0), ARRAY (
  SELECT a.atttypid
  FROM pg_attribute AS a
  WHERE a.attrelid = t.typrelid AND a.attnum > 0 AND NOT a.attisdropped
  ORDER BY a.attnum
)

FROM pg_type AS t
LEFT JOIN pg_type AS d ON t.typbasetype = d.oid
LEFT JOIN pg_range AS r ON r.rngtypid = t.oid OR (t.typbasetype <> 0 AND r.rngtypid = t.typbasetype)
WHERE (t.typrelid = 0)
AND (t.typelem = 0 OR NOT EXISTS (SELECT 1 FROM pg_catalog.pg_type s WHERE s.typrelid != 0 AND s.oid = t.typelem)) (thread=36)
2024-11-17 02:46:38.142 INFO  [PostgresNetworkExecutor] PSQL:-> ready for query (Z - 5b) (thread=36)
2024-11-17 02:46:38.182 INFO  [PostgresNetworkExecutor] PSQL: Received cancel request pid 0
2024-11-17 02:46:38.184 INFO  [PostgresNetworkExecutor] PSQL: Canceling session 0Error on reading any message: Socket closed
com.arcadedb.postgres.PostgresProtocolException: Error on reading any message: Socket closed
        at com.arcadedb.postgres.PostgresNetworkExecutor.readMessage(PostgresNetworkExecutor.java:1077)
        at com.arcadedb.postgres.PostgresNetworkExecutor.run(PostgresNetworkExecutor.java:149)
Caused by: java.net.SocketException: Socket closed
        at java.base/java.net.SocketInputStream.socketRead0(Native Method)
        at java.base/java.net.SocketInputStream.socketRead(SocketInputStream.java:115)
        at java.base/java.net.SocketInputStream.read(SocketInputStream.java:168)
        at java.base/java.net.SocketInputStream.read(SocketInputStream.java:140)
        at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:252)
        at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:271)
        at java.base/java.io.DataInputStream.readUnsignedByte(DataInputStream.java:293)
        at com.arcadedb.network.binary.ChannelBinary.readUnsignedByte(ChannelBinary.java:64)
        at com.arcadedb.postgres.PostgresNetworkExecutor.readNextByte(PostgresNetworkExecutor.java:1088)
        at com.arcadedb.postgres.PostgresNetworkExecutor.readMessage(PostgresNetworkExecutor.java:1045)
        ... 1 more

2024-11-17 02:46:38.188 SEVER [PostgresNetworkExecutor] PSQL: Closing connection with client

Interpretation

The interpretation of this issue over there was:

So, this gives a hint. Apparantly the following SQL query is send by the Elixir driver:

SELECT t.oid, t.typname, t.typsend, t.typreceive, t.typoutput, t.typinput,
       coalesce(d.typelem, t.typelem), coalesce(r.rngsubtype, 0), ARRAY (
  SELECT a.atttypid
  FROM pg_attribute AS a
  WHERE a.attrelid = t.typrelid AND a.attnum > 0 AND NOT a.attisdropped
  ORDER BY a.attnum
)

FROM pg_type AS t
LEFT JOIN pg_type AS d ON t.typbasetype = d.oid
LEFT JOIN pg_range AS r ON r.rngtypid = t.oid OR (t.typbasetype <> 0 AND r.rngtypid = t.typbasetype)
WHERE (t.typrelid = 0)
AND (t.typelem = 0 OR NOT EXISTS (SELECT 1 FROM pg_catalog.pg_type s WHERE s.typrelid != 0 AND s.oid = t.typelem))

which uses the Postgres system "tables" pg_attribute, pg_type, pg_range and pg_catalog. I guess this causes the execeptions on the ArcadeDB side, as these don't exist, as well as the JOINs.

Maybe there is a way to use Postgrex without this startup query?

Test Code

https://github.com/jonmdev/postgres_bug

Inside application.ex

  @impl true
  def start(_type, _args) do

    #===============================
    #POSTGREX TEST (ADDED CODE)
    #===============================
    children = [{
      Postgrex,
      hostname: "localhost",
      port: 5432,
      username: "root",
      password: "password",
      database: "mydb",
      parameters: [{:preferQueryMode, "simple" }]
    }]

    opts = [strategy: :one_for_one, name: My.PostgresSupervisor] # restart options for the supervisor, not relevant
    start_result = Supervisor.start_link(children, opts)  # start the Postgres connection based on tuple configuration above

    IO.puts("STARTED POSTGRES WITH STATUS: " <> inspect(start_result))

Ideas?

What do you think? Do you think there is any way to solve this on the Elixir side? If so how? Or would you recommend a solution on the Arcade side? Either way how would you see that best potentially working?

If it is just a matter of not sending this generic request on connection which is crashing it, is there any way to turn that off?

Thanks for any ideas.

Expected behavior

Ideally can connect to generic Postgres adapter interfaced database.

josevalim commented 1 week ago

Unfortunately it is not possible at the moment. We use the binary protocol instead of the text protocol, and it requires querying all types in the database so we know how to encode extensions. One option would be for us to hardcode the result for the built-in types and this way but extensions then won't work, which I am unsure if it a problem in your case.

greg-rychlewski commented 1 week ago

We have the simple connection. If we are bootstrapping types during that one maybe we could set a flag to bypass it?

jonmdev commented 1 week ago

One option would be for us to hardcode the result for the built-in types and this way but extensions then won't work, which I am unsure if it a problem in your case. We have the simple connection. If we are bootstrapping types during that one maybe we could set a flag to bypass it?

I am wondering if either of you @greg-rychlewski or @josevalim can perhaps guide me in an experiment to see if trying these ideas will fix the problem (ie. don't send the initialization message, and hardcode in the types).

I have set up a method to test this further if you have any ideas or can explain how it might work. I would be eager to test them.

ArcadeDB Background

For background, here are the relevant bits of the documentation :

ArcadeDB Server supports a subset of the Postgres wire protocol, such as connection and queries. Particularly, ArcadeDB does only support "simple" query mode and does not support SSL! Some tools compatible with Postgres may execute queries on internal Postgres tables to retrieve the schema. Those tables are not present in ArcadeDB, so it may return errors at startup.

Test Method

To test this further I updated my bug project by adding the Postgrex package .ex files directly so they can be manipulated: https://github.com/jonmdev/postgres_bug

Now I believe my bug project runs the same but directly on the package code added to my project itself (so it can be freely edited or played with).

Errors

Interestingly, they are both giving new errors with this configuration. I am guessing the Postgrex package output error differently whether added as a package or directly to the code. Or perhaps I added it wrong as code. But I don't think so.

I am getting:

1) Arcade

2024-11-18 23:06:31.882 INFO  [PostgresNetworkExecutor] PSQL:-> request for password (R - 8b) (thread=52)
2024-11-18 23:06:31.884 INFO  [PostgresNetworkExecutor] PSQL:-> authentication ok (R - 8b) (thread=52)
2024-11-18 23:06:31.884 INFO  [PostgresNetworkExecutor] PSQL:-> backend key data (K - 12b) (thread=52)
2024-11-18 23:06:31.885 INFO  [PostgresNetworkExecutor] PSQL:-> parameter status (S - 24b) (thread=52)
2024-11-18 23:06:31.885 INFO  [PostgresNetworkExecutor] PSQL:-> parameter status (S - 25b) (thread=52)
2024-11-18 23:06:31.885 INFO  [PostgresNetworkExecutor] PSQL:-> parameter status (S - 25b) (thread=52)
2024-11-18 23:06:31.885 INFO  [PostgresNetworkExecutor] PSQL:-> ready for query (Z - 5b) (thread=52)Exception in thread "ArcadeDB-postgres//127.0.0.1" com.arcadedb.postgres.PostgresProtocolException: Error on parsing startup message
        at com.arcadedb.postgres.PostgresNetworkExecutor.readStartupMessage(PostgresNetworkExecutor.java:989)
        at com.arcadedb.postgres.PostgresNetworkExecutor.run(PostgresNetworkExecutor.java:114)
Caused by: java.io.EOFException
        at java.base/java.io.DataInputStream.readInt(DataInputStream.java:397)
        at com.arcadedb.network.binary.ChannelBinary.readUnsignedInt(ChannelBinary.java:79)
        at com.arcadedb.postgres.PostgresNetworkExecutor.readStartupMessage(PostgresNetworkExecutor.java:925)
        ... 1 more
Exception in thread "ArcadeDB-postgres//127.0.0.1" Exception in thread "ArcadeDB-postgres//127.0.0.1" Exception in thread "ArcadeDB-postgres//127.0.0.1" com.arcadedb.postgres.PostgresProtocolException: Error on parsing startup message
        at com.arcadedb.postgres.PostgresNetworkExecutor.readStartupMessage(PostgresNetworkExecutor.java:989)
        at com.arcadedb.postgres.PostgresNetworkExecutor.run(PostgresNetworkExecutor.java:114)
Caused by: java.io.EOFException
        at java.base/java.io.DataInputStream.readInt(DataInputStream.java:397)
        at com.arcadedb.network.binary.ChannelBinary.readUnsignedInt(ChannelBinary.java:79)
        at com.arcadedb.postgres.PostgresNetworkExecutor.readStartupMessage(PostgresNetworkExecutor.java:925)
        ... 1 more

2) Elixir

STARTED POSTGRES WITH STATUS: {:ok, #PID<0.182.0>}
Interactive Elixir (1.17.3) - press Ctrl+C to exit (type h() ENTER for help)

23:24:22.063 [error] :gen_statem #PID<0.185.0> terminating
** (ArgumentError) unknown registry: Postgrex.TypeManager
    (elixir 1.17.3) lib/registry.ex:1400: Registry.key_info!/1
    (elixir 1.17.3) lib/registry.ex:590: Registry.lookup/2
    (postgres_bug 0.1.0) lib/postgrex/type_supervisor.ex:23: Postgrex.TypeSupervisor.locate/2
    (postgres_bug 0.1.0) lib/postgrex/protocol.ex:1087: Postgrex.Protocol.bootstrap/3
    (postgres_bug 0.1.0) lib/postgrex/protocol.ex:729: Postgrex.Protocol.handshake/3
    (postgres_bug 0.1.0) lib/postgrex/protocol.ex:209: Postgrex.Protocol.connect_endpoints/6
    (db_connection 2.7.0) lib/db_connection/connection.ex:74: DBConnection.Connection.handle_event/4
    (stdlib 6.1.1) gen_statem.erl:3737: :gen_statem.loop_state_callback/11
Queue: [internal: {:connect, :init}]
Postponed: []
State: Postgrex.Protocol
Callback mode: :handle_event_function, state_enter: false

23:24:22.515 [error] :gen_statem #PID<0.191.0> terminating

Idea to remove this init message (or manually set in what needs to go in)?

I am not sure where this message originates from in the code. I think it is in protocol.ex at line 199 connect_endpoints with connect_and_handshake which is defined further at line 249 with connect_and_handshake which is defined further at line 723 with handshake, and most specifically, line 732 with ref = Postgrex.Parameters.insert(parameters).

I am guessing that is the bottom line there? Is that what I want to override? Ie. If I can (1) stop sending the query for this to the database in case that is triggering the issue, and (2) manually set in something myself for the parameters if that is what that is, then maybe I can make it work?

Any general ideas for how this could go? If it is solvable, perhaps the solution may be generalized into a new option for the Postgrex system in these types of cases. Or not. I am not sure. Either way I appreciate any guesses or conjecture as I am sure you both understand all this far better than me. 🙏

jonmdev commented 1 week ago

Never mind actually. I was able to connect via Rust (which can be wrapped in Elixir). I will connect that way or via a different protocol. Thanks.