postgresml / postgresml

Postgres with GPUs for ML/AI apps.
https://postgresml.org
MIT License

Postgresml and plpython3u cannot coexist #1484

Open maparent opened 4 months ago

maparent commented 4 months ago

I tried this both on my laptop (Apple silicon, Homebrew postgresql-15, with the latest postgresml at master tip) and with the ghcr.io/postgresml/postgresml:2.8.2 image (also arm64).

Steps to reproduce with Docker:

docker run \
    -it \
    -v postgresml_data:/var/lib/postgresql \
    -p 5433:5432 \
    -p 8000:8000 \
    ghcr.io/postgresml/postgresml:2.8.2 \
    sh

apt update
apt install postgresql-plpython3-15
sudo -u postgresml psql -d postgresml

create extension plpython3u;
CREATE FUNCTION basic() RETURNS text AS 'return "hello"' LANGUAGE plpython3u;

And the Postgres server crashes:

server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
!?>

I did not suspect the interaction at first, so I dug deep into plpython. The crash happens in PLy_initialize, at the call to PyImport_ImportModule("plpy"). With memory debugging enabled, Python reports an allocation made without holding the GIL. Importing any module crashes in the same way, so we never reach the plpy initialization function.

My suspicion is that the interpreter created by pyo3 and the one created by plpython use the same structures.
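
One way this suspicion could be checked on Linux is to look at which libpython shared objects are mapped into a Postgres backend. The sketch below is only an illustration and not part of PostgresML or plpython: the helper names `libpython_paths` and `libpython_paths_for_pid` are hypothetical, and it assumes Python is linked dynamically (a statically linked interpreter would not show up in the output).

```python
import os
from pathlib import Path


def libpython_paths(maps_text: str) -> list[str]:
    """Return the distinct libpython shared objects named in /proc/<pid>/maps text."""
    found: list[str] = []
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, then an optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5].strip()
        if os.path.basename(path).startswith("libpython") and path not in found:
            found.append(path)
    return found


def libpython_paths_for_pid(pid: int) -> list[str]:
    # Hypothetical convenience wrapper: Linux only, and it needs permission
    # to read the backend's maps file (same user or root).
    return libpython_paths(Path(f"/proc/{pid}/maps").read_text())
```

The PID can be taken from `SELECT pg_backend_pid();` in the session under test. A single path would suggest both extensions resolve to the same libpython, while two different paths would point to two separate copies being loaded. Either result is only a hint, not a diagnosis.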

maparent commented 4 months ago

Ah, I forgot to give the server log:

2024-05-26 15:25:54.722 UTC [111] postgresml@postgresml STATEMENT:  create extension plpython3u;
2024-05-26 15:26:43.614 UTC [28] LOG:  server process (PID 111) was terminated by signal 11: Segmentation fault
2024-05-26 15:26:43.614 UTC [28] DETAIL:  Failed process was running: CREATE FUNCTION basic() RETURNS text AS 'return "hello"' LANGUAGE plpython3u;
2024-05-26 15:26:43.614 UTC [28] LOG:  terminating any other active server processes
2024-05-26 15:26:43.618 UTC [115] postgresml@postgresml FATAL:  the database system is in recovery mode
2024-05-26 15:26:43.622 UTC [28] LOG:  all server processes terminated; reinitializing
2024-05-26 15:26:43.630 UTC [28] FATAL:  Can't attach, lock is not in an empty state: PgLwLockInner
2024-05-26 15:26:43.631 UTC [28] LOG:  database system is shut down

levkk commented 4 months ago

Take a look at #931. We'll need to add a Mac config to use lld.

maparent commented 4 months ago

Ah, thanks for the prompt response! I searched for plpython in issues but not in PRs. I am puzzled, though, that it also affects the Ubuntu Docker image, which #931 explicitly targets (even on aarch64).

maparent commented 4 months ago

On the Mac side, I tried setting

linker = "/opt/homebrew/opt/llvm/bin/lld"
rustflags = ["-C", "link-args=-undefined dynamic_lookup"]

and the behaviour is the same.

levkk commented 4 months ago

I'm not an expert on this, but I'm pretty sure this is how you pass those args: https://github.com/postgresml/postgresml/pull/931/files#diff-baaeadb5ff4f591645577d7c7dd11631c67c4a6bdd23aca709d086c158fc4ed7R9

maparent commented 4 months ago

Yes, I passed the flags this way, I can tell they were used, and it did not solve the problem.

JamesInform commented 3 months ago

@PostgresML Team: It seems that you are not experiencing the described issue.

Question: Is there a Dockerfile or Docker YAML file, or can you provide a script showing how to build the extension correctly on Debian 12.5?

The documentation is a bit weak in many respects.

I understand that you primarily want to sell your services, but community help is really appreciated.

levkk commented 3 months ago

Hi there,

Currently, we only support Ubuntu 22.04 for open source deployments due to limited resourcing on our side (we're a relatively young startup). For all other environments, we welcome contributions from the community (as well as for Ubuntu).

For the plpython3u issue specifically, a fix was applied in the past and seemingly worked at the time. None of us use plpython3u regularly (not since PostgresML 1.0 anyway), so we haven't had a chance to take a look at the problem. For this issue as well, we'd welcome a contribution from the community to trace the bug and submit a patch.

Thanks!

JamesInform commented 3 months ago

Thanks for the open feedback. That's great!

Well, I think that your extension could be an easy entry point for people touching ML for the first time, because it eliminates the fiddling around with Python itself.

Before I can contribute to the project further than testing, I need to have it at least up and running on my side.

Two questions:

  1. Can you provide a Dockerfile for building your complete Ubuntu 22.04 Docker images? Or is it already downloadable somewhere?
  2. Will there be any functional differences between this MIT project here and your final commercial version?

Cheers, James