Segfault-Inc / Multicorn

Data Access Library
https://multicorn.org/
PostgreSQL License

Enhancement: allow running a Multicorn FDW in a virtualenv #168

Open max-l opened 7 years ago

max-l commented 7 years ago

Multicorn requires modules to be installed in the host's global Python path.

This is error-prone: if my module depends on a particular version of a package, and that version conflicts with the one in the global Python installation that some other program on the host requires, the result is dependency hell.

It would be nice to have more isolation from the host by allowing the path of a virtualenv (https://pypi.python.org/pypi/virtualenv) to be specified. Perhaps Multicorn as a whole could run in this virtualenv, or there could be more fine-grained control, where each Multicorn module runs in its own virtualenv. The latter is of course more flexible than the former.
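To make the request concrete, here is a minimal sketch of what "running in a virtualenv" could amount to inside an already-running interpreter: putting the virtualenv's site-packages directory ahead of the global one on sys.path. This is not something Multicorn does; the helper names virtualenv_site_packages and add_virtualenv are made up for illustration, and the lib/pythonX.Y/site-packages layout is assumed (POSIX; Windows virtualenvs use a different layout). Only the standard-library call site.addsitedir is real.

```python
import os
import site
import sys


def virtualenv_site_packages(venv_path):
    """Return the site-packages directory of a POSIX-layout virtualenv.

    Assumes the virtualenv was created for the Python version that is
    currently running.
    """
    version_dir = "python%d.%d" % sys.version_info[:2]
    return os.path.join(venv_path, "lib", version_dir, "site-packages")


def add_virtualenv(venv_path):
    """Hypothetical helper: give a virtualenv's packages import priority."""
    site_packages = virtualenv_site_packages(venv_path)
    if not os.path.isdir(site_packages):
        raise ValueError("no site-packages found at %s" % site_packages)

    before = list(sys.path)
    # addsitedir appends the directory and processes any .pth files in it.
    site.addsitedir(site_packages)

    # Move the newly added entries to the front so they shadow the
    # globally installed packages.
    added = [entry for entry in sys.path if entry not in before]
    sys.path[:] = added + [entry for entry in sys.path if entry not in added]
    return site_packages
```

Calling add_virtualenv('/path/to/my/virtualenv') would only affect modules that have not been imported yet, and it does not isolate compiled extensions or a different Python version, so it is a much weaker form of isolation than a separate interpreter.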

phrrngtn commented 7 years ago

I have just started experimenting with Multicorn and I found that running pg_ctl start from a shell with an activated virtualenv worked fine. I think it would be very nice if one could specify the virtualenv when creating the FDW server:

CREATE SERVER multicorn_imap FOREIGN DATA WRAPPER multicorn options ( wrapper 'multicorn.imapfdw.ImapFdw', virtualenv '/path/to/my/virtualenv' );

I am not very familiar with the Python C API, but I will take a look at the Multicorn C code and try to figure out the lifecycle of the interpreter. For isolation, it seems like it would be good to have one interpreter per FDW SERVER declaration. I wonder how much state, if any, hangs around in the Python interpreter between queries. I am also curious as to what happens with concurrent access to the FDW across different sessions.

phrrngtn commented 7 years ago

According to this comment, there is one interpreter per session: https://github.com/Kozea/Multicorn/issues/71#issuecomment-43492744
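If there really is a single interpreter per session, one consequence for per-server virtualenvs can be shown with plain Python, without Multicorn: once a module has been imported, changing sys.path does not replace it, because the interpreter keeps it cached in sys.modules. The sketch below uses two throwaway directories standing in for two virtualenvs; the module name fdw_dep_demo and both helper functions are made up for the demonstration.

```python
import importlib
import os
import sys
import tempfile


def make_fake_env(version):
    """Create a directory holding a tiny module named fdw_dep_demo."""
    path = tempfile.mkdtemp()
    with open(os.path.join(path, "fdw_dep_demo.py"), "w") as handle:
        handle.write("VERSION = %r\n" % version)
    return path


def load_version_from(env_path):
    """Put env_path first on sys.path, import the demo module, report its version."""
    sys.path.insert(0, env_path)
    importlib.invalidate_caches()
    module = importlib.import_module("fdw_dep_demo")
    return module.VERSION


# Two "environments" that ship different versions of the same dependency.
first = load_version_from(make_fake_env("1.0"))
second = load_version_from(make_fake_env("2.0"))
```

Here both first and second end up as '1.0': the second import is served from sys.modules even though the second directory is now first on sys.path. So two FDW servers in the same session that need conflicting versions of one package could not both be satisfied by path changes alone, which is an argument for the one-interpreter-per-server idea raised above.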