kylemaxxwell / rpostgresql

Automatically exported from code.google.com/p/rpostgresql
0 stars 0 forks source link

Unix domain connection / PQhost handling #20

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. create a postgresql database and user that can only be accessed with local 
unix domain connection.
2. confirm that you can access with psql the_database
3. try to connect from RPostgreSQL but any trial fails.

What is the expected output? What do you see instead?

What version of the product are you using? On what operating system?
postgresql84 on centos 5.5 installed with yum
R-2.11.1
RPostgreSQL_0.1-6.tar.gz from CRAN
(the last patch is against latest svn code that doesn't seem to differ)

Please provide any additional information below.

PQsetdbLogin()
should accept NULL parameter, as well as empty string,
which is treated as default value.

PQhost()
returns NULL for unix domain connection and
the return value requires check before copying.
presumably ok to use empty string "" when the return value is NULL.

references:
http://www.postgresql.org/docs/8.3/static/libpq-connect.html
http://archives.postgresql.org/pgsql-interfaces/2006-04/msg00026.php

Suggested change:

$ diff -u  rpostgresql-read-only/RPostgreSQL/src/RS-PostgreSQL.c 
RPostgreSQL/src/RS-PostgreSQL.c 
--- rpostgresql-read-only/RPostgreSQL/src/RS-PostgreSQL.c       2010-09-09 
16:22:51.000000000 +0900
+++ RPostgreSQL/src/RS-PostgreSQL.c     2010-09-09 16:16:04.000000000 +0900
@@ -1,7 +1,7 @@
 /*
  *    RS-PostgreSQL.c
  *
- * Last Modified: $Date: 2009-10-14 09:16:35 +0900 (Wed, 14 Oct 2009) $
+ * Last Modified: $Date: 2009-10-13 19:16:35 -0500 (Tue, 13 Oct 2009) $
  *
  * This package was developed as a part of Summer of Code program organized by Google.
  * Thanks to David A. James & Saikat DebRoy, the authors of RMySQL package.
@@ -199,27 +199,7 @@
     if (!IS_EMPTY(CHR_EL(con_params, 6))) {
         options = (char *) CHR_EL(con_params, 6);
     }
-    if (user == NULL) {
-        user = "";
-    }
-    if (password == NULL) {
-        password = "";
-    }
-    if (host == NULL) {
-        host = "localhost";
-    }
-    if (port == NULL) {
-        port = "";
-    }
-    if (options == NULL) {
-        options = "";
-    }
-    if (tty == NULL) {
-        tty = "";
-    }
-    if (dbname == NULL) {
-        dbname = "template1";
-    }
+
     my_connection = PQsetdbLogin(host, port, options, tty, dbname, user, password);

     if (PQstatus(my_connection) != CONNECTION_OK) {
@@ -233,7 +213,14 @@
     /* save actual connection parameters */
     conParams->user = RS_DBI_copyString(PQuser(my_connection));
     conParams->password = RS_DBI_copyString(PQpass(my_connection));
-    conParams->host = RS_DBI_copyString(PQhost(my_connection));
+    {
+      const char *tmphost = PQhost(my_connection);
+      if(tmphost){
+        conParams->host = RS_DBI_copyString(tmphost);
+      }else{
+        conParams->host = RS_DBI_copyString("");
+      }
+    }
     conParams->dbname = RS_DBI_copyString(PQdb(my_connection));
     conParams->port = RS_DBI_copyString(PQport(my_connection));
     conParams->tty = RS_DBI_copyString(PQtty(my_connection));

Original issue reported on code.google.com by tomoa...@kenroku.kanazawa-u.ac.jp on 9 Sep 2010 at 7:54

Attachments:

GoogleCodeExporter commented 9 years ago
Can you add an R program that causes rpostgresql to fail prior to modification 
and that passes after modification?

Thank you.
Neil

Original comment by ne...@neiltiffin.com on 9 Sep 2010 at 3:33

GoogleCodeExporter commented 9 years ago
the R program is the simplest one as:
> library("RPostgreSQL")
> m <- dbDriver("PostgreSQL")
> con <- dbConnect(m)

which is expected nearly equivalent to call just "psql" on the shell.
So, I assume your question is how to reproduce the failure to connect.
PostgreSQL have quite a number of ways for its authentication, which is 
regulated by $PGDATA/pg_hba.conf, which distinguish unix domain connection
and IP connection.

The file with default install on CentOS lookslike:

# "local" is for Unix domain socket connections only
local   all         all                               ident
# IPv4 local connections:
host    all         all         127.0.0.1/32          ident
# IPv6 local connections:
host    all         all         ::1/128               ident

after creating role and database which are named same as the user name
psql will connect to the default database which has the same name like

$ psql
psql (8.4.4)
Type "help" for help.

tomoaki=> 

however 
$ psql -h localhost
psql: FATAL:  Ident authentication failed for user "tomoaki"

will fail, presumably because of identd or firewall setup.
But the point is that there is unix domain connection that is different
from ip connection, and in fact we can setup intentionally to deny IP connection
as well.

Current code makes no way to connect via unix domain and
con <- dbConnect(m)
corresponds to "psql -h localhost template1"
instead of "psql"

result before patch (but the patch for issue 21 was applied for compilation)

$ R --no-save < test-connect-psql.r  

R version 2.11.1 (2010-05-31)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library("RPostgreSQL")
Loading required package: DBI
> m <- dbDriver("PostgreSQL")
> con <- dbConnect(m)
Error in postgresqlNewConnection(drv, ...) : 
  RS-DBI driver: (could not connect @localhost on dbname "template1"
)
Calls: dbConnect ... .valueClassTest -> is -> is -> postgresqlNewConnection -> 
.Call
Execution halted

result after patch
$ R --no-save < test-connect-psql.r  

R version 2.11.1 (2010-05-31)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library("RPostgreSQL")
Loading required package: DBI
> m <- dbDriver("PostgreSQL")
> con <- dbConnect(m)
> 

During writing this, I additionally found that the
error message can be improved and updated the patch (attached as diff2)
I also attach a failing script with con <- dbConnect(m, host="localhost")
The new error message reads
Error in postgresqlNewConnection(drv, ...) : 
  RS-DBI driver: (could not connect tomoaki@localhost on dbname "tomoaki"
)

The point is that even if the connection fails, PQuser() and PQdb() will
return the default user and database name which was not supplied.
Also we should avoid passing NULL pointer to printf function although
some system works fine, but some may segfault (for host).

A small test program to understand the behavior of libpq is also attached.

Original comment by tomoa...@kenroku.kanazawa-u.ac.jp on 10 Sep 2010 at 1:28

Attachments:

GoogleCodeExporter commented 9 years ago
That looks like another great patch by bringing the Postgresql-specific code in 
whereas Sameer initially just copied what was used on RMySQL to set up 
connection strings.

Original comment by dirk.eddelbuettel on 11 Sep 2010 at 2:04

GoogleCodeExporter commented 9 years ago
One comment though where I disagree. You wrote

  however 
  $ psql -h localhost
  psql: FATAL:  Ident authentication failed for user "tomoaki"

  will fail, presumably because of identd or firewall setup.

AFAIK this has always used tcp/ip connections and we need to support those too. 
PostgreSQL is not SQLLite -- we can work across machines.

  But the point is that there is unix domain connection that is different
  from ip connection, and in fact we can setup intentionally to deny IP connection
  as well.

No. I want them to work by default. "Setup intentionally" is very wrong.

Having applied your 'diff2' patch above, I now a get seg.fault from 
tests/selectWhereZero.R which would be a regression:

$ ./check_with_vars.sh    # simple script that loops over tests setting env vars
[...]
==== Running RPostgreSQL/tests/selectWhereZero.R
Loading required package: RPostgreSQL
Loading required package: DBI
[1] "Removing tmpirisdata\n"
*** buffer overflow detected ***: /usr/lib64/R/bin/exec/R terminated

Could you take a look if this is related to the patch?

Thanks, Dirk

Original comment by dirk.eddelbuettel on 11 Sep 2010 at 3:26

GoogleCodeExporter commented 9 years ago
I added the test script 'check_with_vars.sh' to SVN.

Original comment by dirk.eddelbuettel on 11 Sep 2010 at 4:31

GoogleCodeExporter commented 9 years ago
I didn't get that segfault.

What I got with the today svn version is:

==== Running RPostgreSQL/tests/selectWhereZero.R
Loading required package: RPostgreSQL
Loading required package: DBI
Error in postgresqlExecStatement(conn, statement, ...) : 
  RS-DBI driver: (could not Retrieve the result : ERROR:  must be superuser to COPY to or from a file
HINT:  Anyone can COPY to stdout or from stdin. psql's \copy command also works 
for anyone.
)
Error in postgresqlExecStatement(conn, statement, ...) : 
  RS-DBI driver: (could not Retrieve the result : ERROR:  operator does not exist: text = integer
LINE 1: select * from tmpirisdata where Species=0
                                               ^
HINT:  No operator matches the given name and argument type(s). You might need 
to add explicit type casts.
)
In addition: Warning message:
In postgresqlWriteTable(conn, name, value, ...) :
  could not load data into table
NULL
[1] "Removing tmpirisdata\n"
[1] TRUE
Warning message:
In postgresqlQuickSQL(conn, statement, ...) :
  Could not create executeselect * from tmpirisdata where Species=0

What I got with the previous local version (which has patched for dbWriteTable 
as posted to issue 13)

==== Running RPostgreSQL/tests/selectWhereZero.R
Loading required package: RPostgreSQL
Loading required package: DBI
Error in postgresqlExecStatement(conn, statement, ...) : 
  RS-DBI driver: (could not Retrieve the result : ERROR:  operator does not exist: text = integer
LINE 1: select * from tmpirisdata where Species=0
                                               ^
HINT:  No operator matches the given name and argument type(s). You might need 
to add explicit type casts.
)
NULL
[1] "Removing tmpirisdata\n"
[1] TRUE
Warning message:
In postgresqlQuickSQL(conn, statement, ...) :
  Could not create executeselect * from tmpirisdata where Species=0

This is somewhat better than the latest svn version in that the data
seems created properly.

If I further change the query in selectWhereZero.R
to Sepcies='0', I got:

==== Running RPostgreSQL/tests/selectWhereZero.R
Loading required package: RPostgreSQL
Loading required package: DBI
data frame with 0 columns and 0 rows
[1] "Removing tmpirisdata\n"
[1] TRUE

----------------------------

The above test was run with
 export POSTGRES_USER="tomoaki"
 export POSTGRES_PASSWD=""
 export POSTGRES_HOST="localhost"
 export POSTGRES_DATABASE="tomoaki"
 export POSTGRES_PORT="5432"
on a really single user computer having a line
host    all         all         127.0.0.1/32          trust
in the pg_hba.conf
Note, this should not be done on a multi-user system.
Since this is working and
host    all         all         127.0.0.1/32          ident
does not allow access, I presumed its identd or firewall problem
but not RPostgreSQL problem.

-----------------------------
  AFAIK this has always used tcp/ip connections and we need to support those too. PostgreSQL is not SQLLite -- we can work across machines.

    But the point is that there is unix domain connection that is different
    from ip connection, and in fact we can setup intentionally to deny IP connection
    as well.

  No. I want them to work by default. "Setup intentionally" is very wrong.

I do agree that it working by default is desirable.  However it is 
a matter of PostgreSQL and PostgreSQL packaging by the distributions.
For the part of connector of R and PostgreSQL it is best to work as psql
does.  If psql connect, RPostgreSQL does connect.

In a general network where there are many user and hosts, the authentication
is not a simple task, and I don't blame if distribution default is restrictive
and the administrator needs to setup the allowed way of connection under their
policy.

Original comment by tomoa...@kenroku.kanazawa-u.ac.jp on 12 Sep 2010 at 1:44

Attachments:

GoogleCodeExporter commented 9 years ago
Hi

Thanks for the follow-up!  I agree that something odd is going on with the 
segfault. I can trigger it one way but not the other.

I still disagree on the socket vs tcp-ip access isssue, but agree that when 
psql can connect (using -h somehost), RPostgreSQL should too.

But would you mind posting the discussion on our mailing list
(see http://groups.google.com/group/rpostgresql-dev)
instead?  I find it awfully hard to quote what you wrote to discuss some points 
in 
detail.

That said, I am *very very happy* that you are contributing so actively.

Bye, Dirk

Original comment by dirk.eddelbuettel on 12 Sep 2010 at 3:09

GoogleCodeExporter commented 9 years ago
I can confirm the problem with selectWhereZero.R on mac os x with svn -r140

-------------- versions                                      
----------------------
ProductName:    Mac OS X
ProductVersion: 10.6.4
BuildVersion:   10F569
RPostgreSQL svn version: 140
psql (PostgreSQL) 9.0beta3

R version 2.11.1 (2010-05-31) 
x86_64-apple-darwin9.8.0 

locale:

Package: RPostgreSQL
Version: 0.1-7
Packaged: NA
Built: R 2.11.1; universal-apple-darwin9.8.0; 2010-09-12 12:55:05 UTC;

--------------start RPostgreSQL/tests/selectWhereZero.R      
----------------------
Loading required package: RPostgreSQL
Loading required package: DBI
./check_with_vars_neil.sh: line 74: 18095 Abort trap              R --arch=i386 
--slave < RPostgreSQL/tests/selectWhereZero.R

Original comment by ne...@neiltiffin.com on 12 Sep 2010 at 12:58

GoogleCodeExporter commented 9 years ago
segfault with selectWhereZero.R on Mac OS X was reported as issue 19 on 24 Aug.
Do you find any differene with the recent changes? 
If not, I guess its different problem that is independent of dbConnect() change.

Original comment by tomoa...@kenroku.kanazawa-u.ac.jp on 12 Sep 2010 at 1:26

GoogleCodeExporter commented 9 years ago
Reassigned.

Original comment by ne...@neiltiffin.com on 13 Oct 2010 at 2:34

GoogleCodeExporter commented 9 years ago
This is considered to have been fixed at r138, though some more minor cleanups 
of the code may be possible.

Original comment by tomoa...@kenroku.kanazawa-u.ac.jp on 14 Oct 2010 at 2:37