hannes / MonetDBLite

MonetDB reconfigured as a library

can't initialise database #193

Open gijzelaerr opened 7 years ago

gijzelaerr commented 7 years ago
λ  virtualenv -p python2 venv
Running virtualenv with interpreter /usr/bin/python2
New python executable in /tmp/venv/bin/python2
Also creating executable in /tmp/venv/bin/python
Installing setuptools, pkg_resources, pip, wheel...done.
λ  venv/bin/pip install numpy 
Collecting numpy
  Using cached numpy-1.13.1-cp27-cp27mu-manylinux1_x86_64.whl
Installing collected packages: numpy
Successfully installed numpy-1.13.1
λ  venv/bin/pip install monetdblite
Collecting monetdblite
Installing collected packages: monetdblite
Successfully installed monetdblite-0.1.9
λ  venv/bin/python -c "import monetdblite; monetdblite.init('gijs')"
!GDKenvironment: directory not an absolute path: gijs.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/venv/local/lib/python2.7/site-packages/monetdblite/__init__.py", line 40, in init
    return embeddedmonetdb.init(*args, **kwargs)
  File "/tmp/venv/local/lib/python2.7/site-packages/monetdblite/embeddedmonetdb.py", line 80, in init
    raise __throw_exception(str(retval) + ' in ' + str(directory))
  File "/tmp/venv/local/lib/python2.7/site-packages/monetdblite/embeddedmonetdb.py", line 74, in __throw_exception
    raise exceptions.DatabaseError(str.replace('MALException:', ''))
monetdblite.exceptions.DatabaseError: Failed to initialize MonetDB: GDKinit() failed. in gijs
λ  ls -al gijs
total 28
drwxr-xr-x  2 gijs gijs  4096 sep  6 09:51 .
drwxrwxrwt 19 root root 20480 sep  6 09:51 ..

gijzelaerr commented 7 years ago

Ah, only now do I see the absolute path error, oops. Still, this is not very user-friendly.
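
One workaround for the error above is to resolve the directory to an absolute path before handing it to `monetdblite.init`. The sketch below uses only the standard library; `resolve_dbdir` is a hypothetical helper name, not part of monetdblite, and the final call is left commented out so the snippet runs without the package installed.

```python
import os


def resolve_dbdir(path):
    """Return an absolute path for `path`, creating the directory if needed.

    Hypothetical helper: it only prepares the argument, it does not start
    the database.
    """
    absolute = os.path.abspath(os.path.expanduser(path))
    # Written without exist_ok so it also runs on Python 2.7, as used above.
    if not os.path.isdir(absolute):
        os.makedirs(absolute)
    return absolute


# Intended use (requires monetdblite to be installed):
# import monetdblite
# monetdblite.init(resolve_dbdir('gijs'))
```

Resolving the path on the caller's side sidesteps the `GDKenvironment: directory not an absolute path` failure without changing the library.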

gijzelaerr commented 7 years ago

OK, so this actually only fails the first time. If you init/connect a database with an absolute path the first time, then the second time you can access DBs with a relative path.

gijzelaerr commented 7 years ago

I realize now this is probably related to #193, and I guess you can only operate on one DB connection per Python session.

hannes commented 7 years ago

You can have multiple connections, but not different db directories at the same time. In the R frontend we catch this; perhaps that is also a good idea for Python, @Mytherin.
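
A check along those lines could look roughly like the following in Python. This is only a sketch: `guarded_init`, `DatabaseDirectoryError`, and the module-level `_active_dbdir` are hypothetical names, and `init_fn` stands in for whatever actually starts the embedded database (for example `monetdblite.init`). It is not how the R frontend or the Python package implements the check.

```python
import os

# Directory the embedded database was started with, or None if not started.
_active_dbdir = None


class DatabaseDirectoryError(RuntimeError):
    """Raised when a second, different database directory is requested."""


def guarded_init(directory, init_fn):
    """Start the database once and reject any other directory afterwards.

    `init_fn` is called with the absolute directory on the first call only.
    """
    global _active_dbdir
    requested = os.path.abspath(directory)
    if _active_dbdir is not None and _active_dbdir != requested:
        raise DatabaseDirectoryError(
            "database already initialised in %s; cannot switch to %s "
            "within the same session" % (_active_dbdir, requested)
        )
    if _active_dbdir is None:
        init_fn(requested)  # may raise; state is only recorded on success
        _active_dbdir = requested
    return requested
```

Calling `guarded_init` again with the same directory is a no-op, which matches the statement that multiple connections to one directory are allowed.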

caewok commented 7 years ago

You mention having multiple connections but not different db directories at the same time. It looks like a single R session can have multiple connections, but different R sessions (or different threads) cannot simultaneously connect to the same embedded database.

It would be useful to allow this for certain situations, such as when using threading (leaving aside the potential for data corruption if a table is modified simultaneously---that could be handled with a table lock upon modification)

Right now, multiple connections work, for example:

library(MonetDBLite)
library(DBI)
dir.create("monetdb_tmp")
dbdir <- "monetdb_tmp"
con <- ml(dbdir)
con2 <- ml(dbdir)
dbWriteTable(con, "mtcars", mtcars)
dbGetQuery(con, "SELECT MAX(mpg) FROM mtcars WHERE cyl = 8")
dbGetQuery(con2, "SELECT MAX(mpg) FROM mtcars WHERE cyl = 8")

But if you start a second R session, say in a new terminal, connecting to the database throws the error:

# in second R session
library(MonetDBLite)
library(DBI)
dbdir <- "monetdb_tmp"
con <- ml(dbdir)
# Error in monetdb_embedded_startup(embedded, !getOption("monetdb.debug.embedded",  : 
# Failed to initialize embedded MonetDB !FATAL: BBPaddfarm: bad rolemask 

A similar thing happens if you attempt multiple connections in separate threads:

library(parallel)
cl <- makeCluster(3)
parLapply(cl = cl, 1:3, function(i, dbdir) {
  library(MonetDBLite)
  library(DBI)
  new_con <- ml(dbdir)
  out <- dbGetQuery(new_con, "SELECT MAX(mpg) FROM mtcars WHERE cyl = 8")
  dbDisconnect(new_con)
  out
}, dbdir = dbdir)
#   3 nodes produced errors; first error: Failed to initialize embedded MonetDB !FATAL:
# GDKlockHome: Database lock 'monetdb_tmp/.gdk_lock' denied 

Note that the mclapply family functions work (with forking), probably due to the shared memory.

hannes commented 7 years ago

Multiple threads work, multiple processes do not. It's non-trivial to allow multiple processes to access the same data directory. I suggest using a stand-alone MonetDB server in those cases.
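
To fail fast with a readable message when a directory is already taken, a caller could take its own advisory lock before starting the embedded database. The sketch below assumes a Unix-like system (it uses `fcntl.flock`); `acquire_dir_lock`, `DirectoryInUseError`, and the `.example_lock` file name are hypothetical and unrelated to MonetDB's own `.gdk_lock` handling.

```python
import fcntl
import os


class DirectoryInUseError(RuntimeError):
    """Raised when another holder already has the directory lock."""


def acquire_dir_lock(directory, lock_name=".example_lock"):
    """Take an exclusive, non-blocking advisory lock inside `directory`.

    Returns the open file object. The lock is held until that object is
    closed, so the caller must keep a reference to it.
    """
    lock_path = os.path.join(directory, lock_name)
    handle = open(lock_path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (IOError, OSError):
        handle.close()
        raise DirectoryInUseError(
            "%s is already in use; consider a stand-alone server for "
            "multi-process access" % directory
        )
    return handle
```

This only turns the failure into a clearer error; it does not make multi-process access to one data directory possible.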

gijzelaerr commented 7 years ago

I think a good start would be to give a warning or error; right now it is not obvious what is happening. The limitation of one connection is not a huge problem for now, as long as it is obvious.
