ContinuumIO / anaconda-issues

Anaconda issue tracking

SQLite load_extension not loading Spatialite Extension #10926

Open tacree-odot opened 5 years ago

tacree-odot commented 5 years ago

Actual Behavior

When attempting to load the spatialite extension, I am receiving the error:

sqlite3.OperationalError: The specified module could not be found.

I have tried loading this extension multiple ways, and none of them seem to work. See errors below in the output.

Expected Behavior

The extension should load. Note that I have verified that the spatialite dll is in the environment; the script below prints its path.

Steps to Reproduce

The following script demonstrates the issue:

import sqlite3
import sys
from ctypes import util, cdll

# Verify some of the loaded paths and libraries:
lib = util.find_library('spatialite')

print('loaded spatialite library: ', lib)

for p in sys.path:
    print(p)

SQL = """
        select AsWKT(ST_LineFromText('LINESTRING(540000 220000, 540000 221609)', 32123))
        """
connection = sqlite3.Connection(':memory:')
connection.enable_load_extension(True)
# connection.execute("SELECT load_extension('mod_spatialite')")

try:
    connection.load_extension("mod_spatialite")
except:
    print("load_extension(mod_spatialite) failed, Error: ", sys.exc_info()[1], "occured.")
    print('\n\nTry loading using the name of the dll (spatialite.dll)')
    connection.load_extension("spatialite")

with connection as conn:
    cur = conn.execute(SQL)
    results = cur.fetchall()

    for result in results:
        wkt = result[0]
        print(wkt)

Script output:

loaded spatialite library: ...\Continuum\anaconda3\envs\geodb37\Library\bin\spatialite.dll
...\python\conda-test\src
...\python\conda-test
...\python\conda-test\src
...\Continuum\anaconda3\envs\geodb37\python37.zip
...\Continuum\anaconda3\envs\geodb37\DLLs
...\Continuum\anaconda3\envs\geodb37\lib
...\Continuum\anaconda3\envs\geodb37
...\Continuum\anaconda3\envs\geodb37\lib\site-packages
load_extension(mod_spatialite) failed, Error: The specified module could not be found. occured.

Try loading using the name of the dll (spatialite.dll)
Traceback (most recent call last):
  File "...python/conda-test/src/test_spatialite.py", line 21, in <module>
    connection.load_extension("mod_spatialite")
sqlite3.OperationalError: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".../python/conda-test/src/test_spatialite.py", line 25, in <module>
    connection.load_extension("spatialite")
sqlite3.OperationalError: The specified procedure could not be found.

Anaconda or Miniconda version:

Anaconda3-2019.03-Windows-x86_64

Operating System:

Windows 10

conda info
```
     active environment : geodb37
    active env location : C:\Users\tacree1\AppData\Local\Continuum\anaconda3\envs\geodb37
            shell level : 2
       user config file : C:\Users\tacree1\.condarc
 populated config files : C:\Users\tacree1\.condarc
          conda version : 4.6.14
    conda-build version : 3.17.8
         python version : 3.7.3.final.0
       base environment : C:\Users\tacree1\AppData\Local\Continuum\anaconda3 (writable)
           channel URLs : https://repo.anaconda.com/pkgs/main/win-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/free/win-64
                          https://repo.anaconda.com/pkgs/free/noarch
                          https://repo.anaconda.com/pkgs/r/win-64
                          https://repo.anaconda.com/pkgs/r/noarch
                          https://repo.anaconda.com/pkgs/msys2/win-64
                          https://repo.anaconda.com/pkgs/msys2/noarch
          package cache : C:\Users\tacree1\AppData\Local\Continuum\anaconda3\pkgs
                          C:\Users\tacree1\.conda\pkgs
                          C:\Users\tacree1\AppData\Local\conda\conda\pkgs
       envs directories : C:\Users\tacree1\AppData\Local\Continuum\anaconda3\envs
                          C:\Users\tacree1\.conda\envs
                          C:\Users\tacree1\AppData\Local\conda\conda\envs
               platform : win-64
             user-agent : conda/4.6.14 requests/2.21.0 CPython/3.7.3 Windows/10 Windows/10.0.16299
          administrator : False
             netrc file : None
           offline mode : False
```
conda list --show-channel-urls
```
# packages in environment at ...\Local\Continuum\anaconda3\envs\geodb37:
#
# Name              Version       Build             Channel
attrs               19.1.0        py37_1            defaults
blas                1.0           mkl               defaults
bzip2               1.0.6         hfa6e2cd_5        defaults
ca-certificates     2019.1.23     0                 defaults
certifi             2019.3.9      py37_0            defaults
click               7.0           py37_0            defaults
click-plugins       1.1.1         py_0              defaults
cligj               0.5.0         py37_0            defaults
curl                7.64.1        h2a8f88b_0        defaults
cycler              0.10.0        py37_0            defaults
descartes           1.1.0         py37_0            defaults
expat               2.2.5         he025d50_0        defaults
fiona               1.8.4         py37h22081e2_0    defaults
freetype            2.9.1         ha9979f8_1        defaults
freexl              1.0.5         hfa6e2cd_0        defaults
gdal                2.3.3         py37hdf43c64_0    defaults
geopandas           0.4.1         py_0              defaults
geos                3.7.1         h33f27b4_0        defaults
hdf4                4.2.13        h712560f_2        defaults
hdf5                1.10.4        h7ebc959_0        defaults
icc_rt              2019.0.0      h0cc432a_1        defaults
icu                 58.2          ha66f8fd_1        defaults
intel-openmp        2019.3        203               defaults
jpeg                9b            hb83a4c4_2        defaults
kealib              1.4.7         h07cbb95_6        defaults
kiwisolver          1.1.0         py37ha925a31_0    defaults
krb5                1.16.1        hc04afaa_7        defaults
libboost            1.67.0        hd9e427e_4        defaults
libcurl             7.64.1        h2a8f88b_0        defaults
libgdal             2.3.3         h10f50ba_0        defaults
libiconv            1.15          h1df5818_7        defaults
libkml              1.3.0         he5f2a48_4        defaults
libnetcdf           4.6.1         h411e497_2        defaults
libpng              1.6.37        h2a8f88b_0        defaults
libpq               11.2          h3235a2c_0        defaults
libspatialindex     1.8.5         h6538335_2        defaults
libspatialite       4.3.0a        hc36aec2_19       defaults
libssh2             1.8.2         h7a1dbc1_0        defaults
libtiff             4.0.10        hb898794_2        defaults
libxml2             2.9.9         h464c3ec_0        defaults
mapclassify         2.0.1         py_0              defaults
matplotlib          3.0.3         py37hc8f65d3_0    defaults
mkl                 2019.3        203               defaults
mkl_fft             1.0.12        py37h14836fe_0    defaults
mkl_random          1.0.2         py37h343c172_0    defaults
munch               2.3.2         py37_0            defaults
numpy               1.16.3        py37h19fb1c0_0    defaults
numpy-base          1.16.3        py37hc3f5095_0    defaults
openssl             1.1.1b        he774522_1        defaults
pandas              0.24.2        py37ha925a31_0    defaults
pcre                8.43          ha925a31_0        defaults
pip                 19.1.1        py37_0            defaults
proj4               5.2.0         ha925a31_1        defaults
psycopg2            2.7.6.1       py37h7a1dbc1_0    defaults
pyparsing           2.4.0         py_0              defaults
pyproj              1.9.6         py37h6782396_0    defaults
pyqt                5.9.2         py37h6538335_2    defaults
python              3.7.3         h8c8aaf0_1        defaults
python-dateutil     2.8.0         py37_0            defaults
pytz                2019.1        py_0              defaults
qt                  5.9.7         vc14h73c81de_0    defaults
rtree               0.8.3         py37_0            defaults
scipy               1.2.1         py37h29ff71c_0    defaults
setuptools          41.0.1        py37_0            defaults
shapely             1.6.4         py37h222a598_0    defaults
sip                 4.19.8        py37h6538335_0    defaults
six                 1.12.0        py37_0            defaults
sqlalchemy          1.3.3         py37he774522_0    defaults
sqlite              3.28.0        he774522_0        defaults
tk                  8.6.8         hfa6e2cd_0        defaults
tornado             6.0.2         py37he774522_0    defaults
vc                  14.1          h0510ff6_4        defaults
vs2015_runtime      14.15.26706   h3a45250_4        defaults
wheel               0.33.4        py37_0            defaults
wincertstore        0.2           py37_0            defaults
xerces-c            3.2.2         ha925a31_0        defaults
xz                  5.2.4         h2fa13f4_4        defaults
zlib                1.2.11        h62dcd97_3        defaults
zstd                1.3.7         h508b16e_0        defaults
```
eigenjohnson commented 4 years ago

I have this same problem - is there any progress?

BigBaaadBob commented 4 years ago

Same problem here. I've tried both the anaconda and the conda-forge channels for libspatialite and get the exact same result. Here's the result from conda create -n test libspatialite python using python 3.8.5.

Test code:

from ctypes import util
lib = util.find_library('spatialite')
print('spatialite library: ', lib)

import sys
for p in sys.path:
    print(p)

import sqlite3

connection = sqlite3.connect('geotest')
connection.enable_load_extension(True)

connection.load_extension('spatialite.dll')

And the results:

spatialite library:  C:\Users\witr\miniconda3\envs\test\Library\bin\spatialite.dll
C:\Users\witr\Documents\Python Scripts\geo
C:\Users\witr\miniconda3\envs\test\python38.zip
C:\Users\witr\miniconda3\envs\test\DLLs
C:\Users\witr\miniconda3\envs\test\lib
C:\Users\witr\miniconda3\envs\test
C:\Users\witr\miniconda3\envs\test\lib\site-packages
Traceback (most recent call last):
  File "geo.py", line 16, in <module>
    connection.load_extension('spatialite.dll')
sqlite3.OperationalError: The specified procedure could not be found.
bradh commented 4 years ago

This looks like the sqlite library was possibly built without extension loading. Does it work with other extensions?

BigBaaadBob commented 4 years ago

@bradh, as far as I can tell, extension loading is enabled in current conda builds, and it certainly seems to be trying to load the spatialite extension. I found anaconda bug reports indicating that it wasn't enabled in the past and that this was fixed. I don't know of a simple extension to load (I've looked some, but this is somewhat outside my area of knowledge), but if you can suggest one, I'll try it.
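
One way to check the build question raised above without needing a second extension is to ask SQLite itself which compile-time options it was built with. The following is a minimal sketch, not something posted in the thread: the helper names `sqlite_compile_options` and `summarize_sqlite_build` are made up for illustration, and the sketch assumes the linked SQLite reports its options through `PRAGMA compile_options` (a build with compile-option diagnostics omitted returns an empty list, in which case the result is inconclusive).

```python
import sqlite3


def sqlite_compile_options():
    """Return the compile-time options reported by the linked SQLite library."""
    conn = sqlite3.connect(":memory:")
    try:
        # Option names come back without the "SQLITE_" prefix.
        return {row[0] for row in conn.execute("PRAGMA compile_options")}
    finally:
        conn.close()


def summarize_sqlite_build():
    """Collect the facts relevant to extension loading (hypothetical helper)."""
    options = sqlite_compile_options()
    probe = sqlite3.connect(":memory:")
    try:
        # Python only exposes this method when it was built with extension support.
        python_side = hasattr(probe, "enable_load_extension")
    finally:
        probe.close()
    return {
        "sqlite_version": sqlite3.sqlite_version,
        "python_exposes_load_extension": python_side,
        "omit_load_extension": "OMIT_LOAD_EXTENSION" in options,
        "rtree_enabled": "ENABLE_RTREE" in options,
    }


if __name__ == "__main__":
    for key, value in summarize_sqlite_build().items():
        print(f"{key}: {value}")
```

If `omit_load_extension` is reported as true, or `python_exposes_load_extension` as false, the failure is in the SQLite or Python build rather than in the Spatialite package.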

For others: this is being discussed in more detail on the spatialite google group.

PMeira commented 4 years ago

> I found anaconda bug reports indicating that it wasn't enabled in the past and that was fixed. I don't know of a simple extension to load (I've looked some, but this is sort of out of my area of knowledge) but if you can suggest one, I'll try it.

Good to know, I've been replacing their sqlite3.dll for a while. I did test with the conda DLL now and indeed it's capable of loading extensions. :)

Trying to reproduce @BigBaaadBob's comment with conda create -n test libspatialite python, indeed it doesn't work. The spatialite.dll installed doesn't contain sqlite3_modspatialite_init. That's probably the issue -- this is probably not the extension module version of Spatialite. (see edit)

If I use my own build of Spatialite (more precisely of mod_spatialite.dll), it works as expected.

EDIT: Interesting... The old official build doesn't work either.

My build uses MSVC, I imagine there could be DLL version conflicts with the old builds.

PMeira commented 4 years ago

After playing with this, I think there are multiple issues at play.

  1. From the SQLite docs:

> If your shared library ends up being named "YourCode.so" or "YourCode.dll" or "YourCode.dylib" as shown in the compiler examples above, then the correct entry point name would be "sqlite3_yourcode_init".

So sqlite3_spatialite_init or sqlite3_modspatialite_init is indeed missing, which would explain why it doesn't load on Windows. On Linux, the extension can be loaded from mod_spatialite.so (containing sqlite3_modspatialite_init), which is installed by the conda package along with libspatialite.so.

  2. Depending on the Python version, the R-tree extension is not built into the Anaconda sqlite3.dll. It seems that at least Python 3.6 has this issue. I think the official CPython releases also don't include it.

  3. Old sqlite3.dll builds from Anaconda (and, I believe, from the official CPython distributions) weren't built with extension support, which is the original reason why nothing worked before.

  4. There could be issues with old DLLs and Anaconda. IIRC, conda uses some custom code to set search paths, but I'm not sure what the current situation is.
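
The entry-point rule from the first item can be sketched in a few lines. This is an illustrative sketch rather than code from the thread: `default_entry_point` and `exports_symbol` are hypothetical helper names, and the naming logic follows the documented SQLite rule (file name after the last directory separator, up to the first ".", with a leading "lib" dropped and only ASCII letters kept, lowercased). SQLite tries sqlite3_extension_init first and falls back to this derived name only when no entry point is given explicitly.

```python
import ctypes
import re


def default_entry_point(path):
    """Derive the fallback entry point SQLite looks for in an extension file."""
    # File name after the last "/" or "\" ...
    name = re.split(r"[\\/]", path)[-1]
    # ... up to the first ".".
    stem = name.split(".", 1)[0]
    # A leading "lib" is skipped (case-insensitively).
    if stem.lower().startswith("lib"):
        stem = stem[3:]
    # Only ASCII letters survive, lowercased; "_" and digits are dropped.
    letters = "".join(ch.lower() for ch in stem if ch.isascii() and ch.isalpha())
    return f"sqlite3_{letters}_init"


def exports_symbol(library_path, symbol):
    """Return True if the shared library at library_path exports symbol."""
    lib = ctypes.CDLL(library_path)
    # ctypes raises AttributeError for a missing export, so hasattr works here.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    for candidate in ("mod_spatialite.dll", "spatialite.dll"):
        print(candidate, "->", default_entry_point(candidate))
```

Calling `exports_symbol` with the full path of the installed spatialite.dll and the name returned by `default_entry_point` shows directly whether the DLL is loadable as an extension. Note that `ctypes.CDLL` itself fails if the library's own dependencies cannot be resolved.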

BigBaaadBob commented 4 years ago

Thanks @PMeira for digging in to this. Comments/questions:

  1. I'd like to drive this to some closure so I want to put a bug report somewhere. I don't quite know where to do that. Conda-forge or the feedstocks thingy? Thoughts?
  2. From reading the spatialite documentation, the mod_spatialite DLL is the only one that will work for dynamic loading and it isn't in the libspatialite conda package. That should be part of the bug report.
  3. Is there some way of using the existing libspatialite DLL without dynamically loading it that might have been intended? Maybe combining the sqlite3.dll and the libspatialite.dll somehow?
  4. Finally, I've also found some 32/64 bit issues with the spatialite libraries that I've reported on the google group mentioned above. That might be what Paulo means above regarding MSVC.

I'm totally new at conda and spatialite so I'm learning as I go along...

PMeira commented 4 years ago
> 1. I'd like to drive this to some closure so I want to put a bug report somewhere. I don't quite know where to do that. Conda-forge or the feedstocks thingy? Thoughts?

@BigBaaadBob Conda-forge is a separate community project; this issue ticket is about the official Anaconda distribution. The conda-forge version is also broken on Windows. Its feedstock is https://github.com/conda-forge/libspatialite-feedstock and I found an existing ticket for the same issue: https://github.com/conda-forge/libspatialite-feedstock/issues/49. Maybe we should try to help there, since we haven't had any replies from the official Anaconda team.

I think this issue is the right place to report the problem about the official package, as the README states:

> Please submit Anaconda issues to the issue tracker of this repository.
>
> Note: This issue tracker is for issues with the Anaconda Python distribution, its installers, and its packages. For issues with the Conda package manager unrelated to any specific package, please use the Conda issue tracker.

This is an issue with an official Anaconda package on Windows (Linux is fine; I didn't test on macOS). I'm not aware of public feedstock repos for the official packages, but I could just be ignorant in this case. EDIT: Found the feedstock for the official package: https://github.com/AnacondaRecipes/libspatialite-feedstock. It's forked from the conda-forge one, so it's probably easier to work on fixing the conda-forge version.

Of course, since I'm not a paying Anaconda customer, I wouldn't blame them for giving this low priority. There are 1.7k open issues in the repo, probably not a great signal-to-noise ratio. It's a shame the package has been broken for so long, though.

> 2. From reading the spatialite documentation, the mod_spatialite DLL is the only one that will work for dynamic loading and it isn't in the libspatialite conda package. That should be part of the bug report.

Yep. I'll include some more details below about the package.

Usually the conda-build recipe is included in the final package. The current test included is just this (the conda-forge version is very similar):

import os
import sys
import ctypes
import platform

if sys.platform == 'win32':
    libfreexl = ctypes.CDLL('spatialite.dll')
elif sys.platform == 'darwin':
    # LD_LIBRARY_PATH not set on OS X or Linux.
    path = os.path.expandvars('$PREFIX/lib/libspatialite.dylib')
    libfreexl = ctypes.CDLL(path)
elif sys.platform.startswith('linux'):
    path = os.path.expandvars('$PREFIX/lib/libspatialite.so')
    libfreexl = ctypes.CDLL(path)
else:
    raise Exception('Cannot recognize platform {!r}'.format(sys.platform))

I suggest adding a test for actually loading the extension:

import sqlite3
db = sqlite3.connect(':memory:')
db.enable_load_extension(True)
db.execute("select load_extension('mod_spatialite')")
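
A slightly more defensive variant of that test, for local diagnosis, could try the names used earlier in this thread and report which one loads. This is only a sketch under assumptions: `load_first_available` is a hypothetical helper, the candidate names are the two tried in the original report, and `spatialite_version()` is only called once an extension has actually loaded.

```python
import sqlite3


def load_first_available(conn, candidates=("mod_spatialite", "spatialite")):
    """Try each extension name in turn; return the one that loaded, else None."""
    if not hasattr(conn, "enable_load_extension"):
        # This Python build does not expose extension loading at all.
        return None
    conn.enable_load_extension(True)
    try:
        for name in candidates:
            try:
                conn.load_extension(name)
            except sqlite3.OperationalError as exc:
                print(f"load_extension({name!r}) failed: {exc}")
            else:
                return name
        return None
    finally:
        # Switch extension loading off again once the attempt is over.
        conn.enable_load_extension(False)


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    loaded = load_first_available(db)
    if loaded is None:
        print("no Spatialite extension could be loaded")
    else:
        version = db.execute("SELECT spatialite_version()").fetchone()[0]
        print(f"loaded {loaded}, Spatialite version {version}")
```

Unlike the one-line test, this does not fail the build; it is meant for checking an existing environment.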

Looking further into why mod_spatialite.dll is missing, it looks like the package just runs the makefile.vc from the official sources with a tiny patch. That Makefile doesn't build mod_spatialite, so there we go, that's the root cause.

(Sidenote: For my version I use custom CMake scripts, nowadays coupled with Conan. Maybe I should update and upload them somewhere for Spatialite 5.0. Right now I remove the parts that depend on GPLed libraries.)

As a reminder, this might be a good time to fix it since the new Spatialite 5.0 is in the RC phase. The build scripts for 5.0 seem to work better with MSVC. There are also new, different dependencies from Spatialite 4.3.0a.

> 3. Is there some way of using the existing libspatialite DLL without dynamically loading it that might have been intended? Maybe combining the sqlite3.dll and the libspatialite.dll somehow?

I don't think so, the mod_spatialite version has extra symbols. Maybe older Spatialite versions were not extensions, maybe those ones could be used in place of the SQLite3 DLL. Not worth the trouble nowadays, better to fix the package.

> 4. Finally, I've also found some 32/64 bit issues with the spatialite libraries that I've reported on the google group mentioned above. That might be what Paulo means above regarding MSVC.

The Spatialite 5.0 RC1 binaries seem to work if we put them in a folder listed in the PATH environment variable. You could also put them in the conda environment (subfolder Library/bin) to emulate what a correct package should do. Long term, it's better to fix the package, since Spatialite depends on a lot of DLLs and it's quite likely that some of them will conflict with the ones from the conda installation.

For other versions, there might also be a conflict of the runtime libraries of MSVC and/or GCC, or missing symbols in the sqlite3.dll like I mentioned in my other post, etc.
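
To rule out a 32/64-bit mismatch between a downloaded DLL and the Python interpreter, the machine type can be read straight from the DLL's PE header. This is a sketch, not something from the thread: `pe_machine` is a hypothetical helper, and it relies only on the standard PE layout (the offset of the "PE" signature is stored at byte 0x3C, and the two bytes after the signature are the machine type).

```python
import struct

# Common machine type values from the PE/COFF format.
MACHINE_NAMES = {
    0x014C: "x86 (32-bit)",
    0x8664: "x64 (64-bit)",
    0xAA64: "ARM64",
}


def pe_machine(path):
    """Return a readable machine type for a Windows DLL or EXE."""
    with open(path, "rb") as fh:
        header = fh.read(64)
        if len(header) < 64 or header[:2] != b"MZ":
            raise ValueError("not a PE file (missing MZ header)")
        # Offset of the PE signature is stored at 0x3C in the DOS header.
        (pe_offset,) = struct.unpack_from("<I", header, 0x3C)
        fh.seek(pe_offset)
        signature = fh.read(6)
    if len(signature) < 6 or signature[:4] != b"PE\0\0":
        raise ValueError("not a PE file (missing PE signature)")
    # The machine field is the first 16-bit value after the signature.
    (machine,) = struct.unpack_from("<H", signature, 4)
    return MACHINE_NAMES.get(machine, hex(machine))
```

Comparing the result for mod_spatialite.dll with `platform.architecture()` of the running interpreter shows quickly whether the two match.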

What I meant by this:

> conda uses some custom code to set search paths, but I'm not sure what's the current situation.

...is that, by default, Windows will allow loading the DLLs as long as they are in the same folder. For example, if I downloaded Spatialite 5.0 RC1 and extracted it to c:\spatialite, running db.execute(r"select load_extension('c:\spatialite\mod_spatialite.dll')") used to work. I'm not sure if the change was in Anaconda itself or in Python 3.8 in general, but now the DLLs need to be in the path (it seems either one of those DLL folders inside the conda environment or the general executable path). Even though the required DLLs are side-by-side in c:\spatialite, they won't be loaded if they're not in the allowed paths, leading to The specified module could not be found.
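
The PATH workaround described above can be scoped to a single load attempt instead of changing the environment globally. The following is a minimal sketch under that assumption: `prepended_path` and `load_extension_from` are hypothetical helpers, and whether PATH is actually consulted for the dependent DLLs may vary with the Python and conda versions, as noted above.

```python
import contextlib
import os


@contextlib.contextmanager
def prepended_path(directory):
    """Temporarily put directory at the front of PATH."""
    original = os.environ.get("PATH")
    parts = [directory] + ([original] if original else [])
    os.environ["PATH"] = os.pathsep.join(parts)
    try:
        yield
    finally:
        # Restore the previous value exactly, including "not set".
        if original is None:
            os.environ.pop("PATH", None)
        else:
            os.environ["PATH"] = original


def load_extension_from(conn, extension_path):
    """Load an extension while its own folder is on PATH (hypothetical helper)."""
    folder = os.path.dirname(os.path.abspath(extension_path))
    with prepended_path(folder):
        conn.enable_load_extension(True)
        conn.load_extension(extension_path)
```

For the example above this would be called as `load_extension_from(db, r"c:\spatialite\mod_spatialite.dll")`, where that path is simply wherever the Spatialite binaries were extracted.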

> I'm totally new at conda and spatialite so I'm learning as I go along...

Please let me know if I could clarify anything.