hideaki-t / sqlite-fts-python

A Python binding of SQLite Full Text Search Tokenizer
MIT License

Segmentation fault #14

Closed (teucer closed 4 years ago)

teucer commented 5 years ago

Getting "Segmentation fault (core dumped)" on Linux due to the line

hideaki-t commented 5 years ago

Thank you for the report.

Because it segfaults at that line, I think the problem is likely related to the environment. Can you provide a few more details about your environment and how to reproduce it?

The following information would be helpful.

Here is the info for my local environments.

$ python -V
Python 3.7.3
$ uname -m
x86_64
$ python -c 'import sqlite3;print(sqlite3.sqlite_version)'
3.28.0

$ pypy3 -V
Python 3.6.1 (784b254d669919c872a505b807db8462b6140973, May 09 2019, 13:17:30)
[PyPy 7.1.1-beta0 with GCC 8.3.0]
$ pypy3 -c 'import sqlite3;print(sqlite3.sqlite_version)'
3.28.0
$ python3 -V
Python 3.7.3rc1
$ uname -m
aarch64
$ python3 -c 'import sqlite3;print(sqlite3.sqlite_version)'
3.27.2

Also see https://github.com/hideaki-t/sqlite-fts-python/blob/master/.travis.yml
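The same details can also be gathered in one go from Python. This is only a minimal sketch using the standard library; the helper name `collect_env_info` is made up for illustration and is not part of sqlitefts.

```python
import platform
import sqlite3
import sys


def collect_env_info():
    """Return the interpreter, architecture and SQLite details asked for above."""
    return {
        # e.g. "3.7.3"; the first token of sys.version is the version number
        "python": sys.version.split()[0],
        # "CPython", "PyPy", ...
        "implementation": platform.python_implementation(),
        # same information as `uname -m`
        "machine": platform.machine(),
        # version of the SQLite library the sqlite3 module is linked against
        "sqlite": sqlite3.sqlite_version,
    }


for key, value in collect_env_info().items():
    print(f"{key}: {value}")
```

Note that this reports the SQLite library used by the standard sqlite3 module only; other bindings may be linked against a different SQLite build.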

teucer commented 5 years ago

$ python -V
Python 3.6.5 :: Anaconda, Inc.
$ uname -m
x86_64
$ python -c 'import sqlite3;print(sqlite3.sqlite_version)'
3.23.1

I am using apsw.

Example for reproduction:

import apsw
from sqlitefts.tokenizer import dll, get_db_from_connection
from sqlitefts.fts5 import ffi

con = apsw.Connection("data-dev.sqlite")
db = get_db_from_connection(con)
pStmt = ffi.new('sqlite3_stmt**')
rc = dll.sqlite3_prepare_v2(db, b'SELECT fts5(?1)', -1, pStmt, ffi.NULL)

hideaki-t commented 5 years ago

I tried your snippet, and it does not cause any error.

Can you try this?

$ python -c 'import apsw;print(apsw.sqlitelibversion(), apsw.compile_options)'

$ ./.tox/py37-apsw-linux/bin/python -c 'import apsw;print(apsw.sqlitelibversion(), apsw.compile_options)'
3.28.0 ('COMPILER=gcc-8.3.0', 'ENABLE_COLUMN_METADATA', 'ENABLE_DBSTAT_VTAB', 'ENABLE_FTS3', 'ENABLE_FTS3_TOKENIZER', 'ENABLE_FTS4', 'ENABLE_FTS5', 'ENABLE_JSON1', 'ENABLE_RTREE', 'ENABLE_UNLOCK_NOTIFY', 'HAVE_ISNAN', 'MAX_EXPR_DEPTH=10000', 'MAX_VARIABLE_NUMBER=250000', 'SECURE_DELETE', 'TEMP_STORE=1', 'THREADSAFE=1')

$ ./.tox/py37-apswa-linux/bin/python -c 'import apsw;print(apsw.sqlitelibversion(), apsw.compile_options)'
3.27.2 ('COMPILER=gcc-8.3.0', 'ENABLE_API_ARMOR', 'ENABLE_FTS3', 'ENABLE_FTS3_PARENTHESIS', 'ENABLE_FTS4', 'ENABLE_FTS5', 'ENABLE_GEOPOLY', 'ENABLE_ICU', 'ENABLE_JSON1', 'ENABLE_RBU', 'ENABLE_RTREE', 'ENABLE_STAT4', 'THREADSAFE=1')

teucer commented 5 years ago

$ python -c 'import apsw;print(apsw.sqlitelibversion(), apsw.compile_options)'
3.28.0 ('COMPILER=gcc-6.2.0', 'ENABLE_API_ARMOR', 'ENABLE_FTS3', 'ENABLE_FTS3_PARENTHESIS', 'ENABLE_FTS4', 'ENABLE_FTS5', 'ENABLE_GEOPOLY', 'ENABLE_ICU', 'ENABLE_JSON1', 'ENABLE_RBU', 'ENABLE_RTREE', 'ENABLE_STAT4', 'THREADSAFE=1')

With sqlite3 it works:

import sqlite3
from sqlitefts.tokenizer import dll, get_db_from_connection
from sqlitefts.fts5 import ffi

con = sqlite3.Connection("data-dev.sqlite")
db = get_db_from_connection(con)
pStmt = ffi.new('sqlite3_stmt**')
dll = ffi.dlopen("sqlite3")
rc = dll.sqlite3_prepare_v2(db, b'SELECT fts5(?1)', -1, pStmt, ffi.NULL)

I also ran the following:

con.cursor().execute('pragma compile_options;').fetchall()
[('COMPILER=gcc-7.2.0',), ('ENABLE_COLUMN_METADATA',), ('ENABLE_DBSTAT_VTAB',), ('ENABLE_FTS3',), ('ENABLE_FTS3_TOKENIZER',), ('ENABLE_RTREE',), ('ENABLE_UNLOCK_NOTIFY',), ('MAX_EXPR_DEPTH=10000',), ('MAX_VARIABLE_NUMBER=250000',), ('SECURE_DELETE',), ('THREADSAFE=1',)]
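The sqlite3 list above has no ENABLE_FTS5 entry, while the apsw build does. A quick way to check this programmatically is sketched below; the helper names `compile_options` and `has_fts5` are made up for illustration, and the check only tells you what was compiled into the library, not whether an extension was loaded later.

```python
import sqlite3


def compile_options(con):
    """Return the compile-time options of the SQLite library behind `con`."""
    # Each row is a 1-tuple such as ('ENABLE_FTS5',) or ('THREADSAFE=1',)
    return {row[0] for row in con.execute("PRAGMA compile_options")}


def has_fts5(con):
    """True if FTS5 was compiled into the SQLite library behind `con`."""
    return "ENABLE_FTS5" in compile_options(con)


con = sqlite3.connect(":memory:")
print("FTS5 compiled in:", has_fts5(con))
```

Running the same check against each binding makes it easy to see whether the two are backed by differently configured SQLite builds.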

hideaki-t commented 5 years ago

Now I can reproduce the issue and will start debugging.

$ conda create --name py365 python==3.6.5
...
$ conda activate py365
$ conda install -c conda-forge apsw
...
$ conda install cffi
...
$ python -V
Python 3.6.5 :: Anaconda, Inc.

The apsw package from conda-forge (not built with the amalgamation) works fine.

# added apsw.apswversion()
$ python -c 'import apsw;print(apsw.apswversion(), apsw.sqlitelibversion(), apsw.compile_options)'
3.25.2-r1 3.28.0 ('COMPILER=gcc-7.3.0', 'ENABLE_COLUMN_METADATA', 'ENABLE_DBSTAT_VTAB', 'ENABLE_FTS3_TOKENIZER', 'ENABLE_FTS4', 'ENABLE_FTS5', 'ENABLE_JSON1', 'ENABLE_RTREE', 'ENABLE_UNLOCK_NOTIFY', 'MAX_EXPR_DEPTH=10000', 'MAX_VARIABLE_NUMBER=250000', 'SECURE_DELETE', 'THREADSAFE=1')

I was able to hit the issue using apsw built with the SQLite amalgamation.

$ pip install https://github.com/rogerbinns/apsw/releases/download/3.27.2-r1/apsw-3.27.2-r1.zip --global-option=fetch --global-option=--version=3.27.2 --global-option=--sqlite --global-option=build --global-option=--enable-all-extensions
$ python -c 'import apsw;print(apsw.apswversion(), apsw.sqlitelibversion(), apsw.compile_options)'
3.27.2-r1 3.27.2 ('COMPILER=gcc-8.3.0', 'ENABLE_API_ARMOR', 'ENABLE_FTS3', 'ENABLE_FTS3_PARENTHESIS', 'ENABLE_FTS4', 'ENABLE_FTS5', 'ENABLE_GEOPOLY', 'ENABLE_ICU', 'ENABLE_JSON1', 'ENABLE_RBU', 'ENABLE_RTREE', 'ENABLE_STAT4', 'THREADSAFE=1')

teucer commented 5 years ago

A workaround for me was to compile FTS5 as a loadable extension and load it dynamically with the sqlite3 package.
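Roughly, the loading step could look like the sketch below. The path `./fts5.so` and the helper name `load_fts5_extension` are assumptions for illustration; the real file name depends on how and where the extension was compiled, and some Python builds ship a sqlite3 module without extension loading support.

```python
import os
import sqlite3


def load_fts5_extension(con, path):
    """Try to load a separately compiled FTS5 extension; return True on success."""
    if not os.path.exists(path):
        return False
    # Some Python builds omit extension loading, so the method may be missing
    if not hasattr(con, "enable_load_extension"):
        return False
    con.enable_load_extension(True)
    try:
        con.load_extension(path)
    except sqlite3.OperationalError:
        return False
    finally:
        # Switch extension loading off again once we are done
        con.enable_load_extension(False)
    return True


con = sqlite3.connect(":memory:")
# "./fts5.so" is a hypothetical path to the compiled extension
print("FTS5 extension loaded:", load_fts5_extension(con, "./fts5.so"))
```

The helper returns False instead of raising, so the caller can fall back to another setup when the extension is unavailable.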

hideaki-t commented 5 years ago

I think I understand why, but I have no solution right now. I'd say this is a limitation; I will document it in the README.

hideaki-t commented 5 years ago

Interesting. So once FTS5 is loaded via sqlite3, apsw works okay?

teucer commented 5 years ago

I decided to use sqlite3.

On Windows I had a sqlite3.dll with FTS5 enabled (I replaced it manually). I could then register the tokenizer with apsw, but only after having registered it with sqlite3...