openvenues / jpostal

Java/JNI bindings to libpostal for fast international street address parsing/normalization
MIT License

tomcat 7, failed libpostal_setup #5

Closed krustyfur closed 7 years ago

krustyfur commented 8 years ago

During the libpostal install, I set datadir to /var/lib. After the gradle build for jpostal I copied everything from src/main/jniLibs/lib to /usr/local/lib/jpostal.

In tomcat.conf, the java.library.path is set to /usr/local/lib/libpostal:/usr/local/lib/jpostal.

When I try to parse an address, Tomcat crashes with the following in catalina.log:

ERR   Error loading transliteration module
   at libpostal_setup (libpostal.c:1057) errno: No such file or directory

I poked around through the source of libpostal and traced that back to loading the transliteration.dat file, which is located at /var/lib/libpostal/transliteration/transliteration.dat.

What do I need to add to the tomcat configuration so that it finds the datafile?
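One way to sanity-check a java.library.path value like the one above is to split it on colons and look at what each entry actually contains. The following is a minimal sketch, not part of jpostal or Tomcat; `list_library_path` is a hypothetical helper name, and it only reports whether each entry is a directory and which `.so` files sit directly inside it.

```shell
#!/bin/sh
# Sketch: show which shared libraries each entry of a colon-separated
# library path contains. list_library_path is a made-up helper name.
list_library_path() {
    printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r dir; do
        # Skip empty entries such as those produced by a trailing colon.
        [ -n "$dir" ] || continue
        if [ -d "$dir" ]; then
            echo "dir      $dir"
            for lib in "$dir"/*.so*; do
                # The glob stays literal when nothing matches, so test it.
                if [ -e "$lib" ]; then
                    echo "  $lib"
                fi
            done
        else
            echo "NOT DIR  $dir"
        fi
    done
}

# Example: list_library_path /usr/local/lib/libpostal:/usr/local/lib/jpostal
```

An entry reported as `NOT DIR` contributes nothing to library lookup, so it is worth comparing the output against where the `.so` files were actually installed.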

albarrentine commented 8 years ago

Hm, datadir is only relevant to the C installation. I just pushed a change to libpostal master that logs the value of LIBPOSTAL_DATA_DIR when a setup error occurs. Can you pull latest, recompile, and check that it has the correct value? I've only ever encountered that kind of error after changing datadir/moving things around and forgetting to recompile, so the recompile alone may fix it.

krustyfur commented 8 years ago

ERR   Error loading transliteration module, LIBPOSTAL_DATA_DIR=/var/lib/libpostal
   at libpostal_setup (libpostal.c:1057) errno: No such file or directory

transliteration.dat actually lives in a subdirectory of that path: /var/lib/libpostal/transliteration/transliteration.dat

Can the full path to it be specified by a JVM option?

albarrentine commented 8 years ago

No, actually that looks right. LIBPOSTAL_DATA_DIR is a base dir, and libpostal appends the relative paths from there. That's all in the C lib and done at compile time, so JVM options wouldn't affect it.

Do the libpostal command-line utilities work? (./src/address_parser)
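Since the base dir plus relative paths is what matters, a quick check is to join the two and test each file. This is a rough sketch under assumptions: `check_libpostal_data` is a hypothetical name, and the relative paths are simply the ones that show up in the directory listing posted in this thread, not a list taken from the libpostal source.

```shell
#!/bin/sh
# Sketch: report which data files are readable under a libpostal data dir.
# The relative paths are copied from the listing in this thread; the
# function name is made up for illustration.
check_libpostal_data() {
    local base="$1"
    local missing=0
    local rel
    for rel in \
        transliteration/transliteration.dat \
        numex/numex.dat \
        address_expansions/address_dictionary.dat \
        address_parser/address_parser.dat \
        language_classifier/language_classifier.dat
    do
        if [ -r "$base/$rel" ]; then
            echo "ok       $base/$rel"
        else
            echo "MISSING  $base/$rel"
            missing=1
        fi
    done
    # Non-zero exit status when at least one file was not readable.
    return "$missing"
}

# Example: check_libpostal_data /var/lib/libpostal
```

Run it with the exact value that the error message prints for LIBPOSTAL_DATA_DIR, as the same user that runs the failing process, since `-r` tests readability for the current user only.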

krustyfur commented 8 years ago

No, exact same error message :/

Here's how the directories look. It doesn't appear to be a permissions issue, as everything is world-readable all the way in.

# ls -lR /var/lib/libpostal/
/var/lib/libpostal/:
total 40
drwxr-xr-x 2 root root 4096 Mar  2 21:06 address_expansions
drwxr-xr-x 2 root root 4096 Dec 12  2015 address_parser
drwxr-xr-x 2 root root 4096 Oct 11  2015 geodb
drwxr-xr-x 2 root root 4096 Jan 26 23:04 language_classifier
-rw-r--r-- 1 root root   29 Jun 30 22:18 last_updated
-rw-r--r-- 1 root root   29 Jun 30 22:19 last_updated_geo
-rw-r--r-- 1 root root   29 Jun 30 22:20 last_updated_language_classifier
-rw-r--r-- 1 root root   29 Jun 30 22:19 last_updated_parser
drwxr-xr-x 2 root root 4096 Mar  2 21:06 numex
drwxr-xr-x 2 root root 4096 Mar  2 21:06 transliteration

/var/lib/libpostal/address_expansions:
total 8996
-rw-r--r-- 1 root root 9210019 Mar 30 20:04 address_dictionary.dat

/var/lib/libpostal/address_parser:
total 415160
-rw-r--r-- 1 root root 396088240 Dec 12  2015 address_parser.dat
-rw-r--r-- 1 root root   5355329 Dec 12  2015 address_parser_phrases.trie
-rw-r--r-- 1 root root  23667987 Dec 12  2015 address_parser_vocab.trie

/var/lib/libpostal/geodb:
total 1069140
-rw-r--r-- 1 root root 138931289 Oct 11  2015 geodb_feature_graph.dat
-rw-r--r-- 1 root root 376008676 Oct 11  2015 geodb_features.trie
-rw-r--r-- 1 root root 125118316 Oct 11  2015 geodb_names.trie
-rw-r--r-- 1 root root   6470006 Oct 11  2015 geodb_postal_codes.dat
-rw-r--r-- 1 root root  48235352 Oct 11  2015 geodb.spi
-rw-r--r-- 1 root root 400020329 Oct 11  2015 geodb.spl

/var/lib/libpostal/language_classifier:
total 693968
-rw-r--r-- 1 root root 710617534 Jan 26 18:30 language_classifier.dat

/var/lib/libpostal/numex:
total 344
-rw-r--r-- 1 root root 351133 Mar 29 15:38 numex.dat

/var/lib/libpostal/transliteration:
total 18520
-rw-r--r-- 1 root root 18961260 Jan 26 05:13 transliteration.dat

Initially (before I even got to tomcat), the permissions looked like this:

drwxr-xr-x   8 root root  4096 Jul  1 16:18 .
drwxr-xr-x. 26 root root  4096 Jul  1 16:16 ..
drwxr-xr-x   2 1000  1000 4096 Mar  2 21:06 address_expansions
drwxr-xr-x   2 2002  1001 4096 Dec 12  2015 address_parser
drwxr-xr-x   2  501 games 4096 Oct 11  2015 geodb
drwxr-xr-x   2 2002  1001 4096 Jan 26 23:04 language_classifier
-rw-r--r--   1 root root    28 Jul  1 16:16 last_updated
-rw-r--r--   1 root root    28 Jul  1 16:16 last_updated_geo
-rw-r--r--   1 root root    28 Jul  1 16:18 last_updated_language_classifier
-rw-r--r--   1 root root    28 Jul  1 16:17 last_updated_parser
drwxr-xr-x   2 1000  1000 4096 Mar  2 21:06 numex
drwxr-xr-x   2 1000  1000 4096 Mar  2 21:06 transliteration

which I fixed with chown -R root: /var/lib/libpostal
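To confirm that a recursive chown like this caught everything, `find` can list whatever is still owned by someone else or is not world-readable. A small sketch, assuming a POSIX-style `find`; `find_ownership_problems` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print entries under a directory that are not owned by the given
# user (name or numeric id) or that lack the world-readable bit.
# find_ownership_problems is a made-up helper name.
find_ownership_problems() {
    local base="$1"
    local owner="$2"
    find "$base" \( ! -user "$owner" -o ! -perm -o=r \) -print
}

# Example: find_ownership_problems /var/lib/libpostal root
```

No output means every entry under the directory is owned by that user and readable by others.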

# ls -lR /usr/local/lib
/usr/local/lib:
total 17124
drwxr-xr-x 2 root root    4096 Jul  1 17:33 jpostal
-rw-r--r-- 1 root root 9364694 Jul  1 20:20 libpostal.a
-rwxr-xr-x 1 root root     973 Jul  1 20:20 libpostal.la
lrwxrwxrwx 1 root root      18 Jul  1 20:20 libpostal.so -> libpostal.so.0.0.0
lrwxrwxrwx 1 root root      18 Jul  1 20:20 libpostal.so.0 -> libpostal.so.0.0.0
-rwxr-xr-x 1 root root 8154285 Jul  1 20:20 libpostal.so.0.0.0
drwxr-xr-x 2 root root    4096 Jul  1 20:20 pkgconfig

/usr/local/lib/jpostal:
total 312
-rw-r--r-- 1 root root 45192 Jul  1 17:33 libjpostal_expander.a
-rwxr-xr-x 1 root root  1101 Jul  1 17:33 libjpostal_expander.la
-rwxr-xr-x 1 root root 37270 Jul  1 17:33 libjpostal_expander.so
-rwxr-xr-x 1 root root 37270 Jul  1 17:33 libjpostal_expander.so.0
-rwxr-xr-x 1 root root 37270 Jul  1 17:33 libjpostal_expander.so.0.0.0
-rw-r--r-- 1 root root 38452 Jul  1 17:33 libjpostal_parser.a
-rwxr-xr-x 1 root root  1087 Jul  1 17:33 libjpostal_parser.la
-rwxr-xr-x 1 root root 32632 Jul  1 17:33 libjpostal_parser.so
-rwxr-xr-x 1 root root 32632 Jul  1 17:33 libjpostal_parser.so.0
-rwxr-xr-x 1 root root 32632 Jul  1 17:33 libjpostal_parser.so.0.0.0

/usr/local/lib/pkgconfig:
total 4
-rw-r--r-- 1 root root 296 Jul  1 20:20 libpostal.pc

albarrentine commented 8 years ago

Ok, that looks fine and the files are the right size. I would focus on getting libpostal working command-line first, and then try jpostal.

Try installing locally as a non-root user with a datadir other than /var to see if it works:

# Assuming home dir has enough space, otherwise anywhere else owned by your user
cd ~
# If libpostal exists already in this dir, remove it
rm -rf libpostal
git clone https://github.com/openvenues/libpostal
cd libpostal
./bootstrap.sh
./configure LDFLAGS=-L/usr/lib64 --datadir=$(pwd)/data --prefix=$(realpath $(pwd)) --bindir=$(realpath $(pwd)/bin)
make install
./src/address_parser

krustyfur commented 8 years ago

Bugger. That works :(

It also works when run as root (sudo -i). However, the data dir still ends up with messed-up ownership (see previous post) when the install is run as root, but not when it is built as a standard user.

SELinux is disabled, so it shouldn't be interfering with file permissions.

I have to install libpostal outside of a user directory because Tomcat lacks permission to enter user directories.
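Whether a given account can reach a file depends on the execute (search) bit of every directory above it, so it can help to walk up from the file and test each parent. The sketch below is only an illustration: `check_traversal` is a hypothetical name, and it tests access for whichever user runs it, so to answer the Tomcat question it would need to be run under the account Tomcat uses (that account's name is not given in this thread).

```shell
#!/bin/sh
# Sketch: walk from a file's parent directory up to the root and report
# whether the current user may enter each directory.
# check_traversal is a made-up helper name.
check_traversal() {
    local dir
    local status=0
    dir=$(dirname "$1")
    while :; do
        if [ -x "$dir" ]; then
            echo "ok       $dir"
        else
            echo "BLOCKED  $dir"
            status=1
        fi
        # Stop at the filesystem root, or at "." for a relative path.
        case "$dir" in
            /|.) break ;;
        esac
        dir=$(dirname "$dir")
    done
    return "$status"
}

# Example: check_traversal /var/lib/libpostal/transliteration/transliteration.dat
```

A `BLOCKED` line names the first directory that the current user cannot pass through, which is the kind of restriction that keeps a service account out of home directories.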

albarrentine commented 7 years ago

Cleaning up issues, closing if there are no objections.