openvenues / libpostal

A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
MIT License

Error loading transliteration module, dir=(null) at libpostal_setup_datadir (libpostal.c:266) errno:No such file or directory #365

Open Goof2018 opened 6 years ago

Goof2018 commented 6 years ago

Hello everyone,

I tried to build libpostal on Windows with msys2.

Installation (Windows)


For Windows the build procedure currently requires MSys2 and MinGW. These can be downloaded from the MSys2 website; please follow the instructions there for installation.

Please ensure Msys2 is up-to-date by running:

pacman -Syu

Install the following prerequisites:

pacman -S autoconf automake curl git make libtool gcc mingw-w64-x86_64-gcc

Then to build the C library:

git clone <repository URL>
cd libpostal
cp -rf windows/* ./
./bootstrap.sh

./configure --datadir=$DATA_DIR/home/User/libpostal/data --disable-data-download

./src/libpostal_data download all $DATA_DIR/home/User/libpostal/data
make -j4
make install

"C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.14.26428\bin\Hostx64\x64\lib.exe" /def:libpostal.def /out:libpostal.lib /machine:x64

But I get this error:

C:\msys64\home\User\libpostal\src\libpostal.exe "Quatre vingt douze Ave des Champs-Élysées"
ERR Error loading transliteration module, dir=(null) at libpostal_setup_datadir (libpostal.c:266) errno:No such file or directory

Files in C:\msys64\home\User\libpostal\data:

last_updated last_updated_language_classifier

1 File(s), 8,737,530 bytes

address_parser_phrases.dat address_parser_postal_codes.dat

4 File(s), 1,872,722,323 bytes

1 File(s), 71 bytes

1 File(s), 77,823,270 bytes

1 File(s), 396,966 bytes

1 File(s), 19,570,085 bytes

How could I resolve this?

Thank you

AeroXuk commented 6 years ago

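./configure automatically appends /libpostal to the --datadir value, so $DATA_DIR/home/User/libpostal/data becomes $DATA_DIR/home/User/libpostal/data/libpostal. This is where the library now looks for libpostal's data files by default, so the downloaded data has to sit in that libpostal subdirectory rather than directly in data.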
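To make that concrete, here is a minimal shell sketch of the directory the library ends up looking in. The function name effective_datadir is made up for illustration and is not part of libpostal; it only mirrors the "append /libpostal" behaviour described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of libpostal): mirrors how ./configure
# appends /libpostal to whatever was passed as --datadir.
effective_datadir() {
    # Drop one trailing slash, if present, so we never emit "//libpostal".
    printf '%s/libpostal\n' "${1%/}"
}

# Example with the path from this thread.
effective_datadir "/home/User/libpostal/data"
```

If there is no transliteration directory under the printed path (I'm assuming that is the subdirectory name the loader expects), the data was probably downloaded one level too high.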

Goof2018 commented 6 years ago

Thank you very much. That solves one problem.

When I run C:\msys64\home\User\libpostal\src\address_parser.exe with this input:

Quatre vingt douze Ave des Champs-Élysées

I get the following warnings. How can I fix this?

WARN invalid UTF-8 at transliterate (transliterate.c:791) errno:No such file or directory
WARN invalid UTF-8 at transliterate (transliterate.c:791) errno:No such file or directory
WARN invalid UTF-8 at transliterate (transliterate.c:791) errno:No such file or directory
WARN invalid UTF-8 at transliterate (transliterate.c:791) errno:No such file or directory

AeroXuk commented 6 years ago

This is a problem with the way the Windows command line works rather than with libpostal:

Loading models...

Welcome to libpostal's address parser.

Type in any address to parse and print the result.

Special commands:
.exit to quit the program

> 10 Downing Street, Westminster, London, SW1A 2AA

{
  "house_number": "10",
  "road": "downing street",
  "city_district": "westminster",
  "city": "london",
  "postcode": "sw1a 2aa"
}

> Quatre vingt douze Ave des Champs-Élysées
WARN  invalid UTF-8
 at transliterate (transliterate.c:791) errno:No such file or directory
WARN  invalid UTF-8
 at transliterate (transliterate.c:791) errno:No such file or directory
WARN  invalid UTF-8
 at transliterate (transliterate.c:791) errno:No such file or directory
WARN  invalid UTF-8
 at transliterate (transliterate.c:791) errno:No such file or directory

I have a C# .NET library for using libpostal on Windows which correctly handles UTF-8 characters. The following C# code:

string exampleAddress = "Quatre vingt douze Ave des Champs-Élysées";

LibPostal libPostal = LibPostal.GetInstance();

// Parse the address into labelled components.
Console.WriteLine("Test Parse:");
var addressParserOptions = libPostal.GetAddressParserDefaultOptions();
using (var response = libPostal.ParseAddress(exampleAddress, addressParserOptions))
    foreach (var x in response.Results)
        Console.WriteLine("{0}: {1}", x.Key, x.Value);

// Expand the address into its normalized forms.
Console.WriteLine("Test Expand:");
var normaliseOptions = libPostal.GetAddressExpansionDefaultOptions();
using (var expand = libPostal.ExpandAddress(exampleAddress, normaliseOptions))
    foreach (var x in expand.Expansions)
        Console.WriteLine(x);

produces this output:


Test Parse:
road: quatre vingt douze ave des champs-élysées

Test Expand:
92 avenue des champs-elysees
92 avenue des champs elysees

AeroXuk commented 6 years ago

Using .NET's Console.ReadLine() seems to handle UTF-8 characters better. Here is a simplified version of address_parser.exe in C#:

using LibPostalNet;
using System;

namespace AddressParser
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            // Optional first argument: directory containing the libpostal data.
            string address_parser_dir = null;

            if (args.Length > 0)
                address_parser_dir = args[0];

            Console.WriteLine("Loading models...");

            LibPostal libPostal = LibPostal.GetInstance(address_parser_dir);

            if (!libPostal.IsParserLoaded)
            {
                Console.Write("Failure while loading.");
                return; // nothing can be parsed without the models
            }

            Console.WriteLine("Welcome to libpostal's address parser.");
            Console.WriteLine("Type in any address to parse and print the result.");
            Console.WriteLine("Special commands:");
            Console.WriteLine(".exit to quit the program");

            string input = string.Empty;
            while (true)
            {
                Console.Write("> ");
                input = Console.ReadLine();

                // TODO: Add .language & .country support
                // ReadLine() returns null at end of input, so treat that like .exit.
                if (input == null || string.Equals(input, ".exit", StringComparison.InvariantCultureIgnoreCase))
                    break;
                else if (string.Equals(input, ".print_features", StringComparison.InvariantCultureIgnoreCase))
                {
                    libPostal.PrintFeatures = true;
                    continue;
                }
                else if (input.Length < 1)
                    continue;

                // Parse the line and print each labelled component.
                var options = libPostal.GetAddressParserDefaultOptions();
                using (var parsed = libPostal.ParseAddress(input, options))
                {
                    foreach (var x in parsed.Results)
                        Console.WriteLine("{0}: {1}", x.Key, x.Value);
                }
            }
        }
    }
}

AeroXuk commented 6 years ago

Actually, I think I've found an easier answer for you.

Run this command before starting address_parser.exe to switch the command prompt code page to UTF-8:

chcp 65001
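For what it's worth, the warning can be reproduced without libpostal. Below is a minimal sketch, assuming an iconv binary is on the PATH; the helper name is_valid_utf8 is made up for illustration. In a single-byte code page such as Windows-1252, É is the lone byte 0xC9, which is not valid UTF-8 on its own, whereas UTF-8 encodes it as 0xC3 0x89. The actual console code page on a given machine may differ, but the effect is the same.

```shell
#!/bin/sh
# Hypothetical helper (not part of libpostal): succeeds only if its
# argument is valid UTF-8. Assumes an iconv binary is available.
is_valid_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Octal escapes keep this portable across printf implementations.
utf8_bytes=$(printf 'Champs-\303\211lys\303\251es')   # É = C3 89, é = C3 A9
cp1252_bytes=$(printf 'Champs-\311lys\351es')         # É = C9, é = E9

if is_valid_utf8 "$utf8_bytes"; then
    echo "UTF-8 bytes: valid"
fi
if ! is_valid_utf8 "$cp1252_bytes"; then
    echo "Windows-1252 bytes: invalid UTF-8"
fi
```

Switching the code page with chcp 65001 makes the console hand over the first form instead of the second.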

Goof2018 commented 6 years ago

Thank you very much. I'll give it a try.

jbelien commented 5 years ago

Hello, I ran into the same issue:

debian@development:~/libpostal/src$ ./address_parser
Loading models...
ERR   Error loading transliteration module, dir=(null)
   at libpostal_setup_datadir (libpostal.c:266) errno: No such file or directory


How can I fix this? It seems that @Goof2018 managed to fix it but didn't explain how (and I'm on Debian 9).

Thanks a lot!

jbelien commented 5 years ago

UPDATE: I deleted everything and did a fresh install, and it seems to be fixed! (I now run into errno: Cannot allocate memory, but I guess that's not related.)

Good thing to know (quoting AeroXuk's comment above):

./configure automatically appends /libpostal to the --datadir value, so $DATA_DIR/home/User/libpostal/data becomes $DATA_DIR/home/User/libpostal/data/libpostal. This is where the library now looks for libpostal's data files by default.