Closed gianvi closed 8 years ago
See issue #3. For Java with native modules the .so/.jniLib files built by the JNI extension itself need to be on java.library.path. They can be found in build/natives after running gradle assemble. This Gradle plugin might be helpful: https://github.com/cjstehno/coffeaelectronica/wiki/Going-Native-with-Gradle
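A quick way to see why the JVM raises UnsatisfiedLinkError is to check what it is actually searching for. The sketch below is not part of jpostal: `NativeLibLocator` is a hypothetical helper that maps a short library name (the `jpostal_expander` and `jpostal_parser` names are taken from the error messages in this thread) to the platform-specific file name and looks for it in each directory on java.library.path.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical diagnostic helper (not part of jpostal). */
final class NativeLibLocator {

    /** File name the JVM derives from a short library name, e.g. "libfoo.so" on Linux. */
    static String expectedFileName(String libName) {
        return System.mapLibraryName(libName);
    }

    /** Returns the first match for the mapped file name in a path-separator-delimited list of directories. */
    static Optional<Path> find(String libName, String searchPath) {
        if (searchPath == null || searchPath.isEmpty()) {
            return Optional.empty();
        }
        String fileName = expectedFileName(libName);
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String searchPath = System.getProperty("java.library.path");
        System.out.println("java.library.path = " + searchPath);
        // Library names as they appear in the UnsatisfiedLinkError messages.
        for (String lib : new String[] {"jpostal_expander", "jpostal_parser"}) {
            System.out.println(expectedFileName(lib) + " -> "
                    + find(lib, searchPath).map(Path::toString).orElse("NOT FOUND"));
        }
    }
}
```

Running it with the same JVM options as the failing program (for example the same `-Djava.library.path=...` value, if one is set) should show whether the directory containing the built libraries is on the path at all.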
Thanks @thatdatabaseguy, it's working correctly now! By the way, the parser and expander are awesome, but I'm now dealing with the lack of a "language" setting. Is there automatic selection, or some specific module/strategy to specify it? Is there a functional API? Sorry for all these questions, but it's very difficult to find docs on this. I'm trying to write an entity matcher in Spark, in Scala, and I'm using a very noisy DB in which 70% of the places are Italian. I was looking for the other functionality and properties, but it's not easy to understand how, for example, the language can be chosen.
Bravissimo.
For expansion, if you know the language is Italian a priori and don't want to use the automatic language classifier, you can specify to use only the Italian dictionaries as follows:
String address = "V. Benedetta, no. 25 00153 Roma";
String[] languages = {"it"};
ExpanderOptions options = new ExpanderOptions.Builder().languages(languages).build();
AddressExpander expander = AddressExpander.getInstance();
String[] expansions = expander.expandAddressWithOptions(address, options);
Note that you may want to use the automatic language classification anyway if some of the addresses are in Valle d'Aosta or Trentino-Alto Adige because some of the street names could be in French or German respectively, at least in the data sets I've seen.
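If the region is already known for each record, one middle ground is to choose the language list per region instead of hard-coding Italian everywhere. This is only a sketch under assumptions: `ExpansionLanguages` and its region keys are hypothetical, the mapping just encodes the two regions mentioned above, and whether an explicit list does better than the automatic classifier on your data is untested here.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: picks expansion languages from a known Italian region. */
final class ExpansionLanguages {

    private static final String[] DEFAULT = {"it"};
    private static final Map<String, String[]> BY_REGION = new HashMap<>();

    static {
        // Regions where street names may also be in French or German.
        BY_REGION.put("valle d'aosta", new String[] {"it", "fr"});
        BY_REGION.put("trentino-alto adige", new String[] {"it", "de"});
    }

    /** Returns a fresh array of language codes; falls back to Italian only. */
    static String[] forRegion(String region) {
        if (region == null) {
            return DEFAULT.clone();
        }
        String[] langs = BY_REGION.get(region.trim().toLowerCase(Locale.ROOT));
        return (langs != null ? langs : DEFAULT).clone();
    }
}
```

The returned array can then be passed to `languages(...)` on the `ExpanderOptions.Builder` in place of the fixed `{"it"}` array.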
For parsing, I'd need to look at the specific mistakes it's making. The current version of libpostal is trained on ~2.7M addresses in Italy. There's a new parser being developed that's trained on > 3M Italian addresses as well as simple place queries which help with parsing most of the tiny "località" that one might see in more rural addresses. The new parser also randomly appends sub-building information to the addresses so it can parse phrases like "pº 2" or "sala 123" as well as "casella postale" addresses. You may be interested in checking that out when it's released into master. The parser will probably never achieve 100% accuracy, but the next release is a major improvement nonetheless.
Hi dbguy, and thanks for the support. Let me ask you a few more questions about libpostal. As I told you, I'm using it to build an entity matcher in Spark (which I hope to release soon :) ). I'm using a DB whose entity addresses and features are very noisy and sparse. An example "row" looks like this:
| shopid | name | cap | city | indir | concat(indir, ' ', city, ' ', cap) | expanded_address |
| 27e0d398-c1a6-... | Toorr | 8001 | Zürich | | Zürich 8001 | zürich 8001 |
Now, after expansion and parsing (and other feature extraction techniques) with libpostal, I obtain this:
| shopid | ... | test_house | test_house_number | test_road | test_neighbourhood | test_suburb | test_postcode | test_city | test_state | test_country | test_country_code | test_city_district | test_state_district | test_altro |
| 27e0d398-c1a6-421... | Zürich 8001 | zürich 8001 | [zürich,null,null... | zürich | null | null | null | null | 8001 | null | null | null | null | null | null | null | gucci |
Well, the problem here is that Zurich (the city) is recognized as a road. I'm wondering whether I'm missing some step before parsing (I'm sure I am, because I still don't know libpostal's internals very well), so my questions are:
1) Does libpostal have dictionaries of cities or something similar (in addition to the per-country street-word dictionaries that I've already looked at)?
2) How can I parse with, for example, a given "city" or a given country? (I suppose there is a way to specify an input component that is already known.)
Thanks in advance for your support and, as I asked before, any suggestions, links, APIs or docs on libpostal are very much appreciated!
Hm, for some reason I'm seeing very few postcodes in the final training data for Switzerland, though they definitely exist in OpenStreetMap. That's likely the source of the problem: as far as the model knows from what it's been given, "Zurich" followed by a number is actually more likely to be a road than a city + postcode (as in Uruguay, Spain, etc., where it's technically "Calle Zurich" but the "Calle" is usually omitted).
I'll look into this for the next parser release. Probably something simple. It might be affecting some other countries as well, though I've personally spot-checked the training data in 30-40 countries and this is the first time I've seen this issue.
To quickly answer your two questions.
Hi, I need this library working in my Java program, but it generates the following error:
run:
Exception in thread "main" java.lang.UnsatisfiedLinkError: no jpostal_expander in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
at java.lang.Runtime.loadLibrary0(Runtime.java:870)
at java.lang.System.loadLibrary(System.java:1122)
at com.mapzen.jpostal.AddressExpander.
See issue #3. The .so/.jniLib files built by the JNI extension also need to be on java.library.path.
Where can I get that extension?
I am running this Java program in NetBeans on Windows, not on Linux. build.gradle contains
task buildJniLib(type:Exec) { commandLine './build.sh' }
where ./build.sh is only available on Linux.
libpostal only recently added Windows support, and there are a few conditions currently. You can now build the C library (https://github.com/openvenues/libpostal) with MSYS2/MinGW64 or WSL, but not sure about other environments/compilers. That should work for the JNI extensions here as well, since it's also an autotools build.
@AeroXuk has been working on Windows support for the C library as well as the C# bindings, and can likely answer most questions related to running libpostal on Windows.
////////////////// MAIN CLASS /////////////////////////
package com.mapzen.jpostal;

public class Library {
    public static void main(String args[]) {
        AddressParser p = AddressParser.getInstance();
        ParsedComponent[] components = p.parseAddress("The Book Club 100-106 Leonard St, Shoreditch, London, Greater London, EC2A 4RH, United Kingdom");

        for (ParsedComponent c : components) {
            System.out.printf("%s: %s\n", c.getLabel(), c.getValue());
        }
    }
}

///////////////////////////// BUILD.GRADLE ///////////////////////////
apply plugin: 'application'

repositories { mavenCentral() }
task buildJniLib(type:Exec) { commandLine './build.sh' }
compileJava.dependsOn(buildJniLib)
dependencies { testCompile 'junit:junit:4.+' }
tasks.withType(Test) { systemProperty "java.library.path", "src/main/jniLibs" }
mainClassName = "com.mapzen.jpostal.Library"
////////////////////////////////////ERROR GENERATED/////////////////////////////////////////////
Exception in thread "main" java.lang.UnsatisfiedLinkError: no jpostal_parser in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
at java.lang.Runtime.loadLibrary0(Runtime.java:870)
at java.lang.System.loadLibrary(System.java:1122)
at com.mapzen.jpostal.AddressParser.
//////////////////////////////// src/main/jniLibs FOLDER CONTAINS ////////////////////////////////
1- libpostal_expander --------> with extensions (.so, .la, .so.0 and .so.0.0.0)
2- libpostal_parser --------> with extensions (.so, .la, .so.0 and .so.0.0.0)
//////////////////////////////////////MY PATH VARIABLE ////////////////////////////////////////////
$PKG_CONFIG_PATH /usr/local/lib/pkgconfig/libpostal.pc
PLEASE SOMEONE HELP ME RESOLVE THIS PROBLEM
Hi, I'm trying to run the package but I got these errors (after a lot of work to build and compile everything from libpostal!!). What I've done is:
Hello JPostal!
Exception in thread "main" java.lang.UnsatisfiedLinkError: no jpostal_expander in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
at java.lang.Runtime.loadLibrary0(Runtime.java:870)
at java.lang.System.loadLibrary(System.java:1122)
at MerchantBuilder.examples.AddressExpander.(AddressExpander.java:7)
at MerchantBuilder.TestJPostal.main(TestJPostal.java:24)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
Please, can you help me with this? I really want to explore the possibilities of libpostal in NLP workflows, but it's really hard to set up the whole environment because it isn't available through sbt/Maven!! Thanks