dalab / pboh-entity-linking

Source code for the paper "Probabilistic Bag-Of-Hyperlinks Model for Entity Linking", http://dl.acm.org/citation.cfm?id=2882988

scala -J-Xmx90g target/PBoH-1.0-SNAPSHOT-jar-with-dependencies.jar testPBOHOnAllDatasets max-product #6

Open YolandaRay opened 6 years ago

YolandaRay commented 6 years ago

I downloaded the indexes above, updated their locations in the code, and compiled with 'mvn package'.

After I run "scala -J-Xmx90g target/PBoH-1.0-SNAPSHOT-jar-with-dependencies.jar testPBOHOnAllDatasets max-product",

this message appears on the console:

Loaded mention index. Size = 21540764
Loading word freq index p(w) from /home/scl/nel/media/hofmann-scratch/other-data/Wikipedia/WikipediaPlainText/textFromAllWikipedia2014Feb.txt_word_frequencies...
Done loading word freq index p(w). Size = 582355
java.lang.UnsatisfiedLinkError: org.fusesource.leveldbjni.internal.NativeOptions.init()V
    at org.fusesource.leveldbjni.internal.NativeOptions.init(Native Method)
    at org.fusesource.leveldbjni.internal.NativeOptions.(NativeOptions.java:54)
    at org.fusesource.leveldbjni.JniDBFactory$OptionsResourceHolder.init(JniDBFactory.java:98)
    at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:167)
    at index.WordEntityProbsIndex.(WordEntityProbsIndex.scala:27)
    at index.AllIndexesBox.(AllIndexesBox.scala:69)
    at eval.AllIndexesForEval$.(EvalOnDatasets.scala:22)
    at eval.AllIndexesForEval$.(EvalOnDatasets.scala)
    at eval.EvalOnDatasets$.evalOneDoc(EvalOnDatasets.scala:149)
    at eval.EvalOnDatasets$$anonfun$1.apply(EvalOnDatasets.scala:129)
    at eval.EvalOnDatasets$$anonfun$1.apply(EvalOnDatasets.scala:129)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
    at eval.EvalOnDatasets$.evalOneDatasetInParallel(EvalOnDatasets.scala:129)
    at eval.EvalOnDatasets$.evalAllDatasets(EvalOnDatasets.scala:70)
    at el.EL_LBP_Spark$.main(EL_LBP_Spark.scala:122)
    at el.EL_LBP_Spark.main(EL_LBP_Spark.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at scala.tools.nsc.util.ScalaClassLoader$$anonfun$run$1.apply(ScalaClassLoader.scala:71)
    at scala.tools.nsc.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
    at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.asContext(ScalaClassLoader.scala:139)
    at scala.tools.nsc.util.ScalaClassLoader$class.run(ScalaClassLoader.scala:71)
    at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.run(ScalaClassLoader.scala:139)
    at scala.tools.nsc.CommonRunner$class.run(ObjectRunner.scala:28)
    at scala.tools.nsc.JarRunner$.run(MainGenericRunner.scala:16)
    at scala.tools.nsc.CommonRunner$class.runAndCatch(ObjectRunner.scala:35)
    at scala.tools.nsc.JarRunner$.runJar(MainGenericRunner.scala:28)
    at scala.tools.nsc.MainGenericRunner.runTarget$1(MainGenericRunner.scala:78)
    at scala.tools.nsc.MainGenericRunner.process(MainGenericRunner.scala:96)
    at scala.tools.nsc.MainGenericRunner$.main(MainGenericRunner.scala:105)
    at scala.tools.nsc.MainGenericRunner.main(MainGenericRunner.scala)

Do you have any idea what is going wrong? Thanks.

octavian-ganea commented 6 years ago

This is a leveldb error. You can try to debug it (e.g. see https://github.com/fusesource/leveldbjni/issues/74), or you can remove the use of leveldb from src/main/scala/index/WordEntityProbsIndex.scala by commenting out the lines that contain entToWordFreqDb and uncommenting the lines that contain entToWordFreq. However, the full entToWordFreq dictionary would then be held in RAM, which unfortunately might require a few tens of GB, since it contains all (entity, word) frequency pairs.
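
To illustrate the trade-off, here is a minimal sketch of what an in-memory (entity, word) frequency table could look like. It is not the code in WordEntityProbsIndex.scala: the object name InMemoryWordFreqSketch, the helper methods, and the key/value types (Int entity ids, String words, Int counts) are assumptions made for illustration, and the real entToWordFreq may be declared differently. It only shows why this variant needs no native library but keeps every pair on the JVM heap.

```scala
import scala.collection.mutable

// Hypothetical stand-in for an in-memory replacement of the leveldb-backed index.
// The types used here are assumptions, not the repository's actual declarations.
object InMemoryWordFreqSketch {
  // entity id -> (word -> count); everything lives on the JVM heap.
  private val entToWordFreq =
    mutable.HashMap.empty[Int, mutable.HashMap[String, Int]]

  // Add `count` occurrences of `word` for `entity`.
  def add(entity: Int, word: String, count: Int): Unit = {
    val words = entToWordFreq.getOrElseUpdate(entity, mutable.HashMap.empty[String, Int])
    words(word) = words.getOrElse(word, 0) + count
  }

  // Look up a frequency; unseen entities or words yield 0.
  def freq(entity: Int, word: String): Int =
    entToWordFreq.get(entity).flatMap(_.get(word)).getOrElse(0)

  // Number of stored (entity, word) pairs, which is what drives memory use.
  def numPairs: Long = entToWordFreq.valuesIterator.map(_.size.toLong).sum

  def main(args: Array[String]): Unit = {
    add(1, "zurich", 3)
    add(1, "zurich", 2)
    add(2, "lake", 1)
    println(freq(1, "zurich")) // 5
    println(freq(2, "zurich")) // 0
    println(numPairs)          // 2
  }
}
```

Since each stored pair costs heap space for the key, the boxed count, and the hash table entry, a table with all pairs from Wikipedia grows large quickly, which is consistent with the estimate of a few tens of GB above. The heap limit is already set by the -J-Xmx90g flag in the command from the question.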