dennybritz / deepdive


Example issue: Java heap space... #68

Closed zifeishan closed 10 years ago

zifeishan commented 10 years ago

I got an "OutOfMemoryError" when using the nlp_extractor on my Mac. Nothing in the Walkthrough indicates how to fix this.

20:39:06.675 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

Full log:


spouse_example (master) $ ./run.sh
[info] Loading project definition from /Users/Robin/Documents/repos/research/deepdive/project
[info] Set current project to deepdive (in build file:/Users/Robin/Documents/repos/research/deepdive/)
[info] Running org.deepdive.Main -c /Users/Robin/Documents/repos/research/deepdive/app/spouse_example/application.conf
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/Robin/Documents/repos/research/deepdive/lib/sampler-assembly-0.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/Robin/.ivy2/cache/ch.qos.logback/logback-classic/jars/logback-classic-1.0.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
20:38:47.479 [][][Slf4jLogger] INFO  Slf4jLogger started
20:38:47.499 [run-main-0][EventStream(akka://deepdive)][EventStream] DEBUG logger log1-Slf4jLogger started
20:38:47.501 [run-main-0][EventStream(akka://deepdive)][EventStream] DEBUG Default Loggers started
20:38:47.539 [run-main-0][Main$(akka://deepdive)][Main$] INFO  Running pipeline with configuration from /Users/Robin/Documents/repos/research/deepdive/app/spouse_example/application.conf
20:38:47.676 [run-main-0][JdbcDataStore$(akka://deepdive)][JdbcDataStore$] INFO  Intializing all JDBC data stores
20:38:47.902 [][][ConnectionPool$] DEBUG Registered connection pool : ConnectionPool(url:jdbc:postgresql://127.0.0.1/deepdive_spouse, user:Robin)
20:38:47.921 [default-dispatcher-2][profiler][Profiler] INFO  starting at akka://deepdive/user/profiler
20:38:47.922 [default-dispatcher-4][taskManager][TaskManager] INFO  starting at akka://deepdive/user/taskManager
20:38:47.956 [default-dispatcher-3][inferenceManager][InferenceManager$PostgresInferenceManager] INFO  Starting
20:38:47.975 [default-dispatcher-4][extractionManager][ExtractionManager$PostgresExtractionManager] INFO  starting
20:38:47.979 [default-dispatcher-5][factorGraphBuilder][FactorGraphBuilder$PostgresFactorGraphBuilder] INFO  Starting
20:38:47.993 [run-main-0][DeepDive$(akka://deepdive)][DeepDive$] INFO  Running pipeline=_default with tasks=List(ext_sentences, inference, calibration, report, shutdown)
20:38:48.002 [default-dispatcher-2][taskManager][TaskManager] INFO  Added task_id=ext_sentences
20:38:48.006 [default-dispatcher-2][taskManager][TaskManager] INFO  1/1 tasks eligible.
20:38:48.009 [default-dispatcher-2][taskManager][TaskManager] INFO  Tasks not_eligible: Set()
20:38:48.016 [default-dispatcher-2][taskManager][TaskManager] DEBUG Sending task_id=ext_sentences to Actor[akka://deepdive/user/extractionManager#1948892930]
20:38:48.023 [default-dispatcher-5][profiler][Profiler] DEBUG starting report_id=ext_sentences
20:38:48.024 [default-dispatcher-5][extractionManager][ExtractionManager$PostgresExtractionManager] INFO  Adding task_name=ext_sentences
20:38:48.028 [default-dispatcher-2][taskManager][TaskManager] INFO  Added task_id=inference
20:38:48.029 [default-dispatcher-2][taskManager][TaskManager] INFO  0/1 tasks eligible.
20:38:48.030 [default-dispatcher-2][taskManager][TaskManager] INFO  Tasks not_eligible: Set(inference)
20:38:48.031 [default-dispatcher-2][taskManager][TaskManager] INFO  Added task_id=calibration
20:38:48.032 [default-dispatcher-2][taskManager][TaskManager] INFO  0/2 tasks eligible.
20:38:48.038 [default-dispatcher-2][taskManager][TaskManager] INFO  Tasks not_eligible: Set(inference, calibration)
20:38:48.040 [default-dispatcher-2][taskManager][TaskManager] INFO  Added task_id=report
20:38:48.041 [default-dispatcher-2][taskManager][TaskManager] INFO  0/3 tasks eligible.
20:38:48.042 [default-dispatcher-2][taskManager][TaskManager] INFO  Tasks not_eligible: Set(inference, report, calibration)
20:38:48.043 [default-dispatcher-2][taskManager][TaskManager] INFO  Added task_id=shutdown
20:38:48.045 [default-dispatcher-2][taskManager][TaskManager] INFO  0/4 tasks eligible.
20:38:48.046 [default-dispatcher-2][taskManager][TaskManager] INFO  Tasks not_eligible: Set(shutdown, inference, report, calibration)
20:38:48.047 [default-dispatcher-5][extractionManager][ExtractionManager$PostgresExtractionManager] INFO  executing extractorName=ext_sentences
20:38:48.097 [][][ConnectionPool$] DEBUG Borrowed a new connection from ConnectionPool(url:jdbc:postgresql://127.0.0.1/deepdive_spouse, user:Robin)
20:38:48.107 [default-dispatcher-4][extractorRunner-ext_sentences][ExtractorRunner] INFO  waiting for task
20:38:48.126 [default-dispatcher-4][extractorRunner-ext_sentences][ExtractorRunner] INFO  Received task=ext_sentences. Executing
20:38:48.127 [default-dispatcher-4][extractorRunner-ext_sentences][ExtractorRunner] INFO  Executing before script.
20:38:48.128 [default-dispatcher-4][extractorRunner-ext_sentences][ExtractorRunner] INFO  Executing: "/Users/Robin/Documents/repos/research/deepdive/app/spouse_example/udf/before_sentences.sh"
20:38:48.245 [Thread-5][extractorRunner-ext_sentences][ExtractorRunner] INFO  NOTICE:  truncate cascades to table "people_mentions"
20:38:48.298 [Thread-4][extractorRunner-ext_sentences][ExtractorRunner] INFO  TRUNCATE TABLE
20:38:48.299 [default-dispatcher-4][extractorRunner-ext_sentences][ExtractorRunner] INFO  Starting 1 children process workers
20:38:48.370 [default-dispatcher-5][processExecutor1][ProcessExecutor] INFO  started
20:38:48.389 [default-dispatcher-5][processExecutor1][ProcessExecutor] INFO  starting process with cmd="/Users/Robin/Documents/repos/research/deepdive/app/spouse_example/udf/nlp_extractor/run.sh -k articles.id -v articles.text -l 20" and batch_size=50000
20:38:48.429 [default-dispatcher-6][extractorRunner-ext_sentences][ExtractorRunner] INFO  Getting data from the data store and sending it to the workers. query='DatastoreInputQuery(SELECT * FROM articles order by id asc limit 100)'
20:38:48.512 [][][ConnectionPool$] DEBUG Borrowed a new connection from ConnectionPool(url:jdbc:postgresql://127.0.0.1/deepdive_spouse, user:Robin)
20:38:49.751 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing with id_key="articles.id" value_key="articles.text" max_len=20 numThreads=4
20:38:49.941 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator tokenize
20:38:49.955 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator cleanxml
20:38:50.015 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator ssplit
20:38:50.022 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator pos
20:38:53.886 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [3.8 sec].
20:38:53.886 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator lemma
20:38:53.888 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator ner
20:39:06.675 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
20:39:06.675 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.util.Arrays.copyOfRange(Arrays.java:3209)
20:39:06.676 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.lang.String.<init>(String.java:215)
20:39:06.676 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.lang.StringBuilder.toString(StringBuilder.java:430)
20:39:06.677 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3047)
20:39:06.678 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2843)
20:39:06.678 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectInputStream.readString(ObjectInputStream.java:1617)
20:39:06.679 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1320)
20:39:06.679 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
20:39:06.680 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.util.HashMap.readObject(HashMap.java:1029)
20:39:06.680 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
20:39:06.681 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
20:39:06.681 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
20:39:06.681 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.lang.reflect.Method.invoke(Method.java:597)
20:39:06.682 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:979)
20:39:06.682 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1873)
20:39:06.683 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
20:39:06.683 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
20:39:06.683 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1970)
20:39:06.684 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1895)
20:39:06.684 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1777)
20:39:06.685 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
20:39:06.685 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:349)
20:39:06.686 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at edu.stanford.nlp.ie.crf.CRFClassifier.loadClassifier(CRFClassifier.java:2601)
20:39:06.686 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1622)
20:39:06.687 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1677)
20:39:06.687 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1664)
20:39:06.687 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at edu.stanford.nlp.ie.crf.CRFClassifier.getClassifier(CRFClassifier.java:2832)
20:39:06.688 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifierFromPath(ClassifierCombiner.java:189)
20:39:06.688 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at edu.stanford.nlp.ie.ClassifierCombiner.loadClassifiers(ClassifierCombiner.java:173)
20:39:06.689 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at edu.stanford.nlp.ie.ClassifierCombiner.<init>(ClassifierCombiner.java:113)
20:39:06.689 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at edu.stanford.nlp.ie.NERClassifierCombiner.<init>(NERClassifierCombiner.java:64)
20:39:06.690 [Thread-8][processExecutor1][ProcessExecutor] DEBUG    at edu.stanford.nlp.pipeline.StanfordCoreNLP$6.create(StanfordCoreNLP.java:624)
20:39:07.105 [Thread-7][processExecutor1][ProcessExecutor] DEBUG closing output stream
20:39:07.108 [default-dispatcher-3][processExecutor1][ProcessExecutor] INFO  process exited with exit_value=1
20:39:07.117 [default-dispatcher-4][profiler][Profiler] DEBUG ending report_id=ext_sentences
20:39:07.118 [default-dispatcher-7][taskManager][TaskManager] INFO  Completed task_id=ext_sentences with Failure(java.lang.RuntimeException: process exited with exit_code=1)
20:39:07.130 [default-dispatcher-7][taskManager][TaskManager] ERROR task=ext_sentences Failed: java.lang.RuntimeException: process exited with exit_code=1
20:39:07.131 [default-dispatcher-7][taskManager][TaskManager] ERROR Forcing shutdown
20:39:07.146 [default-dispatcher-7][taskManager][TaskManager] ERROR Cancelling task=inference
20:39:07.147 [default-dispatcher-7][taskManager][TaskManager] INFO  1/3 tasks eligible.
20:39:07.147 [default-dispatcher-7][taskManager][TaskManager] INFO  Tasks not_eligible: Set(shutdown, report)
20:39:07.148 [default-dispatcher-7][taskManager][TaskManager] DEBUG Sending task_id=calibration to Actor[akka://deepdive/user/inferenceManager#1095405382]
20:39:07.152 [default-dispatcher-4][profiler][Profiler] DEBUG starting report_id=calibration
20:39:07.155 [default-dispatcher-3][processExecutor1][LocalActorRef] INFO  Message [akka.dispatch.sysmsg.Terminate] from Actor[akka://deepdive/user/extractionManager/extractorRunner-ext_sentences/processExecutor1#-161277400] to Actor[akka://deepdive/user/extractionManager/extractorRunner-ext_sentences/processExecutor1#-161277400] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
20:39:07.156 [default-dispatcher-2][inferenceManager][InferenceManager$PostgresInferenceManager] INFO  writing calibration data
20:39:07.156 [default-dispatcher-6][extractorRunner-ext_sentences][ExtractorRunner] DEBUG all data was sent to workers.
20:39:07.157 [default-dispatcher-3][processExecutor1][LocalActorRef] INFO  Message [org.deepdive.extraction.ProcessExecutor$CloseInputStream$] from Actor[akka://deepdive/user/extractionManager/extractorRunner-ext_sentences#-524478237] to Actor[akka://deepdive/user/extractionManager/extractorRunner-ext_sentences/processExecutor1#-161277400] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
20:39:07.160 [default-dispatcher-5][extractorRunner-ext_sentences][LocalActorRef] INFO  Message [akka.dispatch.sysmsg.DeathWatchNotification] from Actor[akka://deepdive/user/extractionManager/extractorRunner-ext_sentences#-524478237] to Actor[akka://deepdive/user/extractionManager/extractorRunner-ext_sentences#-524478237] was not delivered. [3] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
20:39:07.162 [default-dispatcher-10][$a][CalibrationDataWriter] INFO  starting
20:39:07.164 [default-dispatcher-10][profiler][Profiler] DEBUG ending report_id=calibration
20:39:07.165 [default-dispatcher-5][taskManager][TaskManager] INFO  Completed task_id=calibration with Success(Set())
20:39:07.165 [default-dispatcher-5][taskManager][TaskManager] INFO  1/2 tasks eligible.
20:39:07.166 [default-dispatcher-5][taskManager][TaskManager] INFO  Tasks not_eligible: Set(shutdown)
20:39:07.166 [default-dispatcher-5][taskManager][TaskManager] DEBUG Sending task_id=report to Actor[akka://deepdive/user/profiler#1407730446]
20:39:07.166 [default-dispatcher-5][profiler][Profiler] DEBUG starting report_id=report
20:39:07.167 [default-dispatcher-5][profiler][Profiler] INFO  --------------------------------------------------
20:39:07.167 [default-dispatcher-5][profiler][Profiler] INFO  Summary Report
20:39:07.168 [default-dispatcher-5][profiler][Profiler] INFO  --------------------------------------------------
20:39:07.169 [default-dispatcher-5][profiler][Profiler] INFO  ext_sentences FAILURE [19103 ms]
20:39:07.170 [default-dispatcher-5][profiler][Profiler] INFO  calibration SUCCESS [10 ms]
20:39:07.170 [default-dispatcher-5][profiler][Profiler] INFO  --------------------------------------------------
20:39:07.171 [default-dispatcher-2][profiler][Profiler] DEBUG ending report_id=report
20:39:07.172 [default-dispatcher-5][taskManager][TaskManager] INFO  Completed task_id=report with Success(Success(()))
20:39:07.172 [default-dispatcher-5][taskManager][TaskManager] INFO  1/1 tasks eligible.
20:39:07.173 [default-dispatcher-5][taskManager][TaskManager] INFO  Tasks not_eligible: Set()
20:39:07.173 [default-dispatcher-5][taskManager][TaskManager] DEBUG Sending task_id=shutdown to Actor[akka://deepdive/user/taskManager#-1476724719]
20:39:07.174 [default-dispatcher-4][profiler][Profiler] DEBUG starting report_id=shutdown
20:39:07.175 [default-dispatcher-2][taskManager][RepointableActorRef] INFO  Message [akka.dispatch.sysmsg.Terminate] from Actor[akka://deepdive/user/taskManager#-1476724719] to Actor[akka://deepdive/user/taskManager#-1476724719] was not delivered. [4] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
20:39:07.175 [default-dispatcher-5][$a][LocalActorRef] INFO  Message [akka.dispatch.sysmsg.Terminate] from Actor[akka://deepdive/user/inferenceManager/$a#1307832078] to Actor[akka://deepdive/user/inferenceManager/$a#1307832078] was not delivered. [5] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
20:39:07.179 [default-dispatcher-6][EventStream][EventStream] DEBUG shutting down: StandardOutLogger started
feiranwang commented 10 years ago

Seems the data is too large? I limited the input to 50 documents in sentence extraction, and it worked. I got the same error when using the full data.

dennybritz commented 10 years ago
DEBUG Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

The process runs out of memory when loading the classifiers. I guess you'll need to increase the heap space (-Xmx) in the run script for the NLP extractor (not the run script of the application). It wasn't necessary for me...

zifeishan commented 10 years ago

Doesn't work for me...


Zifei Shan M.S. student in Computer Science, Stanford University (2015) B.S. in Computer Science, Peking University (2013)

zhangce commented 10 years ago

Which Java version? It sounds weird, but older Java does use more memory...

Ce


dennybritz commented 10 years ago

How much memory do you have on your machine? By default, Java uses some fraction of the total available memory. I have 8GB; if you have less, maybe that's why it wasn't necessary for me...
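As a rough illustration of that "fraction" (a sketch only: the exact default depends on the JVM version and machine class; `java -XX:+PrintFlagsFinal -version | grep MaxHeapSize` shows the real value on a given box):

```shell
# Rough sketch: many JVMs default the max heap to about 1/4 of physical RAM
# (exact rules vary by JVM version and -server/-client mode).
phys_gb=4                                   # hypothetical 4 GB machine
default_heap_mb=$(( phys_gb * 1024 / 4 ))
echo "Approximate default max heap: ${default_heap_mb} MB"
```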

zifeishan commented 10 years ago

I think so, but I have no idea where to specify this -Xmx argument... There's no explicit call to Java in run.sh.

$ ls udf/nlp_extractor/
README.md build.sbt project   run.sh    src       target


zifeishan commented 10 years ago

I have 4GB on my machine. Feiran, how about you?


dennybritz commented 10 years ago

export JAVA_OPTS="-Xmx4g"
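Applied like this (a minimal sketch; the path is hypothetical, and it assumes the extractor's sbt launcher picks up the `JAVA_OPTS` environment variable, which is why no explicit `java` call is needed):

```shell
# Sketch: raise the JVM max heap for the nlp_extractor before its
# sbt-based run.sh starts the JVM.
export JAVA_OPTS="-Xmx4g"                 # allow up to 4 GB of JVM heap
echo "JVM options: $JAVA_OPTS"
# cd udf/nlp_extractor && ./run.sh ...    # then launch the extractor as usual
```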

dennybritz commented 10 years ago

Maybe check how low you can go, e.g. 2g or 1g, and see which one still works...
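One way to script that trial (a sketch: `try_extractor` is a hypothetical stand-in for launching the real extractor, and the "needed" figure is made up for illustration):

```shell
# Try decreasing heap caps and report which still succeed.
# try_extractor is a hypothetical stand-in for something like:
#   JAVA_OPTS="-Xmx$1" ./udf/nlp_extractor/run.sh ...
try_extractor() {
  needed_mb=1500                       # pretend the job needs ~1.5 GB of heap
  limit_mb=$(( ${1%g} * 1024 ))        # "4g" -> 4096 MB
  [ "$limit_mb" -ge "$needed_mb" ]
}

for heap in 4g 2g 1g; do
  if try_extractor "$heap"; then
    echo "-Xmx$heap still works"
  else
    echo "-Xmx$heap fails"
  fi
done
```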

feiranwang commented 10 years ago

@zifeishan I only have 2GB... did you remove the tables and pipelines before re-running the NLP extractor?

feiranwang commented 10 years ago

@zifeishan What if you set the limit to 10...

zifeishan commented 10 years ago

I did, but the problem is that I just cannot start the nlp_extractor.

It works for me when I do export JAVA_OPTS="-Xmx4g" in run.sh.

I'll check how low it can go...


zifeishan commented 10 years ago

The -Xmx4g setting works with a limit of 10 documents, but errors occur with 50 documents.

Full log:

spouse_example (master) $ ./run.sh
[info] Loading project definition from /Users/Robin/Documents/repos/research/deepdive/project
[info] Set current project to deepdive (in build file:/Users/Robin/Documents/repos/research/deepdive/)
[info] Running org.deepdive.Main -c /Users/Robin/Documents/repos/research/deepdive/app/spouse_example/application.conf
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/Robin/Documents/repos/research/deepdive/lib/sampler-assembly-0.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/Robin/.ivy2/cache/ch.qos.logback/logback-classic/jars/logback-classic-1.0.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
21:09:22.047 [][][Slf4jLogger] INFO  Slf4jLogger started
21:09:22.072 [run-main-0][EventStream(akka://deepdive)][EventStream] DEBUG logger log1-Slf4jLogger started
21:09:22.074 [run-main-0][EventStream(akka://deepdive)][EventStream] DEBUG Default Loggers started
21:09:22.088 [run-main-0][Main$(akka://deepdive)][Main$] INFO  Running pipeline with configuration from /Users/Robin/Documents/repos/research/deepdive/app/spouse_example/application.conf
21:09:22.214 [run-main-0][JdbcDataStore$(akka://deepdive)][JdbcDataStore$] INFO  Intializing all JDBC data stores
21:09:22.390 [][][ConnectionPool$] DEBUG Registered connection pool : ConnectionPool(url:jdbc:postgresql://127.0.0.1/deepdive_spouse, user:Robin)
21:09:22.408 [default-dispatcher-4][taskManager][TaskManager] INFO  starting at akka://deepdive/user/taskManager
21:09:22.408 [default-dispatcher-2][profiler][Profiler] INFO  starting at akka://deepdive/user/profiler
21:09:22.428 [default-dispatcher-4][inferenceManager][InferenceManager$PostgresInferenceManager] INFO  Starting
21:09:22.452 [default-dispatcher-2][extractionManager][ExtractionManager$PostgresExtractionManager] INFO  starting
21:09:22.454 [default-dispatcher-3][factorGraphBuilder][FactorGraphBuilder$PostgresFactorGraphBuilder] INFO  Starting
21:09:22.475 [run-main-0][DeepDive$(akka://deepdive)][DeepDive$] INFO  Running pipeline=_default with tasks=List(ext_people, ext_sentences, inference, calibration, report, shutdown)
21:09:22.481 [default-dispatcher-3][taskManager][TaskManager] INFO  Added task_id=ext_people
21:09:22.486 [default-dispatcher-3][taskManager][TaskManager] INFO  0/1 tasks eligible.
21:09:22.488 [default-dispatcher-3][taskManager][TaskManager] INFO  Tasks not_eligible: Set(ext_people)
21:09:22.491 [default-dispatcher-3][taskManager][TaskManager] INFO  Added task_id=ext_sentences
21:09:22.492 [default-dispatcher-3][taskManager][TaskManager] INFO  1/2 tasks eligible.
21:09:22.493 [default-dispatcher-3][taskManager][TaskManager] INFO  Tasks not_eligible: Set(ext_people)
21:09:22.494 [default-dispatcher-3][taskManager][TaskManager] DEBUG Sending task_id=ext_sentences to Actor[akka://deepdive/user/extractionManager#1773597743]
21:09:22.500 [default-dispatcher-5][extractionManager][ExtractionManager$PostgresExtractionManager] INFO  Adding task_name=ext_sentences
21:09:22.503 [default-dispatcher-3][taskManager][TaskManager] INFO  Added task_id=inference
21:09:22.504 [default-dispatcher-3][taskManager][TaskManager] INFO  0/2 tasks eligible.
21:09:22.505 [default-dispatcher-3][taskManager][TaskManager] INFO  Tasks not_eligible: Set(ext_people, inference)
21:09:22.506 [default-dispatcher-3][taskManager][TaskManager] INFO  Added task_id=calibration
21:09:22.507 [default-dispatcher-3][taskManager][TaskManager] INFO  0/3 tasks eligible.
21:09:22.510 [default-dispatcher-3][taskManager][TaskManager] INFO  Tasks not_eligible: Set(ext_people, inference, calibration)
21:09:22.512 [default-dispatcher-3][taskManager][TaskManager] INFO  Added task_id=report
21:09:22.513 [default-dispatcher-3][taskManager][TaskManager] INFO  0/4 tasks eligible.
21:09:22.514 [default-dispatcher-3][taskManager][TaskManager] INFO  Tasks not_eligible: Set(ext_people, inference, report, calibration)
21:09:22.515 [default-dispatcher-3][taskManager][TaskManager] INFO  Added task_id=shutdown
21:09:22.516 [default-dispatcher-6][profiler][Profiler] DEBUG starting report_id=ext_sentences
21:09:22.517 [default-dispatcher-3][taskManager][TaskManager] INFO  0/5 tasks eligible.
21:09:22.519 [default-dispatcher-3][taskManager][TaskManager] INFO  Tasks not_eligible: Set(calibration, ext_people, inference, shutdown, report)
21:09:22.519 [default-dispatcher-5][extractionManager][ExtractionManager$PostgresExtractionManager] INFO  executing extractorName=ext_sentences
21:09:22.573 [][][ConnectionPool$] DEBUG Borrowed a new connection from ConnectionPool(url:jdbc:postgresql://127.0.0.1/deepdive_spouse, user:Robin)
21:09:22.577 [default-dispatcher-8][extractorRunner-ext_sentences][ExtractorRunner] INFO  waiting for task
21:09:22.591 [default-dispatcher-8][extractorRunner-ext_sentences][ExtractorRunner] INFO  Received task=ext_sentences. Executing
21:09:22.593 [default-dispatcher-8][extractorRunner-ext_sentences][ExtractorRunner] INFO  Executing before script.
21:09:22.593 [default-dispatcher-8][extractorRunner-ext_sentences][ExtractorRunner] INFO  Executing: "/Users/Robin/Documents/repos/research/deepdive/app/spouse_example/udf/before_sentences.sh"
21:09:22.666 [Thread-5][extractorRunner-ext_sentences][ExtractorRunner] INFO  NOTICE:  truncate cascades to table "people_mentions"
21:09:22.667 [Thread-5][extractorRunner-ext_sentences][ExtractorRunner] INFO  NOTICE:  truncate cascades to table "has_spouse"
21:09:22.681 [Thread-4][extractorRunner-ext_sentences][ExtractorRunner] INFO  TRUNCATE TABLE
21:09:22.682 [default-dispatcher-8][extractorRunner-ext_sentences][ExtractorRunner] INFO  Starting 1 children process workers
21:09:22.715 [default-dispatcher-5][processExecutor1][ProcessExecutor] INFO  started
21:09:22.718 [default-dispatcher-5][processExecutor1][ProcessExecutor] INFO  starting process with cmd="/Users/Robin/Documents/repos/research/deepdive/app/spouse_example/udf/nlp_extractor/run.sh -k articles.id -v articles.text -l 20" and batch_size=50000
21:09:22.742 [default-dispatcher-6][extractorRunner-ext_sentences][ExtractorRunner] INFO  Getting data from the data store and sending it to the workers. query='DatastoreInputQuery(SELECT * FROM articles order by id asc limit 50)'
21:09:22.777 [][][ConnectionPool$] DEBUG Borrowed a new connection from ConnectionPool(url:jdbc:postgresql://127.0.0.1/deepdive_spouse, user:Robin)
21:09:23.807 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing with id_key="articles.id" value_key="articles.text" max_len=20 numThreads=4
21:09:23.950 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator tokenize
21:09:23.960 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator cleanxml
21:09:24.017 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator ssplit
21:09:24.022 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator pos
21:09:25.855 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [1.8 sec].
21:09:25.856 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator lemma
21:09:25.857 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator ner
21:09:31.226 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [5.3 sec].
21:09:34.537 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [3.3 sec].
21:09:37.938 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [3.4 sec].
21:09:38.185 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Reading TokensRegex rules from edu/stanford/nlp/models/sutime/defs.sutime.txt
21:09:38.267 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.sutime.txt
21:09:39.078 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Feb 5, 2014 9:09:39 PM edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor appendRules
21:09:39.079 [Thread-8][processExecutor1][ProcessExecutor] DEBUG INFO: Ignoring inactive rule: null
21:09:39.080 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Feb 5, 2014 9:09:39 PM edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor appendRules
21:09:39.081 [Thread-8][processExecutor1][ProcessExecutor] DEBUG INFO: Ignoring inactive rule: temporal-composite-8:ranges
21:09:39.082 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.holidays.sutime.txt
21:09:39.090 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Initializing JollyDayHoliday for sutime with classpath:edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml
21:09:39.437 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Reading TokensRegex rules from edu/stanford/nlp/models/sutime/defs.sutime.txt
21:09:39.471 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.sutime.txt
21:09:39.625 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Feb 5, 2014 9:09:39 PM edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor appendRules
21:09:39.625 [Thread-8][processExecutor1][ProcessExecutor] DEBUG INFO: Ignoring inactive rule: null
21:09:39.626 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Feb 5, 2014 9:09:39 PM edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor appendRules
21:09:39.626 [Thread-8][processExecutor1][ProcessExecutor] DEBUG INFO: Ignoring inactive rule: temporal-composite-8:ranges
21:09:39.627 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.holidays.sutime.txt
21:09:39.633 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator parse
21:09:41.041 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [1.4 sec].
21:09:41.042 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Adding annotator dcoref
21:09:57.579 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 279...
21:10:06.032 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 345...
21:10:09.309 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 390...
21:10:16.884 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 766...
21:10:22.503 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 839...
21:10:25.429 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 1001...
21:10:26.858 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 1145...
21:10:30.127 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 1316...
21:10:33.188 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 1387...
21:10:36.658 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 1405...
21:10:39.119 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 1462...
21:10:41.326 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 1491...
21:10:44.409 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 1556...
21:10:48.641 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 1686...
21:10:52.618 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 1882...
21:10:54.914 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 1938...
21:10:57.052 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 1986...
21:10:59.707 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 2131...
21:11:02.265 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 2132...
21:11:06.721 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 2190...
21:11:11.203 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 2293...
21:11:14.589 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 2573...
21:11:17.985 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 3410...
21:11:19.028 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 3468...
21:11:20.999 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 3704...
21:11:24.806 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 3720...
21:11:28.044 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 3845...
21:11:31.314 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 3856...
21:11:35.169 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 3921...
21:11:37.665 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 4093...
21:11:39.243 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 4285...
21:11:41.539 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 4301...
21:11:43.376 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 4369...
21:11:47.494 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 4434...
21:11:48.422 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 4484...
21:11:52.329 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 5602...
21:11:53.996 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 5814...
21:11:55.493 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 6854...
21:11:55.536 [default-dispatcher-6][extractorRunner-ext_sentences][ExtractorRunner] DEBUG all data was sent to workers.
21:11:55.544 [default-dispatcher-6][processExecutor1][ProcessExecutor] DEBUG closing input stream
21:11:57.355 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 7145...
21:11:59.539 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 7706...
21:12:01.223 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 7968...
21:12:05.504 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 8800...
21:12:08.509 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 8819...
21:12:12.793 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 9092...
21:12:18.730 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 9130...
21:12:27.059 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 9170...
21:12:27.282 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 9287...
21:12:32.074 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 9812...
21:12:35.667 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 10161...
21:12:38.906 [Thread-8][processExecutor1][ProcessExecutor] DEBUG Parsing document 10292...
21:12:41.506 [default-dispatcher-5][extractorRunner-ext_sentences][ExtractorRunner] DEBUG adding chunk of size=2288 data store.
21:12:42.390 [default-dispatcher-5][PostgresExtractionDataStoreComponent$PostgresExtractionDataStore(akka://deepdive)][PostgresExtractionDataStoreComponent$PostgresExtractionDataStore] INFO  Writing data of to file=/private/var/folders/3p/dw7y60t50s579qmxlsl0rb6r0000gn/T/deepdive_sentences6164545240943149796.csv
21:12:43.539 [default-dispatcher-5][PostgresExtractionDataStoreComponent$PostgresExtractionDataStore(akka://deepdive)][PostgresExtractionDataStoreComponent$PostgresExtractionDataStore] INFO  Copying batch data to postgres. sql='COPY sentences(dependencies, document_id, ner_tags, pos_tags, sentence, words) FROM STDIN CSV'file='/private/var/folders/3p/dw7y60t50s579qmxlsl0rb6r0000gn/T/deepdive_sentences6164545240943149796.csv'
21:12:43.540 [][][ConnectionPool$] DEBUG Borrowed a new connection from ConnectionPool(url:jdbc:postgresql://127.0.0.1/deepdive_spouse, user:Robin)
21:12:43.848 [default-dispatcher-5][PostgresExtractionDataStoreComponent$PostgresExtractionDataStore(akka://deepdive)][PostgresExtractionDataStoreComponent$PostgresExtractionDataStore] INFO  Successfully copied batch data to postgres.
21:12:43.849 [Thread-7][processExecutor1][ProcessExecutor] DEBUG closing output stream
21:12:43.851 [default-dispatcher-5][processExecutor1][ProcessExecutor] INFO  process exited with exit_value=0
21:12:43.865 [default-dispatcher-8][extractorRunner-ext_sentences][ExtractorRunner] DEBUG worker=processExecutor1 has terminated. Waiting for 0 others.
21:12:43.865 [default-dispatcher-8][extractorRunner-ext_sentences][ExtractorRunner] INFO  All workers are done. Finishing up.
21:12:43.869 [default-dispatcher-8][extractorRunner-ext_sentences][ExtractorRunner] INFO  Shutting down
21:12:43.873 [default-dispatcher-4][profiler][Profiler] DEBUG ending report_id=ext_sentences
21:12:43.874 [default-dispatcher-5][taskManager][TaskManager] INFO  Completed task_id=ext_sentences with Success(Done!)
21:12:43.874 [default-dispatcher-5][taskManager][TaskManager] INFO  1/5 tasks eligible.
21:12:43.875 [default-dispatcher-5][taskManager][TaskManager] INFO  Tasks not_eligible: Set(shutdown, inference, report, calibration)
21:12:43.875 [default-dispatcher-5][taskManager][TaskManager] DEBUG Sending task_id=ext_people to Actor[akka://deepdive/user/extractionManager#1773597743]
21:12:43.876 [default-dispatcher-5][extractionManager][ExtractionManager$PostgresExtractionManager] INFO  Adding task_name=ext_people
21:12:43.876 [default-dispatcher-5][extractionManager][ExtractionManager$PostgresExtractionManager] INFO  executing extractorName=ext_people
21:12:43.877 [default-dispatcher-5][extractorRunner-ext_people][ExtractorRunner] INFO  waiting for task
21:12:43.877 [default-dispatcher-5][extractorRunner-ext_people][ExtractorRunner] INFO  Received task=ext_people. Executing
21:12:43.881 [default-dispatcher-5][extractorRunner-ext_people][ExtractorRunner] INFO  Executing before script.
21:12:43.885 [default-dispatcher-5][extractorRunner-ext_people][ExtractorRunner] INFO  Executing: "/Users/Robin/Documents/repos/research/deepdive/app/spouse_example/udf/before_people.sh"
21:12:43.887 [default-dispatcher-4][profiler][Profiler] DEBUG starting report_id=ext_people
21:12:43.957 [Thread-11][extractorRunner-ext_people][ExtractorRunner] INFO  NOTICE:  truncate cascades to table "has_spouse"
21:12:43.968 [Thread-10][extractorRunner-ext_people][ExtractorRunner] INFO  TRUNCATE TABLE
21:12:43.969 [default-dispatcher-5][extractorRunner-ext_people][ExtractorRunner] INFO  Starting 1 children process workers
21:12:43.969 [][][ConnectionPool$] DEBUG Borrowed a new connection from ConnectionPool(url:jdbc:postgresql://127.0.0.1/deepdive_spouse, user:Robin)
21:12:43.970 [default-dispatcher-8][extractorRunner-ext_people][ExtractorRunner] INFO  Getting data from the data store and sending it to the workers. query='DatastoreInputQuery(SELECT * FROM sentences)'
21:12:43.970 [default-dispatcher-6][processExecutor1][ProcessExecutor] INFO  started
21:12:43.971 [default-dispatcher-6][processExecutor1][ProcessExecutor] INFO  starting process with cmd="/Users/Robin/Documents/repos/research/deepdive/app/spouse_example/udf/ext_people.py" and batch_size=50000
21:12:45.074 [Thread-14][processExecutor1][ProcessExecutor] DEBUG Traceback (most recent call last):
21:12:45.075 [Thread-14][processExecutor1][ProcessExecutor] DEBUG   File "/Users/Robin/Documents/repos/research/deepdive/app/spouse_example/udf/ext_people.py", line 9, in <module>
21:12:45.075 [Thread-14][processExecutor1][ProcessExecutor] DEBUG     sentence_obj = json.loads(row)
21:12:45.076 [Thread-14][processExecutor1][ProcessExecutor] DEBUG   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
21:12:45.076 [Thread-14][processExecutor1][ProcessExecutor] DEBUG     return _default_decoder.decode(s)
21:12:45.076 [Thread-14][processExecutor1][ProcessExecutor] DEBUG   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 365, in decode
21:12:45.077 [Thread-14][processExecutor1][ProcessExecutor] DEBUG     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
21:12:45.077 [Thread-14][processExecutor1][ProcessExecutor] DEBUG   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 381, in raw_decode
21:12:45.078 [Thread-14][processExecutor1][ProcessExecutor] DEBUG     obj, end = self.scan_once(s, idx)
21:12:45.078 [Thread-14][processExecutor1][ProcessExecutor] DEBUG UnicodeDecodeError: 'utf8' codec can't decode byte 0xca in position 13: invalid continuation byte
21:12:45.081 [default-dispatcher-6][extractorRunner-ext_people][ExtractorRunner] DEBUG adding chunk of size=1195 data store.
21:12:45.081 [default-dispatcher-8][extractorRunner-ext_people][ExtractorRunner] DEBUG all data was sent to workers.
21:12:45.082 [default-dispatcher-5][processExecutor1][ProcessExecutor] DEBUG closing input stream
21:12:45.139 [default-dispatcher-6][PostgresExtractionDataStoreComponent$PostgresExtractionDataStore(akka://deepdive)][PostgresExtractionDataStoreComponent$PostgresExtractionDataStore] INFO  Writing data of to file=/private/var/folders/3p/dw7y60t50s579qmxlsl0rb6r0000gn/T/deepdive_people_mentions3678455628136952209.csv
21:12:45.220 [][][ConnectionPool$] DEBUG Borrowed a new connection from ConnectionPool(url:jdbc:postgresql://127.0.0.1/deepdive_spouse, user:Robin)
21:12:45.221 [default-dispatcher-6][PostgresExtractionDataStoreComponent$PostgresExtractionDataStore(akka://deepdive)][PostgresExtractionDataStoreComponent$PostgresExtractionDataStore] INFO  Copying batch data to postgres. sql='COPY people_mentions(length, sentence_id, start_position, text) FROM STDIN CSV'file='/private/var/folders/3p/dw7y60t50s579qmxlsl0rb6r0000gn/T/deepdive_people_mentions3678455628136952209.csv'
21:12:45.290 [default-dispatcher-6][PostgresExtractionDataStoreComponent$PostgresExtractionDataStore(akka://deepdive)][PostgresExtractionDataStoreComponent$PostgresExtractionDataStore] INFO  Successfully copied batch data to postgres.
21:12:45.291 [Thread-13][processExecutor1][ProcessExecutor] DEBUG closing output stream
21:12:45.291 [default-dispatcher-6][processExecutor1][ProcessExecutor] INFO  process exited with exit_value=1
21:12:45.293 [default-dispatcher-2][profiler][Profiler] DEBUG ending report_id=ext_people
21:12:45.293 [default-dispatcher-6][taskManager][TaskManager] INFO  Completed task_id=ext_people with Failure(java.lang.RuntimeException: process exited with exit_code=1)
21:12:45.302 [default-dispatcher-6][taskManager][TaskManager] ERROR task=ext_people Failed: java.lang.RuntimeException: process exited with exit_code=1
21:12:45.303 [default-dispatcher-4][extractorRunner-ext_people][LocalActorRef] INFO  Message [akka.actor.Terminated] from Actor[akka://deepdive/user/extractionManager/extractorRunner-ext_people/processExecutor1#1602433297] to Actor[akka://deepdive/user/extractionManager/extractorRunner-ext_people#-520850708] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
21:12:45.303 [default-dispatcher-6][taskManager][TaskManager] ERROR Forcing shutdown
21:12:45.308 [default-dispatcher-6][taskManager][TaskManager] ERROR Cancelling task=calibration
21:12:45.309 [default-dispatcher-6][taskManager][TaskManager] ERROR Cancelling task=inference
21:12:45.309 [default-dispatcher-6][taskManager][TaskManager] INFO  1/2 tasks eligible.
21:12:45.310 [default-dispatcher-6][taskManager][TaskManager] INFO  Tasks not_eligible: Set(shutdown)
21:12:45.310 [default-dispatcher-6][taskManager][TaskManager] DEBUG Sending task_id=report to Actor[akka://deepdive/user/profiler#205768101]
21:12:45.311 [default-dispatcher-6][profiler][Profiler] DEBUG starting report_id=report
21:12:45.311 [default-dispatcher-6][profiler][Profiler] INFO  --------------------------------------------------
21:12:45.312 [default-dispatcher-6][profiler][Profiler] INFO  Summary Report
21:12:45.312 [default-dispatcher-6][profiler][Profiler] INFO  --------------------------------------------------
21:12:45.313 [default-dispatcher-6][profiler][Profiler] INFO  ext_sentences SUCCESS [201373 ms]
21:12:45.314 [default-dispatcher-6][profiler][Profiler] INFO  ext_people FAILURE [1406 ms]
21:12:45.315 [default-dispatcher-6][profiler][Profiler] INFO  --------------------------------------------------
21:12:45.315 [default-dispatcher-2][profiler][Profiler] DEBUG ending report_id=report
21:12:45.316 [default-dispatcher-4][taskManager][TaskManager] INFO  Completed task_id=report with Success(Success(()))
21:12:45.316 [default-dispatcher-4][taskManager][TaskManager] INFO  1/1 tasks eligible.
21:12:45.317 [default-dispatcher-4][taskManager][TaskManager] INFO  Tasks not_eligible: Set()
21:12:45.317 [default-dispatcher-4][taskManager][TaskManager] DEBUG Sending task_id=shutdown to Actor[akka://deepdive/user/taskManager#-460907103]
21:12:45.318 [default-dispatcher-6][profiler][Profiler] DEBUG starting report_id=shutdown
21:12:45.342 [default-dispatcher-7][taskManager][RepointableActorRef] INFO  Message [akka.dispatch.sysmsg.Terminate] from Actor[akka://deepdive/user/taskManager#-460907103] to Actor[akka://deepdive/user/taskManager#-460907103] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
21:12:45.350 [default-dispatcher-7][EventStream][EventStream] DEBUG shutting down: StandardOutLogger started
[success] Total time: 205 s, completed Feb 5, 2014 9:12:45 PM
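The second failure in the log above is unrelated to the heap issue: the traceback shows `ext_people.py` crashing inside `json.loads` on a byte (`0xca`) that is not valid UTF-8. A minimal sketch of one way to harden that parse step — `safe_loads` is a hypothetical helper, and the better fix may be cleaning the source documents before they reach the extractor:

```python
import json

def safe_loads(row):
    """Parse one JSON input line, substituting U+FFFD for any invalid
    UTF-8 bytes instead of raising UnicodeDecodeError.
    Hypothetical helper, not part of the DeepDive example code."""
    if isinstance(row, bytes):
        row = row.decode("utf-8", errors="replace")
    return json.loads(row)
```

Replacement characters in the output mean some input text was lost, so this is a stopgap for debugging rather than a data fix.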

On Feb 5, 2014, at 9:03 PM, Zifei Shan zifei@stanford.edu wrote:

I did, but the problem is that I just cannot start the nlp_extractor.

It works for me when I do `export JAVA_OPTS="-Xmx4g"` in run.sh.

I'll check how low it can go...
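For reference, the workaround above just raises the JVM's maximum heap before launching the pipeline — a sketch, using the 4g value reported in this thread (a smaller cap may also be enough):

```shell
# Raise the JVM max heap for the DeepDive run; 4g is what worked here,
# tune downward if memory is tight.
export JAVA_OPTS="-Xmx4g"
echo "$JAVA_OPTS"
# ./run.sh   # then re-run the pipeline with the larger heap
```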

On Feb 5, 2014, at 9:01 PM, Feiran Wang notifications@github.com wrote:

@zifeishan I only got 2g... did you remove the tables and pipelines before re-running nlp?


Zifei Shan
M.S. student in Computer Science, Stanford University (2015)
B.S. in Computer Science, Peking University (2013)


dennybritz commented 10 years ago

Not sure what happened; closing this for now.