Closed by bgyori 8 years ago
This appears to be an issue local to Ben's Python-Java setup. The text parses and returns results very quickly with any of the following:
1) the ReachShell: sbt 'run-main edu.arizona.sista.reach.ReachShell'
2) the Reach BioVisualizer web app: http://agathon.sista.arizona.edu:8080/odinweb/bio
3) the Reach API via curl from the command line:
curl -XPOST -F 'text=<bg-input' 'http://agathon:8080/odinweb/api/text' > output-bg.json
where the input text is in the file 'bg-input', and
4) the Scala example TextInJsonOut from the reach-examples GitHub repository:
sbt 'run-main com.yourorg.TextInJsonOut outfile-bg.json' < bg-input
Note that all of these programs run with at least 6 GB of memory available to the JVM, set either in the sbt build file (javaOptions += "-Xmx6G") or through Java flags in the environment (JAVA_OPTS='-server -Xms1024m -Xmx6144m').
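For anyone who wants to try option 3 from Python rather than curl, here is a minimal sketch using only the standard library. It assumes the endpoint accepts a single multipart form field named `text`, which is what the curl `-F 'text=<bg-input'` call above sends. The helper names `build_multipart` and `post_text` and the 120 second timeout are made up for illustration and are not part of Reach.

```python
import urllib.request
import uuid


def build_multipart(field, value, boundary):
    """Encode a single text field as a multipart/form-data body."""
    lines = [
        "--" + boundary,
        'Content-Disposition: form-data; name="%s"' % field,
        "",
        value,
        "--" + boundary + "--",
        "",
    ]
    return "\r\n".join(lines).encode("utf-8")


def post_text(text, url="http://agathon:8080/odinweb/api/text", timeout_s=120):
    """POST text to the API and return the raw response body as a string.

    timeout_s is a socket-level timeout (an assumed value), so a server that
    stops responding raises an error instead of blocking forever.
    """
    boundary = uuid.uuid4().hex
    body = build_multipart("text", text, boundary)
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "multipart/form-data; boundary=" + boundary},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.read().decode("utf-8")
```

Usage would look something like `open("output-bg.json", "w").write(post_text(open("bg-input").read()))`, with the host name adjusted to wherever the API is reachable.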
I'm running the parser locally on a text string as: ApiRuler.annotateText(text_string, 'fries'). This hangs on some inputs: memory usage stays stable while CPU usage stays consistently high.
An example input is the last 3 sentences of the PubMed 25338567 abstract:
Importantly, parsing each sentence independently goes through without hanging.
Some statistics on my desktop (32GB RAM, Debian): %MEM: 12, VIRT: 9973m, RES: 3.6g, %CPU: 106.4
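Since each sentence parses on its own, one way to narrow this down might be to run every contiguous group of sentences with a timeout and see which groups hang. The sketch below is only an illustration: `annotate` is a hypothetical callable standing in for however the parser is invoked (for example a wrapper around ApiRuler.annotateText), the sentence splitter is deliberately naive, and the 60 second timeout is an arbitrary choice.

```python
import re
import threading


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def run_with_timeout(func, arg, timeout_s):
    """Run func(arg) in a daemon thread and return (finished, result).

    A call that times out is not killed; it keeps running in the background
    until the interpreter exits. If func raises, finished is True and the
    result is None.
    """
    box = {}

    def target():
        box["result"] = func(arg)

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        return False, None
    return True, box.get("result")


def find_hanging_windows(text, annotate, timeout_s=60.0):
    """Try every contiguous run of sentences and report those that time out.

    Returns a list of (start, end) sentence index pairs, end exclusive.
    """
    sentences = split_sentences(text)
    hanging = []
    for start in range(len(sentences)):
        for end in range(start + 1, len(sentences) + 1):
            window = " ".join(sentences[start:end])
            finished, _ = run_with_timeout(annotate, window, timeout_s)
            if not finished:
                hanging.append((start, end))
    return hanging
```

With three sentences this makes six calls. Because timed-out calls keep consuming CPU, it is probably best run as a one-off script rather than inside a long-lived session.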