Closed Jyoti1009 closed 7 years ago
Can you give some more context (/flask code)?
Here is the flask code:
@app.route("/")
def default():
    if 'sentence' in request.args and request.args.get('sentence', '') != "" and request.args.get('getdate', '') == "True":
        text = request.args.get('sentence', '')
        return jsonify({"result": getdate.getDate(text)})
And here is the getdate.getDate(text) function (it comes after the lines of code that load the jar files):
def getDate(test_case):
    date_obj = "today"
    test_case = re.sub(r"[,-.;@#?!&$]+\ *", " ", test_case)
    result = sutime.parse(test_case)
    return result
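For reference, here is a small stand-alone sketch of what the re.sub call in getDate does to a sentence before SUTime sees it. The helper name normalize and the sample sentences are made up for illustration. Inside the character class, `,-.` is a range that covers `,`, `-` and `.`, so hyphens are stripped as well.

```python
import re

# Same pattern as in getDate(): a run of punctuation, plus any spaces that
# follow it, is collapsed into a single space.
PUNCTUATION = re.compile(r"[,-.;@#?!&$]+\ *")


def normalize(text):
    """Hypothetical helper mirroring the substitution in getDate()."""
    return PUNCTUATION.sub(" ", text)


print(repr(normalize("Meet me tomorrow, at 5 p.m.!")))
print(repr(normalize("The deadline is 2017-03-05")))
```

The second call turns 2017-03-05 into 2017 03 05, so it may be worth checking whether dates written with hyphens are still recognized after this step.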
@FraBle Hi again, I have been trying to diagnose the problem, and my guess is that the JRE is crashing because of a memory error. That would mean Flask somehow limits the memory available to the JVM to less than it needs, since the same code runs fine outside of Flask. I tried several ways to increase the JVM's memory, but none of them helped. Could you suggest a fix that might work here? I am really stuck at this point. Thanks in advance.
The following code is running fine for me:
import os
import re

from flask import Flask, request
from flask.json import jsonify
from sutime import SUTime

app = Flask(__name__)
_jar_files = os.path.join(os.path.dirname(__file__), 'jars')
SUTIME = SUTime(jars=_jar_files, mark_time_ranges=True)


def get_date(text):
    text = re.sub(r'[,-.;@#?!&$]+\ *', ' ', text)
    result = SUTIME.parse(text)
    return result


@app.route("/")
def default():
    if request.args.get('sentence', '') and request.args.get('getdate', '') == 'True':
        text = request.args.get('sentence', '')
        return jsonify({'result': get_date(text)})
    else:
        return jsonify({'error': 'necessary parameters missing'})


if __name__ == '__main__':
    app.run()
Can you give me more context on your environment and system? I'm running the test with Python 2.7.13, Java 1.8.0_121, MacBook Pro (15-inch, 2016), 16GB RAM, macOS Sierra
The following code is kinda ugly but simulates memory restrictions:
import os
import re
import threading
import socket

from flask import Flask, request
from flask.json import jsonify
import jpype
from sutime import SUTime

socket.setdefaulttimeout(15)


def create_classpath(path):
    # Collect every .jar below `path` into one classpath string.
    jars = []
    for top, dirs, files in os.walk(path):
        for file_name in files:
            if file_name.endswith('.jar'):
                jars.append(os.path.join(top, file_name))
    return os.pathsep.join(jars)


app = Flask(__name__)
LOCK = threading.Lock()
JAR_FILES = os.path.join(os.path.dirname(__file__), 'jars')
CLASSPATH = create_classpath(JAR_FILES)
# Deliberately small heap to simulate memory restrictions.
MINIMUM_HEAP_SIZE = '128m'
MAXIMUM_HEAP_SIZE = '512m'


def start_jvm(minimum_heap_size, maximum_heap_size):
    jvm_options = [
        '-Xms{minimum_heap_size}'.format(minimum_heap_size=minimum_heap_size),
        '-Xmx{maximum_heap_size}'.format(maximum_heap_size=maximum_heap_size),
        '-Djava.class.path={classpath}'.format(
            classpath=CLASSPATH)
    ]
    # Truthiness check instead of `is not 1`, which compares identity, not value.
    if not jpype.isJVMStarted():
        print('starting JVM')
        print(jpype.getDefaultJVMPath())
        print(jvm_options)
        jpype.startJVM(
            jpype.getDefaultJVMPath(),
            *jvm_options
        )


def start_sutime(minimum_heap_size, maximum_heap_size):
    start_jvm(minimum_heap_size, maximum_heap_size)
    try:
        if (threading.activeCount() > 1 and
                not jpype.isThreadAttachedToJVM()):
            jpype.attachThreadToJVM()
        LOCK.acquire()
    finally:
        LOCK.release()


start_sutime(MINIMUM_HEAP_SIZE, MAXIMUM_HEAP_SIZE)
SUTIME = SUTime(jvm_started=True, mark_time_ranges=True)


def get_date(text):
    text = re.sub(r'[,-.;@#?!&$]+\ *', ' ', text)
    result = SUTIME.parse(text)
    return result


@app.route("/")
def default():
    if request.args.get('sentence', '') and request.args.get('getdate', '') == 'True':
        text = request.args.get('sentence', '')
        return jsonify({'result': get_date(text)})
    else:
        return jsonify({'error': 'necessary parameters missing'})


if __name__ == '__main__':
    app.run()
Which results in the following stack trace:
starting JVM
/Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home/jre/lib/jli/libjli.dylib
['-Xms128m', '-Xmx512m', '-Djava.class.path=jars/ejml-0.23.jar:jars/gson-2.7.jar:jars/javax.json-api-1.0.jar:jars/jaxb-api-2.2.7.jar:jars/joda-time-2.9.jar:jars/jollyday-0.4.7.jar:jars/slf4j-api-1.7.12.jar:jars/slf4j-simple-1.7.21.jar:jars/stanford-corenlp-3.6.0-models.jar:jars/stanford-corenlp-3.6.0.jar:jars/stanford-corenlp-sutime-python-1.0.0.jar:jars/xalan-2.7.0.jar:jars/xercesImpl-2.8.0.jar:jars/xml-apis-1.3.03.jar:jars/xom-1.2.10.jar']
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator sutime with class edu.stanford.nlp.time.TimeAnnotator
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - TokenizerAnnotator: No tokenizer type provided. Defaulting to PTBTokenizer.
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0.9 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
sutime.includeRange=false
Unknown property: |sutime.includeRange|
sutime.markTimeRanges=true
Unknown property: |sutime.markTimeRanges|
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... Unknown property: |sutime.includeRange|
Unknown property: |sutime.markTimeRanges|
done [1.6 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... Unknown property: |sutime.includeRange|
Unknown property: |sutime.markTimeRanges|
done [2.0 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... Traceback (most recent call last):
  File "app.py", line 57, in <module>
    SUTIME = SUTime(jvm_started=True, mark_time_ranges=True)
  File "/Users/frank/.virtualenvs/sutime-flask/lib/python2.7/site-packages/sutime/sutime.py", line 59, in __init__
    self.mark_time_ranges, self.include_range)
  File "/Users/frank/.virtualenvs/sutime-flask/lib/python2.7/site-packages/jpype/_jclass.py", line 86, in _javaInit
    *args)
jpype._jexception.OutOfMemoryErrorPyRaisable: java.lang.OutOfMemoryError: GC overhead limit exceeded
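The trace ends in java.lang.OutOfMemoryError: GC overhead limit exceeded while the third NER classifier is loading, which is what the 512m cap in MAXIMUM_HEAP_SIZE was meant to provoke. As a rough sketch, the option list handed to jpype.startJVM() could be built with a larger, configurable heap. The helper name build_jvm_options, its parameters, the default sizes and the jar names are all assumptions for illustration, not measured requirements or part of sutime.

```python
import os


def build_jvm_options(classpath, min_heap="512m", max_heap="2g",
                      disable_gc_overhead_limit=False):
    """Hypothetical helper: assemble the option strings for jpype.startJVM()."""
    options = [
        "-Xms{0}".format(min_heap),   # initial heap size
        "-Xmx{0}".format(max_heap),   # maximum heap size
        "-Djava.class.path={0}".format(classpath),
    ]
    if disable_gc_overhead_limit:
        # HotSpot flag that switches off the "GC overhead limit exceeded" check.
        # It hides the symptom; it does not give the JVM more memory.
        options.append("-XX:-UseGCOverheadLimit")
    return options


# Placeholder jar names, only to show the shape of the result.
classpath = os.pathsep.join(["jars/a.jar", "jars/b.jar"])
print(build_jvm_options(classpath))
```

The returned list would be passed the same way as jvm_options in start_jvm above, i.e. unpacked after the JVM path.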
According to your screenshot, you're running 64-bit Ubuntu 16.04 with OpenJDK. Could you try running it with Oracle Java 8?
Sure, let me try and get back! Thanks a lot for the response.
@FraBle I found the source of the problem: Flask's debug mode. With debug mode turned on, the app throws the error I reported above; with it off, everything works fine. I am disabling debug mode for now so I can continue with my task. Let me know if you would like to investigate further or if I should close the issue.
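One possible explanation, not confirmed in this thread: Flask's debug mode enables the Werkzeug reloader by default, which runs the app in a child process, so module-level code such as the SUTime(...) call is executed in both the watcher process and the serving process. The sketch below shows one way to load the heavy object only in the process that handles requests. The function name in_serving_process is hypothetical, and it relies on the reloader child having WERKZEUG_RUN_MAIN set to 'true'.

```python
import os


def in_serving_process(debug, environ=None):
    """Return True if this process is the one that will handle requests.

    Assumption: with the Werkzeug reloader active, the child process that
    serves requests has WERKZEUG_RUN_MAIN set to 'true'; the parent process
    only watches files and restarts the child.
    """
    environ = os.environ if environ is None else environ
    if not debug:
        return True  # no reloader, so there is only one process
    return environ.get("WERKZEUG_RUN_MAIN") == "true"


# Intended use in app.py (not executed here; SUTime, _jar_files and app are
# the names from the working example earlier in this thread):
#
#     DEBUG = True
#     SUTIME = None
#     if in_serving_process(DEBUG):
#         SUTIME = SUTime(jars=_jar_files, mark_time_ranges=True)
#     app.run(debug=DEBUG)

print(in_serving_process(debug=False))
```

A simpler alternative is to keep the debugger but switch the reloader off with app.run(debug=True, use_reloader=False), at the cost of losing automatic restarts on code changes.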
It might be related to https://github.com/originell/jpype/issues/211
Restarts also don't seem to be supported by the JVM: https://github.com/originell/jpype/issues/84#issuecomment-157680233
It could also be the socket.setdefaulttimeout(15) (https://github.com/xlcnd/isbnlib/issues/43#issuecomment-226468157), but removing it didn't change anything :(
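If the JVM really cannot be restarted inside a process, as the linked JPype comment suggests, the expensive object should be created exactly once per process. A minimal, thread-safe sketch of that idea follows. LazyResource and expensive_factory are hypothetical names, and the factory here is a stand-in rather than a real SUTime constructor call.

```python
import threading


class LazyResource(object):
    """Create a costly object on first use and reuse it afterwards."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._value = None
        self._created = False

    def get(self):
        # Double-checked locking: take the lock only until the object exists.
        if not self._created:
            with self._lock:
                if not self._created:
                    self._value = self._factory()
                    self._created = True
        return self._value


calls = []


def expensive_factory():
    # Stand-in for something like SUTime(...); records how often it runs.
    calls.append(1)
    return object()


PARSER = LazyResource(expensive_factory)
first = PARSER.get()
second = PARSER.get()
print(first is second, len(calls))
```

In the Flask app, get_date would call PARSER.get().parse(text), so the JVM and models are loaded on the first request instead of at import time.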
In that case, I will keep an eye on the issues above. Thanks for all the help. Closing the issue here now. :)
You're welcome! :)
I have written a script that parses a sentence and works perfectly fine on the command line. However, when I use the same function inside a Flask app, it crashes on the line sutime.parse(). Here is the screenshot:
Kindly help me solve this!