kmadathil / sanskrit_parser

Parsers for Sanskrit / संस्कृतम्
MIT License

Use flask_restplus instead of flask_restful #60

Closed vvasuki closed 5 years ago

vvasuki commented 6 years ago

Results in auto-generation of swagger docs. Example: https://github.com/vedavaapi/vedavaapi_py_api/blob/master/vedavaapi_py_api/ullekhanam/api_v1.py documented at https://api.vedavaapi.org/py/ullekhanam/docs#!/default/get_book_list

avinashvarna commented 6 years ago

@kmadathil started the implementation with flask_restful. I have no strong preference either way.

Also, the UI is indeed static js + html using bootstrap. Did you have something else in mind when you said it need not be served by the flask server?

kmadathil commented 6 years ago

That the UI is served through flask for now is a development/debug feature and nothing more. Once we serve through apache or nginx, we'd only map the api urls to call the flask code, so the flask code to serve static files won't be invoked. Static files would be served directly through whichever webserver we choose.

kmadathil commented 6 years ago

"From past experience, I think that it is ideal for the flask API to be in a separate repository from this python package. Not only would changes be more manageable then, one should not need to replicate something available in pip just to run a REST api server. Do you agree? If so, I can create a new repo and work on deploying it."

Let's hold on until this is slightly better. I'd like to implement a "feedback" feature which allows someone to report a dodgy split/analysis.

vvasuki commented 6 years ago

"Also, the UI is indeed static js + html using bootstrap. Did you have something else in mind when you said it need not be served by the flask server?"

Oh, good - as long as the UI code is such that there is no need for the UI app to be hosted from the same location as the API code - it's cool.

"That the UI is served through flask for now is a development/debug feature and nothing more."

contradicts

I'd like to implement a "feedback" feature which allows someone to report a dodgy split/analysis.

though :-) I think it is best to just put out a plaything and let the world interact with it even as you continue your R and D.

vvasuki commented 6 years ago

"I'd like to implement a "feedback" feature which allows someone to report a dodgy split/analysis."

@shreevatsa has a great model where feedback is converted to github issues prepopulated with the problematic user input. This might be convenient to follow. https://sanskritmetres.appspot.com/

avinashvarna commented 6 years ago

That's great! I found this in the github docs as well. The only problem is that it requires a github account I think. @kmadathil was thinking about just having it send an email, which should work as well.

vvasuki commented 6 years ago

GitHub issues are better than email for tracking such feedback, and the minor additional step of having to log into GitHub has not dissuaded users from giving @shreevatsa some excellent feedback over the years.

avinashvarna commented 6 years ago

I am not fully convinced - how many people on the samskrita google group have github accounts, for example? Would they be willing to sign up just to report an issue? I think we should keep it simple from the user perspective as well, not just from the developer's perspective.

vvasuki commented 6 years ago

Would they be willing to sign up just to report an issue?

I wish we had an empirical answer :-) But my suspicion is that nearly every serious user would.

kmadathil commented 6 years ago

"I think it is best to just put out a plaything and let the world interact with it even as you continue your R and D." @vvasuki Ok, I'm convinced :-) How do we get this deployed on vedavaapi?

vvasuki commented 6 years ago

@vvasuki Ok, I'm convinced :-) How do we get this deployed on vedavaapi?

Cool! Are you ready to put the API in a separate repo? If so, I'll do that, add an Apache WSGI wrapper and serve from vedavaapi.

avinashvarna commented 6 years ago

Do we want to at least set up the link to help the user file a GitHub issue by prepopulating data, similar to sanskritmetres? That should be fairly easy to do.
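
The prepopulated-issue link discussed above could look roughly like this. It is only a minimal sketch, assuming GitHub's `issues/new` page accepts `title` and `body` query parameters; the helper name `build_issue_url` and the wording of the issue text are hypothetical and are not taken from sanskritmetres or from this repo.

```python
from urllib.parse import urlencode

REPO = "kmadathil/sanskrit_parser"  # repo that should receive the feedback


def build_issue_url(user_input, problem, repo=REPO):
    """Return a link that opens GitHub's new-issue form with fields prefilled.

    build_issue_url is a hypothetical helper; the user still has to log in
    to GitHub and submit the issue themselves.
    """
    params = {
        "title": "Dodgy split/analysis: " + user_input,
        "body": "Input: " + user_input + "\n\nProblem: " + problem,
    }
    # urlencode percent-encodes the values, so newlines and slashes are safe.
    return "https://github.com/" + repo + "/issues/new?" + urlencode(params)


url = build_issue_url("astyuttarasyAm", "analysis is empty")
print(url)
```

The UI would only need to build this link from the text the user typed; nothing has to be stored or sent by the API server.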

kmadathil commented 6 years ago

Ok, I have added a basic report issue feature - sends email for now. We'll switch later. I've bumped the version to dev6 and uploaded. I've switched to flask-restplus. To do that, I've moved the development URL for static files back to localhost:5000/static, since restplus uses root for docs.

@vvasuki - I think we're good to go. You can move util/rest_api.py and util/static/* to a new repo or use them from the current repo as you wish.

vvasuki commented 6 years ago

API

I rearranged the code a little bit (to conform to REST API conventions, enable CORS etc..) and added the necessary (for deployment) wsgi wrapper.

You now run the code with sanskrit_parser/rest_api/run.py. This will yield the following major routes:

http://localhost:9000 - your swagger API
http://localhost:9000/sitemap

On vedavaapi server, it's at:

UI

I've decoupled it from the API server so that you can specify exactly which API you want the UI to use: (image)

The UI code has been moved so that you can access it from https://kmadathil.github.io/sanskrit_parser/ui/index.html . However, I haven't figured out why https://kmadathil.github.io/sanskrit_parser/ui/lib/jquery.query-object-2.2.3.js returns 404. Once @kmadathil (or anyone else) figures it out, all should be fine and dandy. Compare with https://vedavaapi.github.io/ullekhanam-ui/v0/js/textract/lib/jquery.query-object-2.2.3.js which works fine.

Ideas about the above last problem:

kmadathil commented 6 years ago

I get a CORS error when I try to use https://kmadathil.github.io/sanskrit_parser/ui/index.html

Failed to load https://api.vedavaapi.org/py_skt_parser_api/sanskrit_parser/v1/tags/rAma: No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://kmadathil.github.io' is therefore not allowed access. The response had HTTP status code 403.
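
For context, the browser reports this error whenever the response lacks an `Access-Control-Allow-Origin` header that matches the page's origin. Below is a minimal sketch of what the server side has to add, written as plain WSGI middleware with only the standard library. The names `allow_cors`, `demo_app` and `ALLOWED_ORIGIN` are hypothetical; the actual API code may enable CORS differently (for example through a Flask extension).

```python
ALLOWED_ORIGIN = "https://kmadathil.github.io"  # origin that hosts the UI


def allow_cors(app, origin=ALLOWED_ORIGIN):
    """Wrap a WSGI app so that every response carries the CORS header."""
    def wrapped(environ, start_response):
        def cors_start_response(status, headers, exc_info=None):
            # Drop any existing value so the header is not sent twice.
            headers = [(k, v) for (k, v) in headers
                       if k.lower() != "access-control-allow-origin"]
            headers.append(("Access-Control-Allow-Origin", origin))
            return start_response(status, headers, exc_info)
        return app(environ, cors_start_response)
    return wrapped


def demo_app(environ, start_response):
    """Stand-in for the real API: always returns an empty JSON object."""
    start_response("200 OK", [("Content-Type", "application/json")])
    return [b"{}"]


# Call the wrapped app directly, without a server, and capture the headers.
captured = {}


def fake_start_response(status, headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = dict(headers)


body = b"".join(allow_cors(demo_app)({"REQUEST_METHOD": "GET"}, fake_start_response))
print(captured["headers"]["Access-Control-Allow-Origin"])
```

Note that a 403 response, as in the message above, can also lack the header simply because the request never reached the API code, so the header is not necessarily the root cause.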

vvasuki commented 6 years ago

Problem seems to be on the github.io end. Please contrast:

image (Please ignore the outdated error console message above.)

with

image

Note the response headers.

Did you do anything special to fix the "https://kmadathil.github.io/sanskrit_parser/ui/lib/jquery.query-object-2.2.3.js returns 404" problem? Or did it start working by itself.

Perhaps one can try the below:

Move the ui folder to a new repo, serve it from github.io and see what happens. The docs/.nojekyll file (required for the Python Sphinx docs) might be making a difference.

vvasuki commented 6 years ago

Well, I take back my conclusion above upon seeing this diff and noting that the rejection happens at the server end. Not sure what's going on.

vvasuki commented 6 years ago

Ah fixed - works now. There was a typo in the default API URL.

kmadathil commented 6 years ago

Ok, now I can get https://kmadathil.github.io/sanskrit_parser/ui/index.html working for "Split" and "Tags" options. The "Analyze" option returns a null result.

Contrast: image

With: image

kmadathil commented 6 years ago

Works locally: image

kmadathil commented 6 years ago

Can see the difference with curl too. Now why would this be? Which version of the code does the vedavaapi api use?

(master)$ curl -X GET --header 'Accept: application/json' 'http://api.vedavaapi.org/py_skt_parser/sanskrit_parser/v1/analyses/astyuttarasyAm'
{ "devanagari": "\u0905\u0938\u094d\u0924\u094d\u092f\u0941\u0924\u094d\u0924\u0930\u0938\u094d\u092f\u093e\u092e\u094d", "analysis": {}, "input": "astyuttarasyAm" }

(master)$ curl -X GET --header 'Accept: application/json' 'http://localhost:9000/sanskrit_parser/v1/analyses/astyuttarasyAm'
{"devanagari": "\u0905\u0938\u094d\u0924\u094d\u092f\u0941\u0924\u094d\u0924\u0930\u0938\u094d\u092f\u093e\u092e\u094d", "input": "astyuttarasyAm", "analysis": {"\u0905\u0938\u094d\u0924\u093f_\u0909\u0924\u094d\u0924\u0930\u0938\u094d\u092f\u093e\u092e\u094d": [[["\u0905\u0938\u094d\u0924\u093f", ["\u0905\u0938\u094d#\u0967", ["\u090f\u0915\u0935\u091a\u0928\u092e\u094d", "\u092a\u094d\u0930\u0925\u092e\u092a\u0941\u0930\u0941\u0937\u0903", "\u092a\u094d\u0930\u093e\u0925\u092e\u093f\u0915\u0903", "\u0915\u0930\u094d\u0924\u0930\u093f", "\u0932\u091f\u094d"]]], ["\u0909\u0924\u094d\u0924\u0930\u0938\u094d\u092f\u093e\u092e\u094d", ["\u0909\u0924\u094d\u0924\u0930#\u0968", ["\u0938\u092a\u094d\u0924\u092e\u0940\u0935\u093f\u092d\u0915\u094d\u0924\u093f\u0903", "\u0938\u094d\u0924\u094d\u0930\u0940\u0932\u093f\u0919\u094d\u0917\u092e\u094d", "\u090f\u0915\u0935\u091a\u0928\u092e\u094d"]]]], [["\u0905\u0938\u094d\u0924\u093f", ["\u0905\u0938\u094d#\u0967", ["\u090f\u0915\u0935\u091a\u0928\u092e\u094d", "\u092a\u094d\u0930\u0925\u092e\u092a\u0941\u0930\u0941\u0937\u0903", "\u092a\u094d\u0930\u093e\u0925\u092e\u093f\u0915\u0903", "\u0915\u0930\u094d\u0924\u0930\u093f", "\u0932\u091f\u094d"]]], ["\u0909\u0924\u094d\u0924\u0930\u0938\u094d\u092f\u093e\u092e\u094d", ["\u0909\u0924\u094d\u0924\u0930#\u0967", ["\u0938\u092a\u094d\u0924\u092e\u0940\u0935\u093f\u092d\u0915\u094d\u0924\u093f\u0903", "\u0938\u094d\u0924\u094d\u0930\u0940\u0932\u093f\u0919\u094d\u0917\u092e\u094d", "\u090f\u0915\u0935\u091a\u0928\u092e\u094d"]]]]]}}

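The `\uXXXX` sequences in the two responses are ordinary JSON string escapes, so the comparison can also be done in Python rather than by eye. A minimal sketch follows; the response text is copied from the vedavaapi output above, and `has_analysis` is a hypothetical helper name.

```python
import json

# Response body copied from the vedavaapi server output above.
remote_text = (
    '{ "devanagari": "\\u0905\\u0938\\u094d\\u0924\\u094d\\u092f\\u0941'
    '\\u0924\\u094d\\u0924\\u0930\\u0938\\u094d\\u092f\\u093e\\u092e\\u094d", '
    '"analysis": {}, "input": "astyuttarasyAm" }'
)


def has_analysis(response_text):
    """Return True if the JSON response carries a non-empty "analysis"."""
    return bool(json.loads(response_text).get("analysis"))


remote = json.loads(remote_text)
print(remote["devanagari"])        # the input rendered in Devanagari
print(has_analysis(remote_text))   # False for the server response above
```

Running the same check against the local response would return True, which makes the difference between the two deployments easy to assert in a test.
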
vvasuki commented 6 years ago

The code in vedavaapi matches the latest check-in into the master branch.. I wonder if your local code is somehow different?

kmadathil commented 6 years ago

I'm running the latest master code locally:

(master)$ git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

(master)$ git log
commit 0ae98002d2078b2e15abc7f4adae41da45bfb2b9 (HEAD -> master, origin/master, origin/HEAD)
Author: vishvAsaH <vishvas.vasuki@gmail.com>
Date:   Sun Dec 10 07:20:53 2017 -0800

    Fix URL

vvasuki commented 6 years ago
vvasuki@vedavaapi:/home/samskritam/sanskrit_parser$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean
vvasuki@vedavaapi:/home/samskritam/sanskrit_parser$ git history
git: 'history' is not a git command. See 'git --help'.
vvasuki@vedavaapi:/home/samskritam/sanskrit_parser$ git log -1
commit 3929a11cb1cb84facc058e6e38521c060ce6f55e
Author: vishvAsaH <vishvas.vasuki@gmail.com>
Date:   Sun Dec 10 07:08:25 2017 -0800

    Dont set header explicitly in WSGI

Could your computer have some data which is not being used or installed automatically on the server?

kmadathil commented 6 years ago

Can you check if this runs on the machine you've hosted the api on?

(master)$ python -m sanskrit_parser.morphological_analyzer.SanskritMorphologicalAnalyzer astyuttarasyAm
Input String: astyuttarasyAm
Input String in SLP1: astyuttarasyAm
Start Split: 2017-12-11 10:31:51.973653
End DAG generation: 2017-12-11 10:31:51.980026
End pathfinding: 2017-12-11 10:31:51.981243
Splits:
Lexical Split: [asti, uttarasyAm]
Valid Morphologies
[(asti, ('as#1', set([ekavacanam, kartari, praTamapuruzaH, prATamikaH, law]))), (uttarasyAm, ('uttara#2', set([strIliNgam, saptamIviBaktiH, ekavacanam])))]
[(asti, ('as#1', set([ekavacanam, kartari, praTamapuruzaH, prATamikaH, law]))), (uttarasyAm, ('uttara#1', set([strIliNgam, saptamIviBaktiH, ekavacanam])))]
Lexical Split: [asti, uttara, syAm]
No valid morphologies for this split
Lexical Split: [asti, ut, tara, syAm]
No valid morphologies for this split
End Morphological Analysis: 2017-12-11 10:31:52.053919

The wrapper is trivial, so if there's some data dependency it's likely to be further down the stack.

vvasuki commented 6 years ago

Server uses python3. Command python3 -m sanskrit_parser.morphological_analyzer.SanskritMorphologicalAnalyzer astyuttarasyAm runs for a long time, but when I interrupt I get:

    import sanskrit_parser.lexical_analyzer.SanskritLexicalAnalyzer as SanskritLexicalAnalyzer
  File "/data/home/samskritam/sanskrit_parser/sanskrit_parser/lexical_analyzer/SanskritLexicalAnalyzer.py", line 125, in <module>
    class SanskritLexicalAnalyzer(object):
  File "/data/home/samskritam/sanskrit_parser/sanskrit_parser/lexical_analyzer/SanskritLexicalAnalyzer.py", line 131, in SanskritLexicalAnalyzer
    forms  = inriaxmlwrapper.InriaXMLWrapper()
  File "/data/home/samskritam/sanskrit_parser/sanskrit_parser/util/inriaxmlwrapper.py", line 95, in __init__
    self._load_forms()        
  File "/data/home/samskritam/sanskrit_parser/sanskrit_parser/util/inriaxmlwrapper.py", line 149, in _load_forms
    self._generate_dict()
  File "/data/home/samskritam/sanskrit_parser/sanskrit_parser/util/inriaxmlwrapper.py", line 121, in _generate_dict
    self._get_files()
  File "/data/home/samskritam/sanskrit_parser/sanskrit_parser/util/inriaxmlwrapper.py", line 113, in _get_files
    for chunk in r.iter_content(chunk_size=128):
  File "/usr/local/lib/python3.5/dist-packages/requests/models.py", line 745, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/usr/local/lib/python3.5/dist-packages/urllib3/response.py", line 436, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/usr/local/lib/python3.5/dist-packages/urllib3/response.py", line 384, in read
    data = self._fp.read(amt)
  File "/usr/lib/python3.5/http/client.py", line 448, in read
    n = self.readinto(b)
  File "/usr/lib/python3.5/http/client.py", line 488, in readinto
    n = self.fp.readinto(b)
  File "/usr/lib/python3.5/socket.py", line 575, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.5/ssl.py", line 929, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib/python3.5/ssl.py", line 791, in read
    return self._sslobj.read(len, buffer)
  File "/usr/lib/python3.5/ssl.py", line 575, in read
    v = self._sslobj.read(len, buffer)
KeyboardInterrupt

PS: Wrap multiple lines of code in ```

kmadathil commented 6 years ago

Our Travis-CI testsuite runs on Python3.6. That includes MorphologicalAnalyzer tests. Not sure if 3.5 will work. (Perhaps @codito knows?)

Is it possible to run the server on Python 2.7?

vvasuki commented 6 years ago

I'd rather run on 3.6+.

On my local computer, I get:

 python3.6 -m sanskrit_parser.morphological_analyzer.SanskritMorphologicalAnalyzer astyuttarasyAm
Input String: astyuttarasyAm
Input String in SLP1: astyuttarasyAm
Start Split: 2017-12-11 06:59:42.486717
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/vvasuki/sanskrit_parser/sanskrit_parser/morphological_analyzer/SanskritMorphologicalAnalyzer.py", line 346, in <module>
    main()
  File "/home/vvasuki/sanskrit_parser/sanskrit_parser/morphological_analyzer/SanskritMorphologicalAnalyzer.py", line 326, in main
    graph=s.getSandhiSplits(i,tag=True,debug=args.debug)
  File "/home/vvasuki/sanskrit_parser/sanskrit_parser/lexical_analyzer/SanskritLexicalAnalyzer.py", line 208, in getSandhiSplits
    self.tagLexicalGraph(dag)
  File "/home/vvasuki/sanskrit_parser/sanskrit_parser/lexical_analyzer/SanskritLexicalAnalyzer.py", line 190, in tagLexicalGraph
    t=self.getLexicalTags(n)
  File "/home/vvasuki/sanskrit_parser/sanskrit_parser/lexical_analyzer/SanskritLexicalAnalyzer.py", line 147, in getLexicalTags
    tags=self.forms.get_tags(ot)
  File "/home/vvasuki/sanskrit_parser/sanskrit_parser/util/inriaxmlwrapper.py", line 178, in get_tags
    return self._xml_to_tags(word)
  File "/home/vvasuki/sanskrit_parser/sanskrit_parser/util/inriaxmlwrapper.py", line 158, in _xml_to_tags
    root = etree.parse(BytesIO(tag)).getroot()
TypeError: a bytes-like object is required, not 'str'
avinashvarna commented 6 years ago

That seems to indicate that you have data on your machine that was created using Python 2.x. Can you try removing ~/.sanskrit_parser/data and rerunning?
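
The `TypeError` in the traceback arises because `BytesIO` accepts only bytes on Python 3, while the cached value reaching it is a `str`. A minimal sketch of a defensive conversion is shown below, assuming the cached tags are XML snippets. `parse_tag` is a hypothetical helper, the sample element is made up, and the standard library's `xml.etree.ElementTree` stands in for whichever `etree` the project actually imports.

```python
from io import BytesIO
import xml.etree.ElementTree as etree


def parse_tag(tag):
    """Parse an XML snippet that may arrive as either str or bytes."""
    if isinstance(tag, str):
        # On Python 3, BytesIO(str) raises TypeError, so encode first.
        tag = tag.encode("utf-8")
    return etree.parse(BytesIO(tag)).getroot()


# The sample element below is invented for illustration.
root_from_str = parse_tag('<f form="asti"><s stem="as"/></f>')
root_from_bytes = parse_tag(b'<f form="asti"><s stem="as"/></f>')
print(root_from_str.tag, root_from_bytes.get("form"))
```

Removing the stale cache, as suggested, remains the simpler fix; a conversion like this would only make the loader tolerant of data written by an older interpreter.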

vvasuki commented 6 years ago

Yes, that fixed it:

python3.6 -m sanskrit_parser.morphological_analyzer.SanskritMorphologicalAnalyzer astyuttarasyAm

Input String: astyuttarasyAm
Input String in SLP1: astyuttarasyAm
Start Split: 2017-12-11 07:56:21.774403
End DAG generation: 2017-12-11 07:56:21.779757
End pathfinding: 2017-12-11 07:56:21.780411
Splits:
Lexical Split: [asti, uttarasyAm]
No valid morphologies for this split
Lexical Split: [asti, uttara, syAm]
No valid morphologies for this split
Lexical Split: [asti, ut, tara, syAm]
No valid morphologies for this split
End Morphological Analysis: 2017-12-11 07:56:21.795013

Now I will try to set up the use of 3.6 on the deployment server.

avinashvarna commented 6 years ago

But the answer is surely wrong! A morphology does exist for "asti uttarasyAm". I have not tried this with 3.6 so far (see the 2.7 output below). A closer examination is probably needed.

sanskrit_parser.morphological_analyzer.SanskritMorphologicalAnalyzer astyuttarasyAm
Input String: astyuttarasyAm
Input String in SLP1: astyuttarasyAm
Start Split: 2017-12-11 09:13:43.868000
End DAG generation: 2017-12-11 09:13:43.875000
End pathfinding: 2017-12-11 09:13:43.877000
Splits:
Lexical Split: [asti, uttarasyAm]
Valid Morphologies
[(asti, ('as#1', set([ekavacanam, law, prATamikaH, kartari, praTamapuruzaH]))), (uttarasyAm, ('uttara#2', set([ekavacanam, saptamIviBaktiH, strIliNgam])))]
[(asti, ('as#1', set([ekavacanam, law, prATamikaH, kartari, praTamapuruzaH]))), (uttarasyAm, ('uttara#1', set([ekavacanam, saptamIviBaktiH, strIliNgam])))]
Lexical Split: [asti, uttara, syAm]
No valid morphologies for this split
Lexical Split: [asti, ut, tara, syAm]
No valid morphologies for this split
End Morphological Analysis: 2017-12-11 09:13:43.971000

vvasuki commented 6 years ago

While the 3.6 command line succeeds, the API call still fails (both on the server, which now uses 3.6 instead of 3.5, and on my local computer).

Logs - https://pastebin.com/raw/54m2Btbf

avinashvarna commented 6 years ago

Looks like an issue with the DhatuWrapper, since it complains about not finding dhAtu 'as'? I also see

DEBUG: 2017-12-11 14:22:50,988 {DhatuWrapper.py:48}: Parsing files into dict for faster lookup 
DEBUG: 2017-12-11 14:22:50,989 {DhatuWrapper.py:57}: Found dhatu tsv headers: ['404: Not Found'] 
DEBUG: 2017-12-11 14:22:50,989 {DhatuWrapper.py:65}: Saved dhatus database 

Looks like a problem downloading the data on py 3.x. Even the command line did not really give the right results, probably for the same reason.

kmadathil commented 6 years ago

Note that 3.6 works well on Travis-CI (including this data download I suppose), so I suspect something that's machine config specific here. Needless to say, we have a fix to do to recognize and report this particular case correctly.
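
One way to recognize and report this case is to check the downloaded text before it is parsed and cached. The following is a minimal sketch, assuming the data file is tab-separated with a header row. `validate_tsv_header` and the column names in the example are hypothetical and do not reflect the real dhatu file.

```python
import csv
import io


def validate_tsv_header(text, min_columns=2):
    """Return the header row of TSV text, or raise ValueError if it looks wrong.

    An error page such as "404: Not Found" parses as a single-column
    header, so a minimum column count is enough to reject it.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader, None)
    if header is None or len(header) < min_columns:
        raise ValueError("unexpected TSV header: %r" % (header,))
    return header


# Column names here are invented for the example.
print(validate_tsv_header("dhatu\tgana\tartha\nas\t2\tbhuvi\n"))

try:
    validate_tsv_header("404: Not Found")
except ValueError as err:
    rejected = str(err)
    print(rejected)
```

If the download goes through `requests`, calling `raise_for_status()` on the response before reading it would catch the 404 even earlier.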

kmadathil commented 6 years ago

Ah - the file itself throws a 404 now. It's nothing to do with python version or local setup. I was attempting to download the kRShNamAchArya dhAtupATha, which seems to have vanished?

Travis-CI passes because we have no tests for MorphologicalAnalyzer. Stupid me.

kmadathil commented 6 years ago

commit 6767113 pulls the file from its new (?) location. @vvasuki - is this the same ?

vvasuki commented 6 years ago

ah yes :-) ~

you're directly depending on such online resources ? they're roughly stable, but are subject to moves - so unless there's an explicit promise of stability it's better to maintain your copy in a separate repo..

kmadathil commented 6 years ago

I see what you mean. I'll move a copy to our repo and change the code to download that instead.

How do we keep the code on the server current?

vvasuki commented 6 years ago

I don't think that you necessarily need to download the data - maintaining a copy online which you control would suffice. Just make a repo or directory in your current repo for all such data.

Regarding keeping the data current - I think it would need to be done manually, once in a while (for example, as and when users report shortcomings). One can think of cleverer ways of mechanically making periodic diffs, but I don't think it's worth the effort.

vvasuki commented 6 years ago

How do we keep the code on the server current?

If you're talking about the vedavaapi server - the backend code will not be current unless I manually sync and restart the server. (It should anyway not be "current", but should correspond to the "latest stable release".) I'll do this periodically - just (re)open an issue created for that purpose and assign it to me.

kmadathil commented 6 years ago

@vvasuki : Presuming you can run the server on Python 3.6, can you sync to the latest master and restart? If this works for morphological analyzer (as it should, now that the dhAtupAtha data is downloaded from this repo), I'll make a pip release of that version of code

vvasuki commented 6 years ago

Erroring out -

[Wed Dec 13 20:28:53.241770 2017] [wsgi:error] [pid 2661] DEBUG: 2017-12-13 20:28:53,241 {DhatuWrapper.py:51}: Parsing files into dict for faster lookup
[Wed Dec 13 20:28:53.242008 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882] mod_wsgi (pid=2661): Target WSGI script '/home/samskritam/sanskrit_parser/wsgi/wsgi_app.py' cannot be loaded as Python module.
[Wed Dec 13 20:28:53.242048 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882] mod_wsgi (pid=2661): Exception occurred processing WSGI script '/home/samskritam/sanskrit_parser/wsgi/wsgi_app.py'.
[Wed Dec 13 20:28:53.242264 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882] Traceback (most recent call last):
[Wed Dec 13 20:28:53.242345 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]   File "/home/samskritam/sanskrit_parser/wsgi/wsgi_app.py", line 18, in <module>
[Wed Dec 13 20:28:53.242355 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]     from sanskrit_parser.rest_api import run
[Wed Dec 13 20:28:53.242372 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]   File "/home/samskritam/sanskrit_parser/sanskrit_parser/rest_api/run.py", line 10, in <module>
[Wed Dec 13 20:28:53.242379 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]     from sanskrit_parser.rest_api import api_v1
[Wed Dec 13 20:28:53.242394 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]   File "/home/samskritam/sanskrit_parser/sanskrit_parser/rest_api/api_v1.py", line 8, in <module>
[Wed Dec 13 20:28:53.242402 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]     from sanskrit_parser.morphological_analyzer.SanskritMorphologicalAnalyzer import SanskritMorphologicalAnalyzer
[Wed Dec 13 20:28:53.242417 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]   File "/home/samskritam/sanskrit_parser/sanskrit_parser/morphological_analyzer/SanskritMorphologicalAnalyzer.py", line 20, in <module>
[Wed Dec 13 20:28:53.242425 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]     dw=DhatuWrapper()
[Wed Dec 13 20:28:53.242450 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]   File "/home/samskritam/sanskrit_parser/sanskrit_parser/util/DhatuWrapper.py", line 32, in __init__
[Wed Dec 13 20:28:53.242458 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]     self._generate_db()
[Wed Dec 13 20:28:53.242472 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]   File "/home/samskritam/sanskrit_parser/sanskrit_parser/util/DhatuWrapper.py", line 55, in _generate_db
[Wed Dec 13 20:28:53.242479 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]     for irx,row in enumerate(reader):
[Wed Dec 13 20:28:53.242494 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]   File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
[Wed Dec 13 20:28:53.242501 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882]     return codecs.ascii_decode(input, self.errors)[0]
[Wed Dec 13 20:28:53.242530 2017] [wsgi:error] [pid 2661] [remote 24.23.143.72:55882] UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)
vvasuki commented 6 years ago

Added a basic rest api test, which should fail - https://travis-ci.org/kmadathil/sanskrit_parser

codito commented 6 years ago

It looks like DhatuWrapper is not py3.6 compatible. Have created a PR for this https://github.com/kmadathil/sanskrit_parser/pull/68.

I think we should bundle the data with sanskrit_parser package. Downloading from the web will create several incompatibilities:

Downloading from web allows auto update of data, however if user requires newer capabilities they could just update to newer version of sanskrit_parser.

vvasuki commented 6 years ago

I faced a similar problem with my dict installer app. The simple solution was to clearly version the data published and indexed at a certain static location; to remember the version downloaded, and to check for updates at every run. In my case, for backward compatibility, I put the version in the filename - but ideally, metadata for each data-chunk should be in a separate json in a canonical online data index.
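
The versioning scheme described above could be sketched as follows. This is only an illustration: the index layout, the `example.org` URL, the `dhatu_data` key and the `needs_update` helper are all hypothetical, and the index is inlined here rather than fetched from a real canonical location.

```python
import json

# Hypothetical index document; in practice it would be fetched from a
# fixed URL that the project controls.
INDEX_JSON = """
{
  "dhatu_data": {"version": 3, "url": "https://example.org/data/dhatu_v3.json"}
}
"""


def needs_update(name, local_versions, index):
    """Return True if the indexed version of `name` is newer than the local one."""
    local = local_versions.get(name, 0)  # 0 means "never downloaded"
    return index[name]["version"] > local


index = json.loads(INDEX_JSON)
local_versions = {"dhatu_data": 2}  # what an earlier run remembered

if needs_update("dhatu_data", local_versions, index):
    # A real client would download index["dhatu_data"]["url"] here.
    local_versions["dhatu_data"] = index["dhatu_data"]["version"]

print(local_versions)
```

Keeping the version in a separate index rather than in the filename means the download URL can change without breaking clients that only know the index location.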

avinashvarna commented 6 years ago

Strangely, test_rest_api.py seems to have passed on travis on py3.6? https://travis-ci.org/kmadathil/sanskrit_parser/jobs/316192991

kmadathil commented 6 years ago

The 3.6 issues have been fixed. @vvasuki - can you please update the server to run the latest code? Please check if you're able to access through the ui from https://kmadathil.github.io/sanskrit_parser/ui/index.html

vvasuki commented 6 years ago

The rest api test (see my recent cl) - still fails on my computer..

vvasuki commented 6 years ago

On the server, I still get this strange error - https://pastebin.com/raw/FzKrmHqc

vvasuki@vedavaapi:/home/samskritam/sanskrit_parser$ git log -1
commit 69410143a1e244acb3dc87c430ce33122400bc76 (HEAD -> master, origin/master, origin/HEAD)
Author: Karthik Madathil <karthik.m@atonarp.com>
Date:   Tue Dec 19 14:07:35 2017 +0530

    Added strict_io=False in rest api output
avinashvarna commented 6 years ago

Gotta love the differences between py3 and py2. I fixed the issue on the server which runs py3 and that ended up breaking the travis build on py2.7. Looks like there are some differences in how codecs.open and open work on py2 vs py3 (they seem to be the inverse of each other for a given py version). As a step towards #61, I changed DhatuWrapper to directly download the .json from our git repo, instead of downloading the tsv and converting it to json locally. That seems to be working on both py2 and py3. We can add a script in the data folder to convert the .tsv to .json if we need to update it in the future.

On the bright side, the API server works! You can test it at https://kmadathil.github.io/sanskrit_parser/ui/index.html
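
For the `open` versus `codecs.open` differences mentioned above, one portable option is to always pass an explicit encoding. The following is a minimal sketch, assuming the data file is UTF-8. `io.open` accepts an `encoding` argument on both Python 2.7 and Python 3 (where it is the same function as the built-in `open`); the file name and sample record are made up.

```python
import io
import json
import os
import tempfile

# Sample record with a non-ASCII key, loosely modelled on the dhatu data.
sample = {u"\u0905\u0938\u094d": u"as"}

path = os.path.join(tempfile.mkdtemp(), "dhatu_sample.json")

# Write and read with an explicit encoding instead of relying on the
# process locale (which can be plain ASCII under a web server).
with io.open(path, "w", encoding="utf-8") as f:
    f.write(json.dumps(sample, ensure_ascii=False))

with io.open(path, "r", encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded == sample)
```

Relying on the default encoding is a plausible reason why the same code can raise `UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0` under mod_wsgi while working from an interactive shell, though that was not confirmed in this thread.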