ODM2 / WOFpy

A server-side implementation of CUAHSI's Water One Flow service stack in Python.
http://odm2.github.io/WOFpy/

Completing Luquillo CZO WOFpy endpoint integration into CUAHSI WS #219

Closed emiliom closed 6 years ago

emiliom commented 6 years ago

@martinseul, I'm following up on our emails last week about our common goal to identify and fix the remaining issue(s) preventing complete functioning of @miguelcleon's Luquillo CZO WOFpy endpoint in the CUAHSI Water Services. The homepage to that WOFpy endpoint is at http://odm2admin.cuahsi.org/wofpy/odm2lczo/, unless @miguelcleon has changed it. It looks like the test network page at qa-hiscentral is http://qa-hiscentral.cuahsi.org/pub_network.aspx?n=100002, and the corresponding test service is at http://qa-hiscentral.cuahsi.org/testpage.aspx?n=100002

For reference, we've been discussing this for some time at WOFpy issue #196. The last comments were from @lsetiawan (Don) and me on Nov 15:

Based on conversation with Martin, The problem seems to lie on tags not being closed properly for empty attributes. For example <qualityControlLevelCode/> should be <qualityControlLevelCode></qualityControlLevelCode> for the xml to be validated.

and

I spoke with Martin a bit about this. What we're doing is not wrong (ie, <qualityControlLevelCode/> is not incorrect, as demonstrated by our schema validation), but it sounds like there's a client parsing expectation on their end.

As discussed via email last week, we'll use this issue to track progress. Thanks!

miguelcleon commented 6 years ago

@emiliom yes the WOFpy endpoint homepage is still at http://odm2admin.cuahsi.org/wofpy/odm2lczo/

Is there an expectation that CUAHSI will change their parser or will @lsetiawan change the closing tags?

emiliom commented 6 years ago

Is there an expectation that CUAHSI will change their parser or will @lsetiawan change the closing tags?

Possibly a combination of both? But we're definitely prepared to do the latter. I'm waiting to hear from @martinseul exactly what the issues are, though.

miguelcleon commented 6 years ago

From Martin:

That is a fix that they will need to implement on their side, a fix on my side is not easy as the app relies on a .net xml parser that I don’t control.

I can run test once their XML is updated.

Sorry if that was not clear.

Miguel:

Ok, so just to confirm all tags need to be ended like <qualityControlLevelCode></qualityControlLevelCode>

From Martin:

Yes that’s the way it should be.

lsetiawan commented 6 years ago

Thanks @miguelcleon. I am currently looking into changing the tags to close. It seems to be complex, but I'm working on it. I'll let you know as soon as I come up with a solution. Thanks.

miguelcleon commented 6 years ago

@lsetiawan sounds good. Thank you!

miguelcleon commented 6 years ago

@lsetiawan any luck with this?

lsetiawan commented 6 years ago

@miguelcleon Not yet. I won't be able to work on this until tomorrow. Thanks.

miguelcleon commented 6 years ago

ok

lsetiawan commented 6 years ago

Alright, so after a lot of research, it seems the problem lies in the lxml library turning empty-value tags into self-closing tags. For example, if MethodLink is a blank string '', WOFpy actually creates the XML tag <methodLink></methodLink>. However, this string gets turned into an lxml Element object by Spyne. In that object, empty-value tags are turned into the self-closing form <methodLink/>. I tried this manually:

>>> from lxml import etree
>>> xmlstring = '<methodLink></methodLink>'
>>> root = etree.fromstring(xmlstring)
>>> etree.tostring(root)
'<methodLink/>'

Right now, one solution I found for getting the full open-and-close pair is to replace the empty string '' with a single space ' '. This in turn produces <methodLink> </methodLink>. I'm not sure whether this is an okay solution, so I need your input. Thanks!
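
The two spellings are alternative serializations of the same empty element, so the choice is made by whichever serializer writes the output. A minimal sketch of that idea follows. It uses Python's standard-library xml.etree.ElementTree rather than lxml or Spyne (an assumption made so the example is self-contained), and the sample string is a hypothetical fragment. ElementTree exposes the choice as the short_empty_elements flag; lxml's serializer has different options.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample fragment with an empty methodLink element.
SAMPLE = '<method methodID="2"><methodCode>2</methodCode><methodLink></methodLink></method>'

root = ET.fromstring(SAMPLE)

# Default serialization collapses the empty element into the short form.
short_form = ET.tostring(root, encoding="unicode")

# short_empty_elements=False keeps an explicit open/close pair instead.
long_form = ET.tostring(root, encoding="unicode", short_empty_elements=False)

print(short_form)
print(long_form)
```

Whether the serializer that Spyne drives can be configured in a comparable way is not established in this thread.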

emiliom commented 6 years ago

Thank you, @lsetiawan. I won't be able to help or comment on this until after Christmas. @martinseul and @horsburgh, can you provide input on Don's question/suggestion? Also, Don, if you can reach out to Python lxml/etree experts to see if you can find an ideal solution to this, that'd be great.

miguelcleon commented 6 years ago

I'll email @martinseul , it's possible the space might cause a problem, but maybe not.

miguelcleon commented 6 years ago

@lsetiawan are there cases where a self-closing tag is created where a number might be expected? Or are these being generated only for character fields?

CUAHSI's parser might treat them differently.

lsetiawan commented 6 years ago

If a number is empty, the value being passed is probably None, so these tags aren't even included. So I guess these are being generated only for character fields. Thanks.

miguelcleon commented 6 years ago

Ok, I looked over some of the xml being generated by WOFpy and I didn't see any self closing tags for numbers. We can probably assume they aren't generated.

horsburgh commented 6 years ago

This is something I've always wondered about in the implementation of WOFPy, the CUAHSI WaterOneFlow web services, and WaterML. I suspect that David Valentine may have used a convention for this in his original development of WaterOneFlow, which formed the basis for the services that CUAHSI HIS uses. For the WaterOneFlow services I am hosting, I am using the last version of code that Valentine wrote before the end of the HIS project, and they all seem to work with both HIS Central and the WaterML R package, so the problem seems to be specific to WOFPy.

Since WOFPy is a different code base using different tools, the convention/assumption may not be the same. There are potentially three conventions, right?:

  1. <methodLink></methodLink> - opening and closing tags with no value
  2. <methodLink/> - self-closing tag
  3. Omitting the methodLink element when there is no value

I don't think the right solution is to put a space in between the tags in 1.

I think we have to get some info from @martinseul because all three of these should probably be valid given that methodLink is not a required attribute. However, if there is a convention that CUAHSI HIS Central is expecting, he can tell us that, and we can either adopt that convention or try to talk them into making a change on their end.
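
To see how one conforming parser treats these conventions, plus the space workaround, here is a small sketch using the standard-library ElementTree. This is an assumption for illustration only: the parser behind HIS Central is a .NET one and may behave differently. The wrapper element and the helper name method_link_text are made up.

```python
import xml.etree.ElementTree as ET

# The three conventions, plus the space workaround, as hypothetical fragments.
CONVENTIONS = {
    "open_close": "<method><methodLink></methodLink></method>",
    "self_closing": "<method><methodLink/></method>",
    "omitted": "<method></method>",
    "space": "<method><methodLink> </methodLink></method>",
}


def method_link_text(xml_text):
    """Return methodLink's text: '' if present but empty, None if absent."""
    return ET.fromstring(xml_text).findtext("methodLink")


results = {name: method_link_text(xml) for name, xml in CONVENTIONS.items()}
for name, value in results.items():
    print(name, repr(value))
```

In this parser, conventions 1 and 2 are indistinguishable after parsing, omission shows up as a missing element, and the space workaround changes the value itself to a one-character string.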

miguelcleon commented 6 years ago

@lsetiawan you can do this:

from lxml import etree
import lxml.html
xmlstring = '<methodLink></methodLink>'
root = etree.fromstring(xmlstring)
print(lxml.html.tostring(root))

<methodLink></methodLink>

Would that fix the problem?

miguelcleon commented 6 years ago

@horsburgh Martin said he doesn't control the parser and can't make updates. I agree that having tags with spaces in between is not a great solution.

lsetiawan commented 6 years ago

@miguelcleon Your solution would have to be implemented within Spyne itself which I do not have control over.

The example I gave was simply a demo, not something that is in WOFpy code.

miguelcleon commented 6 years ago

@lsetiawan right, dang, we might have to go with adding the space. I emailed Martin.

lsetiawan commented 6 years ago

I looked at the WaterML 1.0 GetSiteInfo Response, and it seems like there are some self-closing tags. This was able to be parsed by WaterML R. Though this is the REST response, I'll check and see what the actual SOAP response is.

<Source sourceID="64">
<Organization>LimnoTech</Organization>
<SourceDescription>Water Environment | Scientists Engineers</SourceDescription>
<Metadata/>
<SourceLink/>
</Source>

lsetiawan commented 6 years ago

So, looking at the SOAP response XML via the suds-jurko library, it seems the tags are actually there, with no self-closing.

<method methodID="2">
<methodCode>2</methodCode>
<methodLink></methodLink>
</method>

Python code:

from suds.client import Client
client = Client('http://data.envirodiy.org/wofpy/soap/cuahsi_1_1/.wsdl')
response = client.service.GetSiteInfo(site='envirodiy:160065_Limno_Crossroads')

I guess the xml becomes self-closing when it is REST.
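
To compare a saved REST response with a saved SOAP response, a rough text scan for self-closing tags can help. The sketch below is a regex heuristic, not a parser: it ignores comments and CDATA, and the helper name self_closing_tags is made up. The sample is the <Source> fragment quoted earlier, shortened.

```python
import re

# Matches <name/> or <name attr="..."/>; closing tags (</name>) never match
# because the character after "<" must start an XML name.
SELF_CLOSING = re.compile(r"<([A-Za-z_][\w.:-]*)(?:\s[^<>]*)?/>")


def self_closing_tags(xml_text):
    """Return the sorted, de-duplicated names of self-closing tags."""
    return sorted(set(SELF_CLOSING.findall(xml_text)))


SAMPLE = (
    '<Source sourceID="64">'
    "<Organization>LimnoTech</Organization>"
    "<Metadata/>"
    "<SourceLink/>"
    "</Source>"
)

print(self_closing_tags(SAMPLE))
```

Running the same function over the raw SOAP body and the raw REST body would show which transport actually emits the short form.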

miguelcleon commented 6 years ago

Maybe we want to test adding the space between the tags? you could just create a separate branch for that. After we test that, we can try to get CUAHSI to fix the parser.

miguelcleon commented 6 years ago

unless @lsetiawan has another idea for what to do.

miguelcleon commented 6 years ago

From Martin:

Having spaces is not an ideal solution but might work as a workaround for now. If you point me to a dev service I could test, or you could update the service we already have and I could run some tests. If we can isolate the problem to the tags, then we can focus on this issue going forward. Could you also include a sample request for data values from WOFpy? There might be an additional problem forming the right request.

My Response:

Hi Martin,

Yeah, I agree, it is not ideal to use the space. My service is slightly behind, though I manually applied all of the bug fixes as we were testing them. I double-checked against the release notes, and it looks like I’m just missing the much improved landing page.

But Anthony / Emilio do have an EnviroDIY WOFpy service which is completely up to date so it would be a good idea to try that http://data.envirodiy.org/wofpy/

A sample request for data:

http://data.envirodiy.org/wofpy/rest/1_1/GetValues?location=envirodiy:160065_Limno_Crossroads&variable=envirodiy:EnviroDIY_Mayfly_Temp&startDate=2017-01-01T00:00:00&endDate=2017-08-01T02:30:00

My service again is here: http://odm2admin.cuahsi.org/wofpy/odm2lczo/

Here is a sample request for data from my service:

http://odm2admin.cuahsi.org/wofpy/odm2lczo/rest/1_1/GetValues?location=odm2lczo:Rio%20Icacos%20Trib-IO&variable=odm2lczo:DO%20Concentration&startDate=2016-12-01T12:00:00&endDate=2016-12-01T14:30:00
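
For anyone scripting these requests, the same GetValues URL can be assembled with the standard library. This is a sketch: the helper name get_values_url is hypothetical, while the base URL, path, and parameter names are taken from the sample request above.

```python
from urllib.parse import quote, urlencode


def get_values_url(base, location, variable, start, end):
    """Build a REST 1.1 GetValues URL like the samples in this thread."""
    params = {
        "location": location,
        "variable": variable,
        "startDate": start,
        "endDate": end,
    }
    # quote_via=quote encodes spaces as %20; safe=":" keeps the colons readable.
    query = urlencode(params, quote_via=quote, safe=":")
    return base.rstrip("/") + "/rest/1_1/GetValues?" + query


url = get_values_url(
    "http://odm2admin.cuahsi.org/wofpy/odm2lczo/",
    "odm2lczo:Rio Icacos Trib-IO",
    "odm2lczo:DO Concentration",
    "2016-12-01T12:00:00",
    "2016-12-01T14:30:00",
)
print(url)
```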

miguelcleon commented 6 years ago

@lsetiawan

From Martin:

Also most of our HIS infrastructure is still based on soap. do you have a request body as well so I can test with fiddler?

miguelcleon commented 6 years ago

@martinseul was able to register the EnviroDIY service on http://qa-hiswebclient.azurewebsites.net/

We need to check if the metadata, database fields in download and catalog and service statistics are correct, do additional testing and evaluate performance.

@lsetiawan can you take a look? you should select the envirodiy data service. Then you can select the time series like:

[screenshot not preserved]

miguelcleon commented 6 years ago

@lsetiawan the self closing tags are still a problem though. The EnviroDIY data doesn't include any fields that are empty, of those that end up in the XML at least. Can you confirm that? I've filled in the fields in my database that were empty so Martin can run the harvesting again and we can see if mine will work.

lsetiawan commented 6 years ago

@miguelcleon Awesome! Yea. I just tested it. And it works! I was able to download the actual data just fine.

@lsetiawan the self closing tags are still a problem though. The EnviroDIY data doesn't include any fields that are empty, of those that end up in the XML at least. Can you confirm that?

I looked at the csv and xml for Beth_office temperature. I see that sourceLink is empty in xml, and it's filled with unknown within the csv. It seems to be working for me.

miguelcleon commented 6 years ago

Huh, maybe it doesn't try to ingest that field or something.

miguelcleon commented 6 years ago

@lsetiawan does the number of records match the total in your database? It should probably match the number of records in the timeseriesresultvalues table, I would think.

lsetiawan commented 6 years ago

Unfortunately I can't check that, since I don't have access to the database. Comparing the XML response and the CSV, though, I get the same number of records.
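
A comparison like this can be scripted. The sketch below counts <value> elements in an XML response regardless of namespace and counts data rows in a CSV export. Both samples are hypothetical, the namespace is a placeholder, and the CSV is assumed to have exactly one header row.

```python
import csv
import io
import xml.etree.ElementTree as ET


def count_xml_values(xml_text):
    """Count elements whose local name is 'value', ignoring any namespace."""
    root = ET.fromstring(xml_text)
    return sum(1 for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "value")


def count_csv_rows(csv_text):
    """Count data rows, assuming the first row is a header."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    return max(len(rows) - 1, 0)


# Hypothetical samples; the namespace is a placeholder, not the real one.
SAMPLE_XML = (
    '<timeSeriesResponse xmlns="urn:example:waterml"><timeSeries><values>'
    '<value dateTime="2016-12-01T12:00:00">7.9</value>'
    '<value dateTime="2016-12-01T12:15:00">8.0</value>'
    "</values></timeSeries></timeSeriesResponse>"
)
SAMPLE_CSV = "DateTime,Value\n2016-12-01T12:00:00,7.9\n2016-12-01T12:15:00,8.0\n"

print(count_xml_values(SAMPLE_XML), count_csv_rows(SAMPLE_CSV))
```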

miguelcleon commented 6 years ago

Ok, that should mean that HIS central ingested all of the data then.

lsetiawan commented 6 years ago

So did they provide a fix with their parser? How did it suddenly work?

miguelcleon commented 6 years ago

The EnviroDIY service is working and the LCZO one is still not working. I'm not sure what the difference is. My guess is that some required fields were not filled in on the LCZO service, and that this is the problem rather than the self-closing tags. I updated some records so they would be complete. I asked Martin to re-run the harvester to see if that could be the issue.

I had a missing variableDescription which is a required field in ODM2 and perhaps that is why the harvester choked.

miguelcleon commented 6 years ago

@lsetiawan I'm trying to upgrade my installation to the latest release, I'm getting an error:

NoOptionError: No option 'urlpath' in section: 'WOF'

It appears I need to add a urlpath parameter to my odm2_config_timeseries.cfg file, but I'm not sure what the urlpath should be. I tried a couple of things, but I'm shooting in the dark. What sort of urlpath is needed? Where is a template for odm2_config_timeseries.cfg? Thanks in advance.
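
The message has the shape of the error Python's standard config parser raises when a section lacks an option. A minimal reproduction follows, assuming the file is read with that parser (the Python 3 module name configparser is used here; the installation in this thread runs Python 2.7, where the module is ConfigParser). The helper name read_urlpath is hypothetical.

```python
import configparser

# A cut-down [WOF] section without a urlpath entry.
CFG_TEXT = (
    "[WOF]\n"
    "Network: ODM2LCZO\n"
    "Vocabulary: ODM2LCZO\n"
)


def read_urlpath(cfg_text, default=None):
    """Return the urlpath option from [WOF], or default when it is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    try:
        return parser.get("WOF", "urlpath")
    except configparser.NoOptionError:
        return default


print(read_urlpath(CFG_TEXT))
print(read_urlpath(CFG_TEXT + "urlpath: wofpy/odm2lczo\n"))
```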

lsetiawan commented 6 years ago

URLPATH can be anything so it's different than your network code. See https://github.com/ODM2/WOFpy/blob/master/wof/examples/flask/odm2/timeseries/odm2_config_timeseries.cfg#L4. Hope that helps! Thanks.

miguelcleon commented 6 years ago

hmm so where I had http://odm2admin.cuahsi.org/wofpy/odm2lczo/rest_1_1/

I'd think the URLPATH would be either

urlpath: wofpy

or

urlpath: wofpy/odm2lczo

but neither worked, I get 404 errors with both settings.

lsetiawan commented 6 years ago

When you spin it up what are all your url options? Also what port are you running this in? Are you setting up in Apache? If so, did you register that path to Apache?

miguelcleon commented 6 years ago

I am also getting another error

Exception AttributeError: "'NoneType' object has no attribute 'py2k'" in <bound method Odm2Dao.del of

miguelcleon commented 6 years ago

I upgraded the installation I had working from before, so yes, it was set up correctly in Apache. I didn't specify a port so it should just be the http port 80, I'm pretty sure

miguelcleon commented 6 years ago

I didn't change any of the other URL options.

miguelcleon commented 6 years ago

My odm2_config_timeseries.cfg file looks like:


[WOF]
Network: ODM2LCZO
Vocabulary: ODM2LCZO
Menu_Group_Name: ODM2
Service_WSDL: http://127.0.0.1:8080/soap/wateroneflow.wsdl
Timezone: 00:00
TimezoneAbbreviation: GMT
urlpath: wofpy/odm2lczo

[Default_Params]
Site: Rio Icacos Trib-IO
Variable: DO Concentration
StartDate: 2016-12-01T12:00:00
EndDate: 2016-12-01T14:30:00

[WOF_1_1]
Service_WSDL: http://127.0.0.1:8080/soap/wateroneflow_1_1.wsdl

[Default_Params_1_1]
West: -114
South: 40
East: -110
North: 42

[WOFPY]
Templates: ../../../../../wof/flask/templates

[Database]
# The name of a file containing the Connection String eg: private.connection which has: mysql://username:password@localhost/database
Connection_String: xxxxxx

lsetiawan commented 6 years ago

http port 80, I'm pretty sure

By default, wofpy maps to port 8080. Hmm...

From your config file, try uppercasing "urlpath" to "URLPATH". I don't remember if I made that case insensitive or not. Maybe change the actual value to something simpler for now to odm2lczowofpy or something.

lsetiawan commented 6 years ago

I've just looked at the code real quick and indeed the settings keywords are case sensitive.

Please refer to https://github.com/ODM2/WOFpy/blob/master/wof/examples/flask/odm2/timeseries/odm2_config_timeseries.cfg
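
For comparison, here is how the standard-library parser treats case. This is a sketch of stock configparser behaviour only. How WOFpy itself reads these keys is not shown in this thread and may differ. By default the parser folds option names to lower case, and it matches them exactly only if optionxform is overridden. Section names are always case sensitive. The helper name load is hypothetical.

```python
import configparser

CFG_TEXT = "[WOF]\nURLPATH: odm2lczowofpy\n"


def load(cfg_text, case_sensitive=False):
    """Parse cfg_text, optionally keeping option names exactly as written."""
    parser = configparser.ConfigParser()
    if case_sensitive:
        # Must be set before reading; the default transform is str.lower.
        parser.optionxform = str
    parser.read_string(cfg_text)
    return parser


default_parser = load(CFG_TEXT)
strict_parser = load(CFG_TEXT, case_sensitive=True)

print(default_parser.get("WOF", "urlpath"))
print(strict_parser.has_option("WOF", "urlpath"))
print(strict_parser.has_option("WOF", "URLPATH"))
```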

miguelcleon commented 6 years ago

Ok, I tried it with uppercase, but I'm not getting anything different.

lsetiawan commented 6 years ago

Is it reading the correct .cfg file? If you run it in development mode what happens?

lsetiawan commented 6 years ago

Anyways, I can't spend much more time on this today. I'll be available tomorrow to troubleshoot further. Sorry. Good luck. Let me know how it goes.

miguelcleon commented 6 years ago

No luck yet. I'm off tomorrow and returning on the third, so this will have to wait.

miguelcleon commented 6 years ago

I tried reinstalling WOFpy and now I'm getting an error in Spyne. @lsetiawan any idea what to do?


[Wed Feb 07 17:43:36.366392 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]   File "/var/www/wofpy/singlerunserver.py", line 13, in <module>
[Wed Feb 07 17:43:36.366410 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]     import wof.flask
[Wed Feb 07 17:43:36.366421 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]   File "/home/azureadmin/miniconda2/envs/wofpy5/lib/python2.7/site-packages/wof/__init__.py", line 4, in <module>
[Wed Feb 07 17:43:36.366441 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]     from wof.core import _SERVICE_PARAMS, _TEMPLATES, site_map
[Wed Feb 07 17:43:36.366452 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]   File "/home/azureadmin/miniconda2/envs/wofpy5/lib/python2.7/site-packages/wof/core.py", line 18, in <module>
[Wed Feb 07 17:43:36.366480 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]     from spyne.application import Application
[Wed Feb 07 17:43:36.366491 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]   File "/home/azureadmin/miniconda2/envs/wofpy5/lib/python2.7/site-packages/spyne/__init__.py", line 40, in <module>
[Wed Feb 07 17:43:36.366509 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]     from spyne.decorator import rpc
[Wed Feb 07 17:43:36.366518 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]   File "/home/azureadmin/miniconda2/envs/wofpy5/lib/python2.7/site-packages/spyne/decorator.py", line 43, in <module>
[Wed Feb 07 17:43:36.366535 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]     from spyne.model import ModelBase, ComplexModel
[Wed Feb 07 17:43:36.366544 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]   File "/home/azureadmin/miniconda2/envs/wofpy5/lib/python2.7/site-packages/spyne/model/__init__.py", line 62, in <module>
[Wed Feb 07 17:43:36.366561 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]     from spyne.model.binary import File
[Wed Feb 07 17:43:36.366571 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]   File "/home/azureadmin/miniconda2/envs/wofpy5/lib/python2.7/site-packages/spyne/model/binary.py", line 29, in <module>
[Wed Feb 07 17:43:36.366679 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266]     from mmap import mmap, ACCESS_READ
[Wed Feb 07 17:43:36.366702 2018] [wsgi:error] [pid 76625:tid 139899760125696] [remote 100.14.198.215:33266] ImportError: /home/azureadmin/miniconda2/envs/wofpy5/lib/python2.7/lib-dynload/mmap.so: undefined symbol: _PySlice_Unpack

lsetiawan commented 6 years ago

@miguelcleon hmm... that's new. I'm not sure.
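
One possible explanation, offered as an assumption and not confirmed in this thread, is that the interpreter loaded by mod_wsgi and the extension modules in the conda environment's lib-dynload come from different Python builds. A small diagnostic sketch that could be run from inside the WSGI script to see which interpreter is actually in use follows. The helper name interpreter_report is hypothetical.

```python
import sys
import sysconfig


def interpreter_report():
    """Collect basic facts about the interpreter running this code."""
    return {
        "version": sys.version.split()[0],
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Directories the interpreter expects for pure-Python and
        # platform-specific standard-library modules.
        "stdlib": sysconfig.get_path("stdlib"),
        "platstdlib": sysconfig.get_path("platstdlib"),
    }


report = interpreter_report()
for key in sorted(report):
    print(key, "=", report[key])
```

If the reported version or prefix does not match the wofpy5 environment shown in the traceback paths, that mismatch would be worth ruling out first.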