ioos / pyoos

A Python library for collecting Met/Ocean observations
GNU Lesser General Public License v3.0
34 stars 33 forks source link

HADS enhancements #39

Closed daf closed 9 years ago

daf commented 9 years ago

This is @emiliom's changes for HADS in ioos/pyoos#37, rebased on latest master to pick up TravisCI fixes and omit non relevant commits.

daf commented 9 years ago

@emiliom - Ignoring the NERRS failures which are non-related, there's a HADS test failure:

_______________________ HadsParserTest.test__parse_data ________________________
self = <test_hads.HadsParserTest testMethod=test__parse_data>
    def test__parse_data(self):
        res = {u'CE4D0268': [(u'HM', datetime.datetime(2013, 7, 26, 16, 30), u'4.94'),
                             (u'HM', datetime.datetime(2013, 7, 26, 17, 0), u'4.41'),
                             (u'PA', datetime.datetime(2013, 7, 26, 16, 30), u'29.93'),
                             (u'PA', datetime.datetime(2013, 7, 26, 17, 0), u'29.93'),
                             (u'TA', datetime.datetime(2013, 7, 26, 16, 30), u'66.20'),
                             (u'TA', datetime.datetime(2013, 7, 26, 17, 0), u'67.30'),
                             (u'US', datetime.datetime(2013, 7, 26, 16, 30), u'5.00'),
                             (u'US', datetime.datetime(2013, 7, 26, 17, 0), u'8.00'),
                             (u'UD', datetime.datetime(2013, 7, 26, 16, 30), u'358.00'),
                             (u'UD', datetime.datetime(2013, 7, 26, 17, 0), u'353.00')],
               u'DD182264': [(u'HG', datetime.datetime(2013, 7, 26, 16, 30), u'3.07'),
                             (u'HG', datetime.datetime(2013, 7, 26, 16, 45), u'3.07')]}

        parsed = self.hp._parse_data(self.raw_data, None, (None, None))
>       assert parsed == res
E       TypeError: can't compare offset-naive and offset-aware datetimes
tests/parsers/test_hads.py:132: TypeError

I think if you drop a .replace(tz=pytz.utc) or similar (that was just from memory) on the end of those datetime constructions, you'll get a working test here.

emiliom commented 9 years ago

I haven't looked closely at the test, but I can say I deliberately added the .replace(tz=pytz.utc) to parsers/hads.py/_parse_data(). Otherwise, requests that used a datetime filter or set the collector's start_time property led to failure with that same TypeError: can't compare offset-naive and offset-aware datetimes, on the subsequent line in _parse_data():

if (begin_time is None or dt >= begin_time) and (end_time is None or dt <= end_time):

I concluded it was b/c the hads collector start_time and end_time "setters" always add the utc timezone to the dt values I pass, as a behavior inherited from the base pyoos Collector object (see its set_start_time() and set_end_time() methods). The change I introduced solved this problem. I guess a test that doesn't use temporal filters won't run into this problem.

emiliom commented 9 years ago

Any progress or comments on my input from two weeks ago? Thanks.

daf commented 9 years ago

Well, the weird thing is, I don't get these errors locally - when I introduce the .replace as I noted above, I start getting them. Must be a version or environmental difference between my machine and travis?

@lukecampbell / @benjwadams can either of you pull this branch down and run py.test tests/parsers/test_hads.py and see if you get this same error?

emiliom commented 9 years ago

Thanks, @daf. Would it help if I listed version numbers on my environment? BTW, I'm running my tweaked pyoos (installed via pip install from my github fork) on an Anaconda env and could share the steps (a mix of conda and pip install); though you'd be getting more than what's absolutely necessary, b/c my Anaconda env is fairly broad.

lukecampbell commented 9 years ago

Yeah, fails for me but only one test not like what travis is showing:

_______________________ HadsParserTest.test__parse_data ________________________

self = <test_hads.HadsParserTest testMethod=test__parse_data>

    def test__parse_data(self):
        res = {u'CE4D0268': [(u'HM', datetime.datetime(2013, 7, 26, 16, 30), u'4.94'),
                             (u'HM', datetime.datetime(2013, 7, 26, 17, 0), u'4.41'),
                             (u'PA', datetime.datetime(2013, 7, 26, 16, 30), u'29.93'),
                             (u'PA', datetime.datetime(2013, 7, 26, 17, 0), u'29.93'),
                             (u'TA', datetime.datetime(2013, 7, 26, 16, 30), u'66.20'),
                             (u'TA', datetime.datetime(2013, 7, 26, 17, 0), u'67.30'),
                             (u'US', datetime.datetime(2013, 7, 26, 16, 30), u'5.00'),
                             (u'US', datetime.datetime(2013, 7, 26, 17, 0), u'8.00'),
                             (u'UD', datetime.datetime(2013, 7, 26, 16, 30), u'358.00'),
                             (u'UD', datetime.datetime(2013, 7, 26, 17, 0), u'353.00')],
               u'DD182264': [(u'HG', datetime.datetime(2013, 7, 26, 16, 30), u'3.07'),
                             (u'HG', datetime.datetime(2013, 7, 26, 16, 45), u'3.07')]}

        parsed = self.hp._parse_data(self.raw_data, None, (None, None))
>       assert parsed == res
E       TypeError: can't compare offset-naive and offset-aware datetimes
daf commented 9 years ago

@lukecampbell the other travis errors can be ignored for now, nerrs changes its output all the time so trying to hit a moving target there. The error you got is the one I'm trying to track down. I don't get it locally.

If you get a chance please do some pprints or something on both parsed and res in that test - want to know which side has the datetimes with the timezone info - assume it's parsed for you. I don't understand why I don't get it. Environment/locale affects datetime lib somehow?

lukecampbell commented 9 years ago

https://gist.github.com/lukecampbell/cc23a31df0a7170ad7d9

Looks like a timezone issue, recommend stripping off timezones for the comparisons, or forcing all timezones to UTC.

daf commented 9 years ago

The goal is always UTC. I'm going to dig into my machine and figure out why my parsed has datetimes with no timezone.

daf commented 9 years ago

Ok, it's a conda issue - i started using conda locally instead of virtualenv, and I don't understand the ins and outs - it was using a conda env installed pyoos instead of the local one, and now that i've removed it, it refuses to find pyoos. Have some more learning to do with how conda works.

For now I'll just commit these changes and see how Travis does on them.

lukecampbell commented 9 years ago

Could you do me the kindness of documenting the differences between virtualenv and conda?

We use them both nowadays and troubleshooting these differences is always a huge time-suck and costs lots of money.

emiliom commented 9 years ago

Not sure if my input is useful here. I'm getting confused about what is failing where (Travis vs @daf's computer, conda vs other environments, etc). Let me know if I can clarify anything. Like I mentioned two weeks ago, the pyoos hads test you're dealing with doesn't exercise the pyoos datetime filter or set the collector's start_time; trying to get that kind of datetime filtering (which should be a very common usage scenario) is what led me to the problem that I've addressed. My fix works fine for me, using pyoos datetime filters, on two different computers, both on Ubuntu (12.04 and 14.04) and both using a custom conda environment I've built, where pyoos was installed from my fork via pip install git+https://github.com/emiliom/pyoos.git. I'm using my hads collector modification in an operational process that pulls in the latest data interval from one site once an hour. Thanks!

daf commented 9 years ago

@emiliom The issue with environments turned out to be that my tests were running against an installed version of pyoos instead of my local dev copy - not familiar enough with conda to understand why it did that.

You changed two pieces of behavior for HADS - some values are now floats, and the datetime bit - but you didn't update the tests to match so they fail. I have to see if the float tests can be done with numpy's helper methods, I'm not sure they can traverse dicts like we are doing. The timezone change is trivial and I've done it already.

@lukecampbell as soon as I figure it out, I'll let you know too. conda so far has been pretty useful for science deps as it's crazy fast, but I don't understand all the wrinkles.